Skip to content

feat: add auto-resolution and name remapping for S3 datastore extension#910

Merged
stack72 merged 1 commit intomainfrom
s3-datastore-extension
Mar 30, 2026
Merged

feat: add auto-resolution and name remapping for S3 datastore extension#910
stack72 merged 1 commit intomainfrom
s3-datastore-extension

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented Mar 29, 2026

Summary

This is the first step toward moving the S3 datastore to the @swamp/s3-datastore extension package (published as v2026.03.28.1 in the swamp-extensions repo). A follow-up PR will remove the built-in S3 code once this is validated.

Phase 1: Auto-resolution infrastructure

  • resolveDatastoreType() helper in extension_auto_resolver.ts
  • hotLoadDatastores() in ExtensionInstallerPort and auto-resolver adapter
  • Datastores are now hot-loaded alongside models and vaults after extension install

Phase 2: Name remapping

  • RENAMED_DATASTORE_TYPES map: "s3" → "@swamp/s3-datastore"
  • resolveDatastoreConfig and parseDatastoreEnvVar are now async
  • Old type: s3 configs auto-resolve the extension, with deprecation warning
  • Falls back to built-in S3 if extension unavailable

User impact

  • No breaking changes — existing type: s3 configs work as before
  • Deprecation warning logged when old s3 type name is used
  • First use auto-installs extension (~5s one-time); subsequent runs use cache
  • Requires @swamp/s3-datastore extension published to registry (done: v2026.03.28.1)

Test plan

  • Compiled binary, tested against real S3-backed repository
  • First run: extension auto-resolved, downloaded, bundled, cached
  • Second run: loaded from cache, no registry call, completed cleanly
  • All unit tests pass (38 tests across modified files)
  • CI passes

🤖 Generated with Claude Code

@stack72 stack72 marked this pull request as ready for review March 30, 2026 15:08
@stack72 stack72 force-pushed the s3-datastore-extension branch from e0a94d9 to 7c25795 Compare March 30, 2026 15:12
github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

Add infrastructure to automatically resolve the S3 datastore from the
@swamp/s3-datastore extension package, replacing the built-in S3 code
path over time.

## What changed

### Phase 1: Auto-resolution infrastructure
- Added `resolveDatastoreType()` helper to `extension_auto_resolver.ts`
  (mirrors the existing `resolveVaultType()` pattern)
- Added `hotLoadDatastores()` to `ExtensionInstallerPort` interface and
  the auto-resolver adapter, so newly installed datastore extensions are
  hot-loaded alongside models and vaults
- Updated `installAndLoad()` to call `hotLoadDatastores()` after pulling
  an extension

### Phase 2: Name remapping with backward compatibility
- Added `RENAMED_DATASTORE_TYPES` map: `"s3" → "@swamp/s3-datastore"`
- Made `resolveDatastoreConfig` and `parseDatastoreEnvVar` async to
  support auto-resolution
- When `type: s3` is encountered in `.swamp.yaml` or `SWAMP_DATASTORE`
  env var, the code remaps to `@swamp/s3-datastore`, attempts
  auto-resolution, and falls back to built-in S3 handling if the
  extension is unavailable
- Users see a deprecation warning suggesting they update their config

## User impact

- **No breaking changes** — existing `type: s3` configs continue to work
  via the built-in fallback
- A deprecation warning is logged when the old `s3` type name is used
- On first use, logged-in users will see the extension auto-install
  (~5s one-time cost); subsequent runs use the cached extension
- The `@swamp/s3-datastore` extension must be published to the registry
  before this lands (it has been published as v2026.03.28.1)

## Testing performed

- Compiled binary and tested against a real S3-backed repository
- First run: extension auto-resolved from registry, downloaded, bundled,
  and cached successfully
- Second run: extension loaded from cache with no registry call, command
  completed cleanly
- All unit tests pass (38 tests across modified files)

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
@stack72 stack72 force-pushed the s3-datastore-extension branch from 7c25795 to b44be45 Compare March 30, 2026 15:23
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean implementation that mirrors the existing vault auto-resolution pattern for datastores. The approach is sound: domain-level resolveDatastoreType() helper, port extension on ExtensionInstallerPort, adapter wiring, and a backwards-compatible rename+fallback strategy in the CLI layer.

Blocking Issues

None.

Suggestions

  1. Test naming convention in extension_auto_resolver_test.ts: The three new resolveDatastoreType tests use - as a separator (e.g., "resolveDatastoreType - returns true..."), while CLAUDE.md convention is : (e.g., "resolveDatastoreType: returns true..."). The resolve_datastore_test.ts file was correctly updated to use : — worth aligning the auto-resolver tests too.

  2. Duplicated S3 remapping logic: The S3-specific config construction + schema validation + provider creation + fallback flow is largely duplicated between parseDatastoreEnvVar (~lines 102-145) and resolveDatastoreConfig (~lines 244-291). Both paths build a config object from S3 fields, validate it, create a provider, resolve paths, and fall back to built-in S3. Consider extracting a shared helper like tryResolveRenamedS3(bucket, prefix, opts) to reduce the duplication surface — this would also make it easier to add future renamed types.

  3. Minor: dsType = ds.type on line 290 (the fallback after extension is unavailable) resets to the original type, which is correct but reads oddly since dsType was already set to ds.type at declaration. A comment like // Extension unavailable, continue with original type would clarify intent.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found. The async conversion is complete across all callers, the fallback logic is sound, and the auto-resolution plumbing mirrors the established vault pattern correctly.

Medium

  1. src/cli/resolve_datastore.ts:106,144 — Empty-string prefix inconsistency between extension and built-in paths. When input is s3:my-bucket/ (trailing slash, no prefix text), value.slice(slashIdx + 1) produces "". On the extension path (line 114), if (prefix) is falsy so prefix is omitted from the config object. On the built-in fallback path (line 144), prefix is passed as "" directly into the return value. This means the same input produces { config: { bucket: "my-bucket" } } via extension vs { prefix: "" } via built-in fallback — a semantic divergence. Breaking example: Extension provider treats missing prefix as "use root", built-in consumer treats "" as a set-but-empty prefix with subtly different behavior (e.g., concatenating "/" + "" producing a trailing slash in S3 keys). Suggested fix: Normalize empty prefix to undefined before the branch: const prefix = slashIdx === -1 || slashIdx === value.length - 1 ? undefined : value.slice(slashIdx + 1);

  2. src/cli/resolve_datastore.ts:147 — Future renamed non-s3 types will attempt to parse value as JSON with the raw shorthand syntax. If a second entry is added to RENAMED_DATASTORE_TYPES (e.g., "gcs": "@swamp/gcs-datastore"), and users pass gcs:my-bucket, the code skips the if (type === "s3") block, sets type = renamedTo, and falls through to the JSON parsing path (line 175) where JSON.parse("my-bucket") throws a confusing Invalid JSON config error rather than a helpful migration message. This is currently inert since only "s3" is mapped, but the structure invites the bug. Suggested fix: Add a comment noting that any future entry in RENAMED_DATASTORE_TYPES must also add a shorthand-parsing branch analogous to the if (type === "s3") block, or refactor the shorthand parsing to be data-driven.

Low

  1. src/domain/extensions/extension_auto_resolver.ts:318-320hotLoadDatastores() errors are unhandled. If hotLoadDatastores() throws (e.g., corrupt extension bundle, permission error on the datastores dir), the error propagates up through installAndLoaddoResolve where it's caught by the generic catch block (line 197). For network errors, it shows a "network error" message; for others, it shows "not found" — neither accurately describes a datastore loading failure. This matches the existing behavior for hotLoadVaults() (same pattern), so it's not a regression, but worth noting.

  2. src/cli/resolve_datastore_test.ts:113,121,129assertRejects with a non-async callback. The callbacks () => parseDatastoreEnvVar(...) return a Promise but are not marked async. assertRejects handles this correctly (it awaits the return value), so this works, but it's inconsistent with the other test patterns that use async () => await .... Purely cosmetic.

Verdict

PASS — The PR is a clean, well-structured implementation that correctly mirrors the vault auto-resolution pattern for datastores. The sync-to-async migration is complete across all call sites, fallback logic is sound, and tests cover the key paths. The medium findings are edge cases that don't affect current functionality.

@stack72 stack72 merged commit b9db08e into main Mar 30, 2026
10 checks passed
@stack72 stack72 deleted the s3-datastore-extension branch March 30, 2026 15:31
stack72 added a commit that referenced this pull request Mar 30, 2026
…tension

Remove all built-in S3 datastore code from the core binary and delegate
S3 storage entirely to the `@swamp/s3-datastore` extension. This
follows the same pattern as PR #736 which moved vault providers to
extensions.

## What changed

### Built-in S3 code removed
- Deleted 6 infrastructure files: `s3_client.ts`, `s3_lock.ts`,
  `s3_cache_sync.ts`, `s3_datastore_verifier.ts`, and their tests
- Removed `S3DatastoreConfig`, `S3ConnectionConfig` from the type system
- Removed `@aws-sdk/client-s3` from `deno.json` (no longer bundled)
- Removed all `config.type === "s3"` branches from `repo_context.ts`,
  `resolve_datastore.ts`, `setup.ts`, `sync.ts`, `status.ts`
- Removed `swamp datastore setup s3` CLI command

### Generic extension setup command added
- New `swamp datastore setup extension <type> --config '{...}'` command
  works with any extension datastore, not just S3
- Validates config against extension schema, verifies health via
  provider, migrates data via sync service, updates `.swamp.yaml`
- Supports legacy type names: `swamp datastore setup extension s3 ...`
  auto-remaps to `@swamp/s3-datastore`

### Backward compatibility preserved
- `type: s3` in `.swamp.yaml` auto-remaps to `@swamp/s3-datastore`
  with deprecation warning (from Phase 1+2 in PR #910)
- `SWAMP_DATASTORE=s3:bucket/prefix` env var format still works
- Extension auto-resolves on first use for logged-in users

### Docs and skills updated
- `design/datastores.md`: S3 now documented as extension, not built-in
- `README.md`: Updated setup examples to use extension command
- `.claude/skills/swamp-repo/SKILL.md`: Updated command reference

## Design choices

### Why the provider pattern is better than built-in
The old code created `S3Client` at 6+ call sites in `repo_context.ts`,
manually destructuring config fields each time. PR #917 found that
`endpoint` and `forcePathStyle` were silently dropped at several sites,
breaking S3-compatible stores (DigitalOcean Spaces, MinIO). The
extension's provider pattern stores config once at construction and
passes it through everywhere — structurally immune to this class of bug.

### Sync behavioral differences
The built-in S3 path used `pullAll()`/`pushAll()` for sync operations.
The extension path uses `pullChanged()`/`pushChanged()` via the
`DatastoreSyncService` interface. The "changed" variants are better for
normal use (only transfer what's needed). For empty caches, they're
equivalent since everything is "changed". The built-in also had
`pullChangedForModel()` for per-model scoped pulls — an optimization
not available through the generic interface, acceptable trade-off for
now.

### `datastore status` no longer shows S3-specific fields
Bucket, prefix, region, endpoint are no longer displayed in
`swamp datastore status`. The generic extension interface doesn't
expose backend-specific details. The health check still works via the
provider's `createVerifier()`.

### Known limitation: offline + legacy S3 config
If a user has `type: s3` in `.swamp.yaml`, is offline, and the extension
isn't cached, `resolveDatastoreConfig` throws before any command runs.
The error message is clear ("Install it with: swamp extension pull
@swamp/s3-datastore"). This is an edge case — the extension is cached
after first use.

## Test plan

- [x] `deno check` — clean
- [x] `deno lint` — 910 files, clean
- [x] `deno fmt` — 998 files, clean
- [x] `deno run test` — 3672 passed, 0 failed
- [x] `deno run compile` — binary compiles without `@aws-sdk/client-s3`
- [x] Manual test: compiled binary against real S3-backed repo, extension
      auto-resolved and command completed cleanly
- [ ] CI passes

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant