feat: remove built-in S3 datastore in favor of @swamp/s3-datastore extension#930
feat: remove built-in S3 datastore in favor of @swamp/s3-datastore extension#930
Conversation
…tension Remove all built-in S3 datastore code from the core binary and delegate S3 storage entirely to the `@swamp/s3-datastore` extension. This follows the same pattern as PR #736 which moved vault providers to extensions. ## What changed ### Built-in S3 code removed - Deleted 6 infrastructure files: `s3_client.ts`, `s3_lock.ts`, `s3_cache_sync.ts`, `s3_datastore_verifier.ts`, and their tests - Removed `S3DatastoreConfig`, `S3ConnectionConfig` from the type system - Removed `@aws-sdk/client-s3` from `deno.json` (no longer bundled) - Removed all `config.type === "s3"` branches from `repo_context.ts`, `resolve_datastore.ts`, `setup.ts`, `sync.ts`, `status.ts` - Removed `swamp datastore setup s3` CLI command ### Generic extension setup command added - New `swamp datastore setup extension <type> --config '{...}'` command works with any extension datastore, not just S3 - Validates config against extension schema, verifies health via provider, migrates data via sync service, updates `.swamp.yaml` - Supports legacy type names: `swamp datastore setup extension s3 ...` auto-remaps to `@swamp/s3-datastore` ### Backward compatibility preserved - `type: s3` in `.swamp.yaml` auto-remaps to `@swamp/s3-datastore` with deprecation warning (from Phase 1+2 in PR #910) - `SWAMP_DATASTORE=s3:bucket/prefix` env var format still works - Extension auto-resolves on first use for logged-in users ### Docs and skills updated - `design/datastores.md`: S3 now documented as extension, not built-in - `README.md`: Updated setup examples to use extension command - `.claude/skills/swamp-repo/SKILL.md`: Updated command reference ## Design choices ### Why the provider pattern is better than built-in The old code created `S3Client` at 6+ call sites in `repo_context.ts`, manually destructuring config fields each time. PR #917 found that `endpoint` and `forcePathStyle` were silently dropped at several sites, breaking S3-compatible stores (DigitalOcean Spaces, MinIO). The extension's provider pattern stores config once at construction and passes it through everywhere — structurally immune to this class of bug. ### Sync behavioral differences The built-in S3 path used `pullAll()`/`pushAll()` for sync operations. The extension path uses `pullChanged()`/`pushChanged()` via the `DatastoreSyncService` interface. The "changed" variants are better for normal use (only transfer what's needed). For empty caches, they're equivalent since everything is "changed". The built-in also had `pullChangedForModel()` for per-model scoped pulls — an optimization not available through the generic interface, acceptable trade-off for now. ### `datastore status` no longer shows S3-specific fields Bucket, prefix, region, endpoint are no longer displayed in `swamp datastore status`. The generic extension interface doesn't expose backend-specific details. The health check still works via the provider's `createVerifier()`. ### Known limitation: offline + legacy S3 config If a user has `type: s3` in `.swamp.yaml`, is offline, and the extension isn't cached, `resolveDatastoreConfig` throws before any command runs. The error message is clear ("Install it with: swamp extension pull @swamp/s3-datastore"). This is an edge case — the extension is cached after first use. ## Test plan - [x] `deno check` — clean - [x] `deno lint` — 910 files, clean - [x] `deno fmt` — 998 files, clean - [x] `deno run test` — 3672 passed, 0 failed - [x] `deno run compile` — binary compiles without `@aws-sdk/client-s3` - [x] Manual test: compiled binary against real S3-backed repo, extension auto-resolved and command completed cleanly - [ ] CI passes Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
-
Removed
setup s3subcommand leaves no breadcrumb — Users withswamp datastore setup s3muscle memory will get a generic Cliffy "Unknown subcommand" error with no guidance toward the new command. A suggestion in the error or in thedatastore setup --helpepilog pointing tosetup extension @swamp/s3-datastorewould ease the migration. (Minor — existing.swamp.yamlconfigs still work transparently.) -
--configflag description doesn't hint at structure —"JSON configuration for the datastore extension"\ is correct but gives no guidance. Something like"JSON config object for the extension (e.g., '{"bucket":"name","region":"us-east-1"}')"would make tab-complete /--help` output more self-explanatory for first-time users. -
datastore statusno longer surfaces backend details — After setup, users runningswamp datastore statusseeType: @swamp/s3-datastoreand a health check but no bucket/prefix/region. This regression is noted in the PR as intentional. It's fine for now, but worth tracking — operators debugging a misconfigured remote often want to confirm which bucket they're actually hitting.
Verdict
PASS — No blocking UX issues. Command, error messages, and log/JSON output modes are all consistent. The s3→extension migration is backward-compatible for existing configs, and error messages for missing extensions and bad JSON are clear and actionable.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
- Comment references to S3CacheSyncService — Two files still reference
S3CacheSyncServicein comments (src/domain/datastore/datastore_sync_service.ts:23andsrc/infrastructure/persistence/datastore_sync_coordinator.ts:41). These are minor but could be updated to reference the genericDatastoreSyncServiceinterface instead, since the concrete class no longer exists.
Notes
- Import boundary: CLI commands correctly import libswamp types from
src/libswamp/mod.ts. Direct domain imports (datastoreTypeRegistry,resolveDatastoreType, etc.) from CLI code follow the pre-existing pattern used in other CLI files. - Type system: The
DatastoreConfigunion simplification (FilesystemDatastoreConfig | CustomDatastoreConfig) is clean.isCustomDatastoreConfigcorrectly checks onlyconfig.type !== "filesystem"now. - No dangling imports: Verified no remaining imports of deleted files (
s3_client.ts,s3_lock.ts,s3_cache_sync.ts,s3_datastore_verifier.ts). - Test coverage: 6 new tests for
datastoreSetupExtensioncover valid config, skip migration, unregistered type, invalid config schema, unhealthy backend, and non-upgraded repo. Existing tests updated appropriately (S3 tests now expectUserErrorwhen extension is missing). - DDD alignment: The extension provider pattern (
DatastoreProvider.createLock(),createVerifier(),createSyncService()) properly delegates infrastructure concerns to the extension, keeping the domain clean. ThedatastoreSetupExtensiongenerator acts as an application service orchestrating the setup workflow. - Security: JSON config parsing uses
JSON.parsein a try/catch with aUserError— no injection risk. Path traversal protection was in the deletedS3CacheSyncServiceand is now the extension's responsibility (appropriate boundary shift). - Backward compatibility: Legacy
type: s3configs auto-remap with clear error messages when the extension is missing. TheRENAMED_DATASTORE_TYPESmap handles the transition.
Clean removal of built-in S3 code in favor of the extension pattern. The changes are consistent and well-tested.
There was a problem hiding this comment.
Adversarial Review
Critical / High
No critical or high severity issues found.
Medium
-
src/cli/commands/datastore_setup.ts:113-117— Extension auto-resolution can fail silently before registry checkThe
resolveDatastoreTypecall at line 116 isawaited but if it throws (e.g., network error during auto-resolution), the error propagates as an unhandled exception with a raw error message rather than a user-friendlyUserError. The subsequentdatastoreTypeRegistry.has()check at line 119 would handle the "extension not found" case gracefully, but a transient network failure during resolution would surface as an opaque stack trace.Example: User runs
swamp datastore setup extension @swamp/s3-datastore --config '{}'while offline for the first time (extension not cached).resolveDatastoreTypethrows a network error, user sees a raw stack trace instead of the friendly "Install it with: swamp extension pull ..." message.Suggested fix: Wrap the
resolveDatastoreTypecall in try/catch, letting it fall through to the!datastoreTypeRegistry.has()check which already has a clean error message. -
src/libswamp/datastores/setup.ts:302-306—resolveCachePathfallback uses"unknown"whenrepoIdis absentWhen
input.repoIdis undefined (which can happen — see the??fallback), the cache path becomes.swamp/repos/unknown. If two repos both lack IDs and set up extension datastores, they'll share the same cache directory, causing data cross-contamination during migration.Example: Two repos without repoId (e.g., during init race condition or corrupted marker) → both get cache at
~/.swamp/repos/unknown/→ migration from repo A overwrites repo B's cache.Suggested fix: This is a pre-existing pattern (the old S3 code had the same fallback), so it's not a regression. But since you're rewriting the setup flow, consider generating a random fallback ID or erroring when
repoIdis missing. -
src/libswamp/datastores/setup.ts:292-296—updateRepoConfigis called before migration, but migration errors don't roll back the configIf
updateRepoConfigsucceeds (writes extension type to.swamp.yaml) but the subsequent migration or push fails, the repo is left with a.swamp.yamlpointing to the extension datastore but with data still only in the old location. Theerrorsarray captures push failures but the "completed" event is still emitted.Example:
pushChanged()throws. User sees a "completed" event with errors..swamp.yamlnow saystype: @swamp/s3-datastorebut data never made it to S3. Next command tries to pull from an empty remote store.Mitigation: This was also the behavior in the old S3 setup code (config was written before push), so it's not a regression. The errors array does communicate the failure. But worth noting the user could be confused by "completed with errors."
Low
-
src/cli/commands/datastore_setup.ts:128-129—JSON.parsewith user-supplied--configcould throw on non-string values
Theas Record<string, unknown>cast afterJSON.parsedoesn't validate the parsed value is actually an object.JSON.parse('"hello"')returns a string,JSON.parse('42')returns a number. Both would pass the parse but fail later when the provider tries to use the config as a Record.Not a security issue — Cliffy guarantees
options.configis a string — butJSON.parse("true")would produce a boolean that silently passes through until the provider blows up with an opaque error.Suggested fix: Add
typeof config !== "object" || config === null || Array.isArray(config)check after parse. -
Removed
already_existscheck for S3 datastores — The olddatastoreSetupS3had acheckS3DatastoreExistsguard that prevented accidentally pointing a new repo at an existing S3 datastore. The newdatastoreSetupExtensionhas no equivalent. If an extension provider doesn't implement its own "already exists" check, users can accidentally clobber remote data during setup migration. This is a behavioral change, not a bug — the extension is now responsible for this check.
Verdict
PASS — Clean removal of built-in S3 code in favor of the extension pattern. No dangling imports, no stale type references, no broken code paths. The isCustomDatastoreConfig type guard correctly updated to treat type !== "filesystem" as custom. The DatastoreConfig union properly simplified. All callers consistently route through extension provider pattern. The medium findings are pre-existing patterns or edge cases, not regressions.
Summary
@swamp/s3-datastoreextension (published as v2026.03.28.1)swamp datastore setup extension <type> --config '{...}'command@aws-sdk/client-s3from the binary (smaller binary, fewer transitive deps)Follows the same pattern as #736 (vault providers → extensions). Builds on #910 which added auto-resolution and name remapping.
What changed
Deleted (6 infrastructure files + tests)
s3_client.ts,s3_client_test.ts,s3_lock.ts,s3_lock_test.ts,s3_cache_sync.ts,s3_datastore_verifier.tsType system
S3DatastoreConfig,S3ConnectionConfigfromdatastore_config.tsDatastoreConfigunion is nowFilesystemDatastoreConfig | CustomDatastoreConfigisCustomDatastoreConfigchecksconfig.type !== "filesystem"(S3 is now custom)BUILT_IN_DATASTORE_TYPES(onlyfilesystemremains)CLI
swamp datastore setup s3commandswamp datastore setup extension <type> --config '{...}'— works with any extension datastoreswamp datastore setup extension s3 --config '...'→@swamp/s3-datastoreRuntime
config.type === "s3"branches fromrepo_context.ts,sync.ts,status.ts,setup.tsisCustomDatastoreConfig→ extension provider handles lock, sync, healthDependencies
@aws-sdk/client-s3fromdeno.jsonDesign choices
Why the provider pattern is structurally better
The old code created
S3Clientat 6+ call sites, manually destructuring config fields. #917 foundendpointandforcePathStylewere silently dropped at several sites, breaking S3-compatible stores. The extension stores config once at construction — this class of bug can't recur.Sync behavioral differences (intentional)
Built-in used
pullAll()/pushAll(). Extension usespullChanged()/pushChanged()via theDatastoreSyncServiceinterface. The "changed" variants are better for normal use (only transfer what's needed) and equivalent for empty caches. The built-in also hadpullChangedForModel()for per-model scoped pulls — an optimization not available through the generic interface. Acceptable trade-off; can be added later if large datastores need it.datastore statusno longer shows S3-specific fieldsBucket/prefix/region/endpoint no longer displayed. The generic interface doesn't expose backend-specific details. Health checks still work via
createVerifier().Known limitation: offline + legacy config
If
type: s3in.swamp.yaml, user is offline, and extension isn't cached →resolveDatastoreConfigthrows. Error message is clear: "Install it with:swamp extension pull @swamp/s3-datastore". Edge case — extension is cached after first use.User impact
type: s3configs keep working — auto-remapped with deprecation warningswamp datastore setup s3replaced byswamp datastore setup extension @swamp/s3-datastore --config '{...}'Test plan
deno check— cleandeno lint— 910 files, cleandeno fmt— 998 files, cleandeno run test— 3672 passed, 0 faileddeno run compile— binary compiles without@aws-sdk/client-s3datastoreSetupExtensiongenerator🤖 Generated with Claude Code