Skip to content

fix: resolve bundle paths through DatastorePathResolver for S3 datastores#1103

Merged
stack72 merged 1 commit intomainfrom
fix/1100-s3-datastore-bundle-path-resolution
Apr 4, 2026
Merged

fix: resolve bundle paths through DatastorePathResolver for S3 datastores#1103
stack72 merged 1 commit intomainfrom
fix/1100-s3-datastore-bundle-path-resolution

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented Apr 4, 2026

Summary

  • Fix bundle path resolution failure after migrating from filesystem to S3 datastore
  • Wire DatastorePathResolver into 4 extension bundle loaders (model, vault, driver, report) so bundle reads/writes route through the configured datastore tier
  • Add catalog invalidation on datastore migration to prevent stale bundle paths

Fixes #1100

Root Cause

All 5 extension bundle loaders (UserModelLoader, UserVaultLoader, UserDriverLoader, UserDatastoreLoader, UserReportLoader) hardcoded bundle paths to local .swamp/bundles/ via join(repoDir, SWAMP_DATA_DIR, SWAMP_SUBDIRS.bundles, ...). The DatastorePathResolver — which already correctly routed data artifacts, workflow runs, secrets, and other datastore-tier paths — was never consulted for bundles.

After S3 datastore migration, bundles existed only in the cache path (~/.swamp/repos/<repo-id>/bundles/) but the loaders still looked in .swamp/bundles/, producing No such file or directory (os error 2).

What Changed

1. DEFAULT_DATASTORE_SUBDIRS (datastore_config.ts)

Added driver-bundles, datastore-bundles, and report-bundles to the default datastore subdirectories list. Without this, the resolver would treat these 3 bundle types as local-only, even when an S3 datastore is configured. (bundles and vault-bundles were already listed.)

2. Bundle Loaders — 4 files

Added optional DatastorePathResolver constructor parameter and a private resolveBundlePath() helper to each loader. The helper delegates to resolver.resolvePath() when available, falling back to the hardcoded local path when absent (backward compatibility for tests, repo init, no datastore configured).

Replaced all hardcoded join(repoDir, SWAMP_DATA_DIR, SWAMP_SUBDIRS.*, ...) bundle path constructions with resolveBundlePath():

  • UserModelLoader: getBundlePath(), bundleWithCache(), importBundle(), populateCatalogFromRegistry(), assertSafePath boundary. Excluded migrateOldFlatBundles() which must always read from the old local path.
  • UserVaultLoader: bundleWithCache(), importBundle(), assertSafePath boundary
  • UserDriverLoader: bundleWithCache(), importBundle(), assertSafePath boundary
  • UserReportLoader: bundleWithCache(), importBundle(), assertSafePath boundary

UserDatastoreLoader is explicitly excluded — it loads datastore extensions that configure the resolver (bootstrap ordering). It always uses the local .swamp/ path.

3. CLI Wiring (mod.ts, auto_resolver_adapters.ts)

  • Created a lazy resolver factory in mod.ts that constructs the DefaultDatastorePathResolver on first loader invocation and caches it for reuse
  • Passed the factory to model, vault, driver, and report loaders via setLoader() closures
  • Datastore loader explicitly excluded with a comment explaining why
  • Updated auto_resolver_adapters.ts to accept an optional resolver for hot-load scenarios

4. Catalog Invalidation (extension_catalog_store.ts, user_model_loader.ts)

The ExtensionCatalogStore (SQLite at .swamp/_extension_catalog.db) stores absolute bundle_path values. After datastore migration, these paths become stale (pointing to old .swamp/bundles/ instead of the new cache path). Added:

  • getDatastoreBasePath() / setDatastoreBasePath() on the catalog store
  • Detection in buildIndex() that compares the stored base path against the current resolver path, auto-invalidating the catalog when they differ

5. Architecture boundary ratchet

The new models -> datastore type-only import (import type { DatastorePathResolver }) creates a mutual dependency with the existing datastore -> models import. Updated the ratchet from 11 to 12.

Test Plan

Automated Tests (all 4140 pass)

  • 3 new resolver tests (default_datastore_path_resolver_test.ts):

    • Bundle subdirs are recognized as datastore subdirs (all 5 types)
    • resolvePath() routes bundles to S3 cache path for custom datastores
    • Filesystem datastore resolves bundles to same local path (no behavioral change)
  • 1 new constructor test (user_model_loader_test.ts):

    • UserModelLoader accepts optional DatastorePathResolver (with and without)
  • All 4140 existing tests pass — the optional resolver with fallback preserves backward compatibility

Manual End-to-End Verification

Reproduced the exact scenario from the issue:

  1. Initialized scratch repo at /tmp/swamp-repro-issue-1100/
  2. Pulled @swamp/aws/ec2 extension — bundles written to .swamp/bundles/2e4ea9ae/
  3. Ran model method — succeeded (AWS auth error, not file-not-found — confirming bundles load correctly)
  4. Simulated S3 migration:
    • Moved .swamp/bundles/ to /tmp/swamp-s3-cache-1100/bundles/
    • Configured filesystem datastore pointing to /tmp/swamp-s3-cache-1100/ in .swamp.yaml
  5. Ran model method with OLD binary (pre-fix) — FAILED with No such file or directory (os error 2): readfile '/private/tmp/swamp-repro-issue-1100/.swamp/bundles/2e4ea9ae/vpc.js' — confirming the bug
  6. Ran model method with NEW binary (this fix) — SUCCEEDED — the resolver correctly routed to /tmp/swamp-s3-cache-1100/bundles/2e4ea9ae/vpc.js
  7. Verified catalog auto-invalidation — on first run after migration, catalog detected the base path change, invalidated stale entries, and rebuilt with correct paths. Subsequent runs used the updated catalog without re-invalidation.

Risk Assessment

  • Filesystem datastore users: Zero behavioral change — resolver returns the same .swamp/ path
  • No datastore configured: Resolver is undefined, falls back to hardcoded path
  • Datastore loader bootstrap: Explicitly excluded from resolver wiring — datastore extensions always load from local path
  • No data migration: Catalog paths are recomputed on each buildIndex() call; auto-invalidation handles stale entries

🤖 Generated with Claude Code

…ores (#1100)

After migrating from filesystem to S3 datastore, model method execution
failed because all extension bundle loaders hardcoded bundle paths to
local `.swamp/bundles/` instead of resolving through the
DatastorePathResolver. The resolver already correctly handled path
routing for data artifacts, workflow runs, and secrets — but bundles
were bypassed.

Changes:

- Add `driver-bundles`, `datastore-bundles`, `report-bundles` to
  DEFAULT_DATASTORE_SUBDIRS so the resolver routes all bundle types
  to the datastore tier

- Add optional DatastorePathResolver parameter to UserModelLoader,
  UserVaultLoader, UserDriverLoader, and UserReportLoader constructors
  with a `resolveBundlePath()` helper that delegates to the resolver
  when present, falling back to the local path when absent

- Replace all hardcoded `join(repoDir, SWAMP_DATA_DIR, SWAMP_SUBDIRS.*)`
  bundle path constructions with `resolveBundlePath()` calls across all
  4 loaders (model, vault, driver, report)

- Wire a lazy resolver factory through mod.ts that constructs the
  resolver on first loader invocation and shares it across all loaders.
  UserDatastoreLoader is explicitly excluded (bootstrap ordering — it
  loads datastore extensions that configure the resolver itself)

- Add datastore base path tracking to ExtensionCatalogStore so stale
  catalog entries are auto-invalidated when the datastore changes
  (e.g. after migration from filesystem to S3)

- Update assertSafePath boundary checks to use the resolver-aware
  bundle base path

Co-authored-by: Sean Escriva <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured bug fix that correctly wires DatastorePathResolver into the 4 extension bundle loaders (model, vault, driver, report) while correctly excluding the datastore loader due to bootstrap ordering. The lazy resolver factory with caching in mod.ts is a clean approach.

Blocking Issues

None.

Suggestions

  1. Silent error swallowing in lazyResolver (src/cli/mod.ts): The .catch(() => undefined) on resolveDatastoreConfig silently falls back to local paths for any error, including legitimate configuration problems (e.g., malformed .swamp.yaml). Consider logging a warning in the catch handler so users can diagnose misconfiguration, rather than silently falling back.

  2. Catalog invalidation test coverage: The new getDatastoreBasePath()/setDatastoreBasePath() methods on ExtensionCatalogStore and the invalidation logic in buildIndex() (lines ~497-510 in user_model_loader.ts) aren't directly tested. A unit test that populates the catalog, changes the base path, and verifies re-invalidation would guard against regressions in this logic.

  3. resolveBundlePath() duplication: The four loaders (model, vault, driver, report) have near-identical resolveBundlePath() implementations. This is fine for now given the fix scope, but if bundle path resolution grows more complex, consider extracting a shared helper.

  4. DDD: The type-only import from models -> datastore (bumping the ratchet to 12) is the right call — it keeps the coupling lightweight (no runtime dependency) while enabling the resolver injection. The domain interface (DatastorePathResolver) stays in the datastore bounded context where it belongs, and loaders depend on the abstraction rather than the infrastructure implementation. Clean dependency inversion.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found. The code is well-structured and the guards are correct.

Medium

  1. src/cli/mod.ts:644 — Silent catch-all swallows real configuration errors. The lazy resolver factory uses .catch(() => undefined), which treats "no datastore configured" identically to "malformed YAML", "permission denied on config file", or "S3 auth failure". If a user has a broken datastore config, they'll silently fall back to local paths with no indication that their S3 datastore is being bypassed. Consider catching only the expected "no config" error class and logging a warning for unexpected failures:

    .catch((err) => {
      if (!(err instanceof NoDatastoreConfiguredError)) { // or whatever the expected error type is
        logger.warn`Failed to create datastore path resolver, falling back to local paths: ${err}`;
      }
      return undefined;
    });
  2. src/domain/models/user_model_loader.ts:504-512 — First run after upgrade forces catalog invalidation on ALL installations. On the first run after this PR ships, getDatastoreBasePath() returns undefined (never been set) while currentBasePath is a non-empty string, so undefined !== "/repo/.swamp/bundles" is true. This means every existing installation will get a one-time forced catalog rescan on upgrade, even for users who have never used S3. This is a one-time cost and the comment + design make it clear this is intentional for correctness. Mentioning it only because it's non-obvious — it may cause a noticeable first-run delay on repos with many extensions.

Low

  1. resolveBundlePath() returns "" when repoDir is null. All four loaders have this path (if (!this.repoDir) return ""). While currently all callers guard with if (this.repoDir) before reaching resolveBundlePath(), passing "" to assertSafePath as a boundary would resolve to the CWD and cause a spurious PathTraversalError. This is safe today due to the guards, but the empty-string return is a foot-gun if future callers forget the guard. A defensive alternative would be to throw instead of returning "", since a null repoDir reaching this point would always be a programming error.

  2. assertSafePath boundary is now narrower. Previously bundleBoundary = join(repoDir, SWAMP_DATA_DIR) (e.g., /repo/.swamp), now it's resolveBundlePath() (e.g., /repo/.swamp/bundles). This is strictly tighter — a security improvement, not a concern. Noting it because it's a behavioral change in path validation that could theoretically reject a path that the old code accepted, though no such path should exist in practice.

Verdict

PASS — Clean, well-guarded refactor. The resolver is correctly excluded from the datastore loader bootstrap, all four loaders follow a consistent pattern, the catalog invalidation is sound, and the fallback behavior preserves backward compatibility. The silent error swallowing (Medium #1) is the most actionable item but does not block.

@stack72 stack72 merged commit f086cc8 into main Apr 4, 2026
10 checks passed
@stack72 stack72 deleted the fix/1100-s3-datastore-bundle-path-resolution branch April 4, 2026 21:09
stack72 added a commit that referenced this pull request Apr 6, 2026
After rebasing onto main (which added DatastorePathResolver in #1103),
update the new catalog-related methods (populateCatalogFromRegistry,
getVaultBundlePath, getDriverBundlePath, getReportBundlePath) to use
this.resolveBundlePath() instead of hardcoded join() paths. This ensures
lazy loading works correctly with S3 datastores.

The datastore loader intentionally keeps hardcoded local paths due to
bootstrap ordering constraints.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

S3 datastore: bundle path resolution fails after datastore migration

1 participant