fix: skip extension loading for help/version commands, parallelize loaders (#738) by stack72 · Pull Request #741 · systeminit/swamp

stack72 · 2026-03-17T21:01:37Z

Problem

Every CLI invocation — including swamp --help — was unconditionally loading all user extensions before running any command. This caused a ~42 second startup time, making even basic usage feel broken.

The root cause was four sequential loader calls, each independently:

Creating a RepoMarkerRepository and reading the marker file from disk
Creating an EmbeddedDenoRuntime and calling ensureDeno()
Spawning deno bundle subprocesses to compile extension files

With 4 sequential loaders and potentially many extension files, startup time was dominated by this overhead even when no extensions were needed.

Changes

1. Skip extension loading for commands that don't need it

Added commandNeedsExtensions() which checks the pre-parsed command against a set of commands that never use user extensions:

"", "help", "version", "completions", "init", "update", "auth", "telemetry", "issue"

swamp --help, swamp version, swamp completions bash, etc. are now instant.

2. Read marker and runtime once, share across all loaders

Previously each of the 4 loadUser* functions independently read the repo marker and created an EmbeddedDenoRuntime. Now the marker is read once (reusing the value already needed for resolveLogLevel) and a single EmbeddedDenoRuntime instance is shared — eliminating 3 redundant disk reads and 3 redundant runtime instantiations.

3. Run all 4 loaders in parallel

Replaced sequential await calls with Promise.all():

await Promise.all([
  loadUserModels(repoDir, marker, denoRuntime),
  loadUserVaults(repoDir, marker, denoRuntime),
  loadUserDrivers(repoDir, marker, denoRuntime),
  loadUserDatastores(repoDir, marker, denoRuntime),
]);

This is safe because each loader writes to its own registry, reads from a separate source directory, and writes to a separate bundle cache directory. ensureDeno() is idempotent (checks a version marker file before extracting). JavaScript's single-threaded event loop means Map.set() calls within registries are atomic within a tick.

User Impact

Command	Before	After
`swamp --help`	~42 seconds	< 1 second
`swamp version`	~42 seconds	< 1 second
`swamp completions bash`	~42 seconds	< 1 second
`swamp model type search aws`	~42 seconds	significantly faster (parallel loading)

Commands that need extensions (e.g. model, workflow, data, vault) still load all extensions, but now in parallel rather than sequentially.

Files Changed

src/cli/mod.ts — all changes in this single file

Test Plan

deno check — type checking passes
deno lint — no lint errors
deno fmt --check — formatting correct
deno run test — 3156 tests passed, 0 failed
deno run compile — binary compiled successfully

Fixes #738

…aders Every CLI invocation was unconditionally loading all user extensions (models, vaults, drivers, datastores) — each spawning deno bundle subprocesses and creating its own EmbeddedDenoRuntime. With 4 sequential loaders this caused ~42 second startup times, even for swamp --help. - Add commandNeedsExtensions() to skip loading for commands that never need user extensions: help, version, completions, init, update, auth, telemetry, issue - Read the repo marker once and share a single EmbeddedDenoRuntime across all loaders, eliminating 3 redundant marker reads and 3 redundant runtime instantiations - Run all 4 loaders in parallel with Promise.all() instead of sequential awaits Fixes #738 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

Address review feedback on PR #741: - Add 14 unit tests for commandNeedsExtensions() covering skip commands (help, version, completions, init, update, auth, telemetry, issue, bare invocation) and commands that need extensions (model, workflow, data), plus global flag variants - Fix race condition in EmbeddedDenoRuntime.ensureDeno(): concurrent callers (e.g. the 4 parallel extension loaders) all saw cachedPath=null on the same tick and each spawned their own extraction subprocess. Add extractionPromise field so all concurrent callers share a single extraction and await the same promise. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

github-actions

Review Summary

This is a well-implemented performance optimization that fixes the ~42 second startup time for swamp --help and other commands that don't need user extensions. The changes are clean, safe, and well-tested.

No Blocking Issues Found

Code Quality ✓

TypeScript strict mode: No any types introduced, proper typing throughout
Named exports: commandNeedsExtensions correctly exported for testing
License headers: All files have AGPLv3 headers

DDD Compliance ✓

EmbeddedDenoRuntime is appropriately in the infrastructure layer
Loader functions remain application-level orchestration
No domain logic violations - purely CLI/infrastructure optimization

Test Coverage ✓

Comprehensive tests added for commandNeedsExtensions():
- All 9 skip commands tested
- Commands that need extensions tested (model, workflow, data)
- Global flags before commands tested
Existing EmbeddedDenoRuntime tests verify caching behavior

Correctness ✓

SKIP_EXTENSION_COMMANDS: Conservative list of commands that definitely don't need extensions
Parallel Promise.all(): Safe because each loader writes to its own registry and has internal try-catch (never throws)
Shared EmbeddedDenoRuntime: Safe with new extractionPromise caching to prevent concurrent extractions
Shared marker: Good optimization - reads once instead of 4 times

Security ✓

No vulnerabilities introduced - purely a performance optimization with no new input handling.

Suggestions (Non-blocking)

Minor: Consider adding a test case for commandNeedsExtensions(["--help"]) to explicitly verify this edge case (though it's implicitly covered since extractCommandInfo(["--help"]) returns {command: ""} which matches the empty string in SKIP_EXTENSION_COMMANDS).

CI passing (lint, test, format, dependency audit). Good to merge!

github-actions

Adversarial Review

I systematically attempted to break this code across logic, error handling, concurrency, and data integrity dimensions.

Critical / High

None found.

Medium

1. Two separate EmbeddedDenoRuntime instances are created (src/cli/mod.ts:510 and src/cli/mod.ts:538)

When both commandNeedsExtensions(args) is true AND trustedCollectives.length > 0 && marker is true (the common case), two separate EmbeddedDenoRuntime instances are created:

// Line 510 - for extension loading
if (commandNeedsExtensions(args)) {
  const denoRuntime = new EmbeddedDenoRuntime();  // instance 1
  await Promise.all([...]);
}

// Line 538 - for auto-resolver
if (trustedCollectives.length > 0 && marker) {
  const denoRuntime = new EmbeddedDenoRuntime();  // instance 2
  ...
}

While functionally correct (the file-level version marker prevents double extraction), this means:

Redundant object instantiation
Each instance maintains its own cachedPath and extractionPromise state
The second instance will re-check the disk even though extraction is already complete

Suggested fix: Move EmbeddedDenoRuntime creation before the if (commandNeedsExtensions(args)) block and share the single instance across both usages:

const denoRuntime = new EmbeddedDenoRuntime();

if (commandNeedsExtensions(args)) {
  await Promise.all([
    loadUserModels(repoDir, marker, denoRuntime),
    ...
  ]);
}
// ... later ...
if (trustedCollectives.length > 0 && marker) {
  setAutoResolver(new ExtensionAutoResolver({
    extensionInstaller: createAutoResolveInstallerAdapter({
      denoRuntime,  // reuse same instance
      ...
    }),
  }));
}

Low

1. Test coverage gap for --option=value syntax (src/cli/mod_test.ts)

The tests verify --json (a boolean flag in GLOBAL_OPTIONS) works correctly, but don't test value-taking options with = syntax like --log-level=debug. The underlying extractCommandInfo() function has a pre-existing parsing quirk where --option=value style args incorrectly skip the next positional argument.

Example: ["--log-level=debug", "model"] would parse command as "" because model is skipped as a presumed option value.

This doesn't affect extension loading correctness (empty command still skips extensions, which is correct behavior for that edge case), but a test documenting this behavior would be valuable.

2. TOCTOU in file-level extraction check (src/infrastructure/runtime/embedded_deno_runtime.ts:74-85)

The version marker check followed by extraction has a theoretical race:

const existingVersion = (await Deno.readTextFile(versionMarker)).trim();
if (existingVersion === embeddedVersion.value) {
  await Deno.stat(targetBinary);
  return targetBinary;  // early return
}
// ... proceeds to extract

If two separate EmbeddedDenoRuntime instances (from the Medium issue above) both check simultaneously before either completes extraction, both could proceed to extract. This is harmless (identical content written twice) but wasteful. The promise-sharing fix within a single instance is correct; the issue only manifests with multiple instances.

Verdict

PASS — This is a well-implemented performance optimization. The parallel loading via Promise.all() is safe (separate directories and registries), the extraction promise sharing correctly prevents redundant subprocess spawns within an instance, and error handling is robust (all loaders catch errors internally). The Medium issue is a minor efficiency concern, not a correctness bug.

…ecution (#743) ## Problem With 104+ user extension models, `swamp model type search` (and any other command that loads extensions) took **~46 seconds** to start. After the fix in #741 that parallelised the 4 loader types and skipped loading for help/version commands, the remaining bottleneck was in `UserModelLoader` itself. At startup, `loadModels()` eagerly built a **self-contained bundle** for every model file — a separate `deno bundle` subprocess per model that inlines all dependencies (including zod) so the bundle can run inside Docker containers without network access. With 104 models, this was 104 sequential subprocesses on every single CLI invocation, regardless of whether Docker execution would ever be used. ## Architecture Decision & Tradeoffs ### What changed `bundleSource?: string` on `ModelDefinition` (a pre-built JS string stored at load time) has been replaced with `bundleSourceFactory?: () => Promise<string>` — a memoizing closure that defers the expensive work to the point of actual need. ```typescript // Before: runs at startup, for every model, every invocation modelDef.bundleSource = await bundleExtension(absolutePath, denoPath, { selfContained: true }); // After: closure set at load time, executed only on first Docker execution let cachedBundle: string | undefined; modelDef.bundleSourceFactory = async () => { if (!cachedBundle) { cachedBundle = await bundleExtension(absolutePath, denoPath, { selfContained: true }); } return cachedBundle; }; ``` ### Tradeoff: startup cost vs. first-execution cost | | Before | After | |---|---|---| | Every CLI invocation | Pays bundling cost for all N models | Pays nothing | | First Docker execution of model A | Already paid at startup | Pays bundling cost for model A only | | Second Docker execution of model A | Pre-built | Memoized in-process — instant | The first out-of-process execution of a given model will be slightly slower than before — it now bundles on demand rather than having it pre-built. This is the right tradeoff because: 1. **Docker execution is rare** relative to everyday CLI usage (`type search`, `model get`, `data list`, etc.) 2. **Cost is proportional to actual need** — only the models you actually run out-of-process are ever bundled 3. **Memoization** ensures the cost is paid at most once per model per process invocation 4. **The startup tax was paid unconditionally** regardless of what command you ran — even read-only commands that never touch Docker ### Why only models? Vaults, drivers, and datastores do not create self-contained bundles at all — they only use the externalized (cached) bundle for in-process execution. This change is model-specific because only models support out-of-process/Docker execution via `bundleSource`. ## User Impact **Measured on a real repo with 104 extension models:** | Command | Before | After | |---|---|---| | `swamp model type search aws` | ~46 seconds | **~2.8 seconds** | | `swamp model get my-model` | ~46 seconds | **~2.8 seconds** | | `swamp data list` | ~46 seconds | **~2.8 seconds** | | First Docker `model run` | Instant (pre-built) | ~same as before (built on demand) | **16x speedup** for the everyday command path. The remaining ~2.8s is the warm-cache cost of the externalized bundle loading (disk reads + dynamic imports for 104 files), which is a separate optimization opportunity. ## Files Changed | File | Change | |---|---| | `src/domain/models/model.ts` | `bundleSource?: string` → `bundleSourceFactory?: () => Promise<string>` | | `src/domain/models/user_model_loader.ts` | Remove eager bundling, set memoizing factory closure | | `src/domain/models/method_execution_service.ts` | Await `bundleSourceFactory?.()` at execution time | ## Test Plan - [x] `deno check` — type checking passes - [x] `deno lint` — no lint errors - [x] `deno fmt --check` — formatting correct - [x] `deno run test` — 3175 tests passed, 0 failed - [x] `deno run compile` — binary compiled successfully - [x] Manual: `swamp model type search` in 104-model repo — 46s → 2.8s --------- Co-authored-by: Claude Sonnet 4.6 <[email protected]>

This comment was marked as outdated.

Sign in to view

github-actions Bot approved these changes Mar 17, 2026

View reviewed changes

github-actions Bot reviewed Mar 17, 2026

View reviewed changes

stack72 merged commit ede9f2a into main Mar 17, 2026
6 checks passed

stack72 deleted the fix/slow-help-startup branch March 17, 2026 21:20

This was referenced Mar 17, 2026

perf: defer self-contained bundle creation to first out-of-process execution #743

Merged

Persistent runner / server mode to eliminate per-invocation CLI startup overhead #730

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: skip extension loading for help/version commands, parallelize loaders (#738)#741

fix: skip extension loading for help/version commands, parallelize loaders (#738)#741
stack72 merged 2 commits intomainfrom
fix/slow-help-startup

stack72 commented Mar 17, 2026

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

stack72 commented Mar 17, 2026

Problem

Changes

1. Skip extension loading for commands that don't need it

2. Read marker and runtime once, share across all loaders

3. Run all 4 loaders in parallel

User Impact

Files Changed

Test Plan

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Review Summary

No Blocking Issues Found

Code Quality ✓

DDD Compliance ✓

Test Coverage ✓

Correctness ✓

Security ✓

Suggestions (Non-blocking)

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Adversarial Review

Critical / High

Medium

Low

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant