fix: skip extension loading for help/version commands, parallelize loaders (#738)#741
fix: skip extension loading for help/version commands, parallelize loaders (#738)#741
Conversation
…aders Every CLI invocation was unconditionally loading all user extensions (models, vaults, drivers, datastores) — each spawning deno bundle subprocesses and creating its own EmbeddedDenoRuntime. With 4 sequential loaders this caused ~42 second startup times, even for swamp --help. - Add commandNeedsExtensions() to skip loading for commands that never need user extensions: help, version, completions, init, update, auth, telemetry, issue - Read the repo marker once and share a single EmbeddedDenoRuntime across all loaders, eliminating 3 redundant marker reads and 3 redundant runtime instantiations - Run all 4 loaders in parallel with Promise.all() instead of sequential awaits Fixes #738 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Address review feedback on PR #741: - Add 14 unit tests for commandNeedsExtensions() covering skip commands (help, version, completions, init, update, auth, telemetry, issue, bare invocation) and commands that need extensions (model, workflow, data), plus global flag variants - Fix race condition in EmbeddedDenoRuntime.ensureDeno(): concurrent callers (e.g. the 4 parallel extension loaders) all saw cachedPath=null on the same tick and each spawned their own extraction subprocess. Add extractionPromise field so all concurrent callers share a single extraction and await the same promise. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
There was a problem hiding this comment.
Review Summary
This is a well-implemented performance optimization that fixes the ~42 second startup time for swamp --help and other commands that don't need user extensions. The changes are clean, safe, and well-tested.
No Blocking Issues Found
Code Quality ✓
- TypeScript strict mode: No
anytypes introduced, proper typing throughout - Named exports:
commandNeedsExtensionscorrectly exported for testing - License headers: All files have AGPLv3 headers
DDD Compliance ✓
EmbeddedDenoRuntimeis appropriately in the infrastructure layer- Loader functions remain application-level orchestration
- No domain logic violations - purely CLI/infrastructure optimization
Test Coverage ✓
- Comprehensive tests added for
commandNeedsExtensions():- All 9 skip commands tested
- Commands that need extensions tested (model, workflow, data)
- Global flags before commands tested
- Existing
EmbeddedDenoRuntimetests verify caching behavior
Correctness ✓
SKIP_EXTENSION_COMMANDS: Conservative list of commands that definitely don't need extensions- Parallel
Promise.all(): Safe because each loader writes to its own registry and has internal try-catch (never throws) - Shared
EmbeddedDenoRuntime: Safe with newextractionPromisecaching to prevent concurrent extractions - Shared marker: Good optimization - reads once instead of 4 times
Security ✓
No vulnerabilities introduced - purely a performance optimization with no new input handling.
Suggestions (Non-blocking)
- Minor: Consider adding a test case for
commandNeedsExtensions(["--help"])to explicitly verify this edge case (though it's implicitly covered sinceextractCommandInfo(["--help"])returns{command: ""}which matches the empty string inSKIP_EXTENSION_COMMANDS).
CI passing (lint, test, format, dependency audit). Good to merge!
There was a problem hiding this comment.
Adversarial Review
I systematically attempted to break this code across logic, error handling, concurrency, and data integrity dimensions.
Critical / High
None found.
Medium
1. Two separate EmbeddedDenoRuntime instances are created (src/cli/mod.ts:510 and src/cli/mod.ts:538)
When both commandNeedsExtensions(args) is true AND trustedCollectives.length > 0 && marker is true (the common case), two separate EmbeddedDenoRuntime instances are created:
// Line 510 - for extension loading
if (commandNeedsExtensions(args)) {
const denoRuntime = new EmbeddedDenoRuntime(); // instance 1
await Promise.all([...]);
}
// Line 538 - for auto-resolver
if (trustedCollectives.length > 0 && marker) {
const denoRuntime = new EmbeddedDenoRuntime(); // instance 2
...
}While functionally correct (the file-level version marker prevents double extraction), this means:
- Redundant object instantiation
- Each instance maintains its own
cachedPathandextractionPromisestate - The second instance will re-check the disk even though extraction is already complete
Suggested fix: Move EmbeddedDenoRuntime creation before the if (commandNeedsExtensions(args)) block and share the single instance across both usages:
const denoRuntime = new EmbeddedDenoRuntime();
if (commandNeedsExtensions(args)) {
await Promise.all([
loadUserModels(repoDir, marker, denoRuntime),
...
]);
}
// ... later ...
if (trustedCollectives.length > 0 && marker) {
setAutoResolver(new ExtensionAutoResolver({
extensionInstaller: createAutoResolveInstallerAdapter({
denoRuntime, // reuse same instance
...
}),
}));
}Low
1. Test coverage gap for --option=value syntax (src/cli/mod_test.ts)
The tests verify --json (a boolean flag in GLOBAL_OPTIONS) works correctly, but don't test value-taking options with = syntax like --log-level=debug. The underlying extractCommandInfo() function has a pre-existing parsing quirk where --option=value style args incorrectly skip the next positional argument.
Example: ["--log-level=debug", "model"] would parse command as "" because model is skipped as a presumed option value.
This doesn't affect extension loading correctness (empty command still skips extensions, which is correct behavior for that edge case), but a test documenting this behavior would be valuable.
2. TOCTOU in file-level extraction check (src/infrastructure/runtime/embedded_deno_runtime.ts:74-85)
The version marker check followed by extraction has a theoretical race:
const existingVersion = (await Deno.readTextFile(versionMarker)).trim();
if (existingVersion === embeddedVersion.value) {
await Deno.stat(targetBinary);
return targetBinary; // early return
}
// ... proceeds to extractIf two separate EmbeddedDenoRuntime instances (from the Medium issue above) both check simultaneously before either completes extraction, both could proceed to extract. This is harmless (identical content written twice) but wasteful. The promise-sharing fix within a single instance is correct; the issue only manifests with multiple instances.
Verdict
PASS — This is a well-implemented performance optimization. The parallel loading via Promise.all() is safe (separate directories and registries), the extraction promise sharing correctly prevents redundant subprocess spawns within an instance, and error handling is robust (all loaders catch errors internally). The Medium issue is a minor efficiency concern, not a correctness bug.
…ecution (#743) ## Problem With 104+ user extension models, `swamp model type search` (and any other command that loads extensions) took **~46 seconds** to start. After the fix in #741 that parallelised the 4 loader types and skipped loading for help/version commands, the remaining bottleneck was in `UserModelLoader` itself. At startup, `loadModels()` eagerly built a **self-contained bundle** for every model file — a separate `deno bundle` subprocess per model that inlines all dependencies (including zod) so the bundle can run inside Docker containers without network access. With 104 models, this was 104 sequential subprocesses on every single CLI invocation, regardless of whether Docker execution would ever be used. ## Architecture Decision & Tradeoffs ### What changed `bundleSource?: string` on `ModelDefinition` (a pre-built JS string stored at load time) has been replaced with `bundleSourceFactory?: () => Promise<string>` — a memoizing closure that defers the expensive work to the point of actual need. ```typescript // Before: runs at startup, for every model, every invocation modelDef.bundleSource = await bundleExtension(absolutePath, denoPath, { selfContained: true }); // After: closure set at load time, executed only on first Docker execution let cachedBundle: string | undefined; modelDef.bundleSourceFactory = async () => { if (!cachedBundle) { cachedBundle = await bundleExtension(absolutePath, denoPath, { selfContained: true }); } return cachedBundle; }; ``` ### Tradeoff: startup cost vs. first-execution cost | | Before | After | |---|---|---| | Every CLI invocation | Pays bundling cost for all N models | Pays nothing | | First Docker execution of model A | Already paid at startup | Pays bundling cost for model A only | | Second Docker execution of model A | Pre-built | Memoized in-process — instant | The first out-of-process execution of a given model will be slightly slower than before — it now bundles on demand rather than having it pre-built. This is the right tradeoff because: 1. **Docker execution is rare** relative to everyday CLI usage (`type search`, `model get`, `data list`, etc.) 2. **Cost is proportional to actual need** — only the models you actually run out-of-process are ever bundled 3. **Memoization** ensures the cost is paid at most once per model per process invocation 4. **The startup tax was paid unconditionally** regardless of what command you ran — even read-only commands that never touch Docker ### Why only models? Vaults, drivers, and datastores do not create self-contained bundles at all — they only use the externalized (cached) bundle for in-process execution. This change is model-specific because only models support out-of-process/Docker execution via `bundleSource`. ## User Impact **Measured on a real repo with 104 extension models:** | Command | Before | After | |---|---|---| | `swamp model type search aws` | ~46 seconds | **~2.8 seconds** | | `swamp model get my-model` | ~46 seconds | **~2.8 seconds** | | `swamp data list` | ~46 seconds | **~2.8 seconds** | | First Docker `model run` | Instant (pre-built) | ~same as before (built on demand) | **16x speedup** for the everyday command path. The remaining ~2.8s is the warm-cache cost of the externalized bundle loading (disk reads + dynamic imports for 104 files), which is a separate optimization opportunity. ## Files Changed | File | Change | |---|---| | `src/domain/models/model.ts` | `bundleSource?: string` → `bundleSourceFactory?: () => Promise<string>` | | `src/domain/models/user_model_loader.ts` | Remove eager bundling, set memoizing factory closure | | `src/domain/models/method_execution_service.ts` | Await `bundleSourceFactory?.()` at execution time | ## Test Plan - [x] `deno check` — type checking passes - [x] `deno lint` — no lint errors - [x] `deno fmt --check` — formatting correct - [x] `deno run test` — 3175 tests passed, 0 failed - [x] `deno run compile` — binary compiled successfully - [x] Manual: `swamp model type search` in 104-model repo — 46s → 2.8s --------- Co-authored-by: Claude Sonnet 4.6 <[email protected]>
Problem
Every CLI invocation — including
swamp --help— was unconditionally loading all user extensions before running any command. This caused a ~42 second startup time, making even basic usage feel broken.The root cause was four sequential loader calls, each independently:
RepoMarkerRepositoryand reading the marker file from diskEmbeddedDenoRuntimeand callingensureDeno()deno bundlesubprocesses to compile extension filesWith 4 sequential loaders and potentially many extension files, startup time was dominated by this overhead even when no extensions were needed.
Changes
1. Skip extension loading for commands that don't need it
Added
commandNeedsExtensions()which checks the pre-parsed command against a set of commands that never use user extensions:swamp --help,swamp version,swamp completions bash, etc. are now instant.2. Read marker and runtime once, share across all loaders
Previously each of the 4
loadUser*functions independently read the repo marker and created anEmbeddedDenoRuntime. Now the marker is read once (reusing the value already needed forresolveLogLevel) and a singleEmbeddedDenoRuntimeinstance is shared — eliminating 3 redundant disk reads and 3 redundant runtime instantiations.3. Run all 4 loaders in parallel
Replaced sequential
awaitcalls withPromise.all():This is safe because each loader writes to its own registry, reads from a separate source directory, and writes to a separate bundle cache directory.
ensureDeno()is idempotent (checks a version marker file before extracting). JavaScript's single-threaded event loop meansMap.set()calls within registries are atomic within a tick.User Impact
swamp --helpswamp versionswamp completions bashswamp model type search awsCommands that need extensions (e.g.
model,workflow,data,vault) still load all extensions, but now in parallel rather than sequentially.Files Changed
src/cli/mod.ts— all changes in this single fileTest Plan
deno check— type checking passesdeno lint— no lint errorsdeno fmt --check— formatting correctdeno run test— 3156 tests passed, 0 faileddeno run compile— binary compiled successfullyFixes #738