Skip to content

feat(drivers): finish repo-level defaultDriver — direct path + definition tier (swamp-club#165)#1221

Merged
stack72 merged 1 commit intomainfrom
feat/repo-driver-cleanup-165
Apr 24, 2026
Merged

feat(drivers): finish repo-level defaultDriver — direct path + definition tier (swamp-club#165)#1221
stack72 merged 1 commit intomainfrom
feat/repo-driver-cleanup-165

Conversation

@keeb
Copy link
Copy Markdown
Contributor

@keeb keeb commented Apr 24, 2026

Summary

Closes swamp-club#165. Completes the cleanup gaps left by #1219 so the repo-level defaultDriver priority chain now applies uniformly to both workflow and direct swamp model method run paths. Also fixes a concurrent-read race on the marker load and un-breaks the silently-dropped definition.driver tier.

Three deliverables, one commit:

  • Direct swamp model method run now honors defaultDriver. src/libswamp/models/run.ts routes through resolveDriverConfig with cli/definition/repo tiers. Serve/WS (src/serve/connection.ts) and HTTP SSE (src/serve/open/http.ts) paths inherit for free. A new loadRepoMarker dep on ModelMethodRunDeps is built via the extracted createRepoMarkerLoader helper.
  • Promise-memoize the marker read. Previously WorkflowExecutionService.loadRepoMarker cached the resolved value — two concurrent callers both saw undefined and both hit the file. Now this.markerPromise ??= markerRepo.read(...); .swamp.yaml is read at most once per service instance. The primitive is extracted to src/infrastructure/persistence/repo_marker_loader.ts and reused by all four factories (workflow service, CLI, serve/WS, HTTP SSE).
  • Make the definition tier live. src/domain/workflows/execution_service.ts:523 used to read ctx.driver ?? evaluatedDefinition.driver — dead code, because ctx.driver was always populated by the old upstream resolveDriverConfig call (with "raw" as its fallback), so the RHS never fired and definition.driver was silently dropped. StepExecutionContext now carries unresolved driverTiers: { cli, step, job, workflow, repo } and final resolution happens inside DefaultStepExecutor.executeModelMethod where evaluatedDefinition is in scope.

Behavior changes

  • swamp model method run <m> <meth> in a repo with defaultDriver: docker in .swamp.yaml now runs via docker without --driver. --driver raw override still wins.
  • Model definitions with driver: set at the definition level are now honored during workflow execution. Internal blast radius: zero — grep of extensions/**/manifest.yaml and extensions/**/*.yaml returned no hits. Third-party extensions that set driver: at the definition level will now apply it (this was the documented design from day one but was never wired).
  • Malformed .swamp.yaml now aborts every direct method run with a YAML parse error (previously direct runs didn't read the marker at all). Consistent with workflow behavior and documented in design/execution-drivers.md.

Tests added

  • src/infrastructure/persistence/repo_marker_loader_test.ts — concurrency primitive: two simultaneous calls through a Promise.withResolvers latch, assert exactly one underlying .read() call. Covers null-result caching too.
  • src/domain/models/method_execution_service_test.ts — pre-flight check receives the resolved driver/driverConfig from MethodContext (propagation audit).
  • src/libswamp/models/run_test.ts — four direct-path resolution tests: defaultDriver from marker wins; CLI override wins; raw fallback; definition tier applies when no higher tier set.
  • src/domain/workflows/execution_service_test.ts — existing three workflow-path resolution tests updated to assert on the new driverTiers shape.
  • integration/driver_resolution_test.ts — end-to-end CLI smoke (no docker required): defaultDriver honored, CLI override wins, malformed YAML aborts loudly.

Docs

  • design/execution-drivers.md — added a short note under "Resolution Priority" stating that the priority chain applies uniformly to workflow and direct paths and that malformed .swamp.yaml aborts at start-of-run.
  • Follow-up docs gap (not in this PR): the swamp-club manual reference at content/manual/reference/repository-configuration.md still does not document defaultDriver / defaultDriverConfig — pre-existing gap from feat(drivers): repo-level defaultDriver in .swamp.yaml (swamp-club#159) #1219. systeminit/swamp-club has GitHub issues disabled; flagging here so the gap can be routed to the swamp-club lab.

Out of scope (per issue body, separate follow-ups)

  • Named-fields refactor of resolveDriverConfig (positional DriverSource | undefined params).
  • WorkflowExecutionService.execute() option forwarding (currently drops driver).
  • Runtime schema validation of RepoMarkerData (the parseYaml(...) as RepoMarkerData cast).
  • UAT coverage for --driver / defaultDriver in systeminit/swamp-uat.

Test plan

  • deno check — passes
  • deno lint — passes
  • deno fmt --check — passes
  • deno run test — 4823/4823 pass
  • deno run compile — binary produced
  • Local smoke: scratch repo with defaultDriver: raw → honored; --driver raw beats defaultDriver: docker; malformed YAML aborts with exit code 1.

🤖 Generated with Claude Code

…finition tier live (swamp-club#165)

Closes swamp-club#165. Completes the gaps left by #1219 so the repo-level
`defaultDriver` priority chain applies uniformly across workflow and
direct model-method-run paths.

## Changes

- Extract `createRepoMarkerLoader` — a shared promise-memoized loader for
  `.swamp.yaml`. Used by `WorkflowExecutionService` (request-scoped field)
  and by the CLI / serve / HTTP factories for the new
  `ModelMethodRunDeps.loadRepoMarker` dep.
- `WorkflowExecutionService`: promise-memoize the marker read (fixes the
  concurrent-read race where two callers both saw `undefined`).
- `StepExecutionContext`: carry unresolved `driverTiers: { cli, step, job,
  workflow, repo }` instead of a resolved driver. Final resolution moves
  into `DefaultStepExecutor.executeModelMethod` where `evaluatedDefinition`
  is in scope — this makes the `definition` tier actually apply. Dead-code
  `ctx.driver ?? evaluatedDefinition.driver` pair at :523-524 removed.
- `modelMethodRun`: call `resolveDriverConfig` with cli/definition/repo
  tiers (no workflow/job/step outside a workflow) and pass resolved values
  into `buildMethodContext`. Serve/WS and HTTP paths inherit for free.

## Behavior changes

- Direct `swamp model method run` now honors `defaultDriver` from
  `.swamp.yaml`. CLI `--driver` override still wins.
- Model definitions with `driver:` set at the definition level are now
  honored during workflow execution (previously silently dropped). Grep
  shows zero internal `extensions/` models rely on this. Third-party
  extensions that declared `driver:` at the definition level will now
  apply it — intended and matches the documented priority chain.
- Malformed `.swamp.yaml` now aborts every direct method run with a YAML
  parse error, matching workflow behavior. Documented in
  `design/execution-drivers.md`.

## Tests

- `repo_marker_loader_test.ts`: concurrency primitive (`Promise.withResolvers`
  latched double reads — asserts one underlying `.read()` call across two
  concurrent invocations).
- `method_execution_service_test.ts`: pre-flight check receives the resolved
  `driver` / `driverConfig` from the MethodContext.
- `run_test.ts`: four direct-path resolution tests (defaultDriver wins, CLI
  override, raw fallback, definition tier).
- `execution_service_test.ts`: existing three workflow-path resolution tests
  updated for the `driverTiers` shape.
- `integration/driver_resolution_test.ts`: end-to-end CLI smoke — defaultDriver
  honored, CLI override wins, malformed YAML aborts loudly.

## Docs

- `design/execution-drivers.md`: note that the priority chain applies
  uniformly to workflow and direct paths, and that malformed `.swamp.yaml`
  aborts at start-of-run.
- Follow-up docs gap (not in this PR): the swamp-club manual reference at
  `content/manual/reference/repository-configuration.md` does not document
  `defaultDriver` / `defaultDriverConfig` — pre-existing gap from #1219.
  `systeminit/swamp-club` has GitHub issues disabled; flagging here so
  the gap can be routed to the swamp-club lab.

## Out of scope (separate follow-ups, per issue body)

- Named-fields refactor of `resolveDriverConfig`
- `WorkflowExecutionService.execute()` option forwarding
- Runtime schema validation of `RepoMarkerData`
- UAT coverage for `--driver` / `defaultDriver`

## Test plan

- [x] `deno check` — passes
- [x] `deno lint` — passes
- [x] `deno fmt --check` — passes
- [x] `deno run test` — 4823/4823 pass
- [x] `deno run compile` — binary produced
- [x] Local smoke: `swamp model method run` in scratch repo with
      `defaultDriver: raw` honors default; `--driver raw` overrides
      `defaultDriver: docker`; malformed `.swamp.yaml` aborts.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Malformed .swamp.yaml error lacks file context (src/infrastructure/persistence/repo_marker_repository.ts, src/infrastructure/persistence/repo_marker_loader.ts): When .swamp.yaml has a YAML parse error, the user sees Method execution failed: <raw yaml parse error> (e.g. Method execution failed: unexpected end of the stream within a single-quoted scalar at line 1, column 19). The word "yaml" does appear, satisfying the integration test, but the user isn't told which file to fix. Wrapping the thrown error in repo_marker_repository.ts#read() or repo_marker_loader.ts with a message like Failed to parse .swamp.yaml: <original message> would make this immediately actionable. Not blocking because the run aborts correctly with non-zero exit and "yaml" surfaces in the message.

  2. Behavioral change is silent at runtime (src/cli/commands/model_method_run.ts): swamp model method run now reads .swamp.yaml on every invocation and silently applies defaultDriver. Users who previously relied on the implicit raw fallback (because the direct path never read the marker) won't know their run switched drivers. A --verbose log line like using driver: docker (from .swamp.yaml defaultDriver) would surface this. Not blocking — the design docs cover it — but could save a confused support question.

Verdict

PASS — No UX-visible regressions. Changes are internal driver-resolution plumbing. Malformed-YAML abort is new but intentional and tested; the existing Method execution failed: wrapper propagates "yaml" in the message so users can diagnose the problem.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/serve/open/http.ts:1819-1824buildRunDeps creates the repo-marker loader once per handleRun call, but the comment on WorkflowExecutionService.loadRepoMarker says "instance-scoped so a long-running serve process creates a fresh loader per request." Both serve paths (WS at connection.ts:262 and HTTP SSE at http.ts:1768) do construct deps per request, so this is currently correct. However, if buildRunDeps were ever lifted to the route-initialization scope (creating deps once for the server), the loader would cache the .swamp.yaml state from the first request forever. The current code is fine — this is a future-maintenance concern only.

  2. src/domain/workflows/execution_service.ts:1869 — CLI tier omits driverConfig. The workflow path constructs the CLI tier as { driver: options.driver } with no driverConfig. This means a CLI user who specifies --driver docker cannot pass --driver-config at the workflow level. This is pre-existing behavior unchanged by this PR (the old code also only passed { driver: options.driver } to resolveDriverConfig), not a regression. Flagging for awareness.

Low

  1. src/infrastructure/persistence/repo_marker_loader.ts:43 — rejected promises are cached. If markerRepo.read() rejects (e.g., malformed YAML), the rejected promise is permanently cached by the ??= operator. The PR documents this as intentional ("marker parse failures are typically permanent"). This is correct for the current use — malformed YAML won't self-heal — but if a transient I/O error (disk hiccup, NFS timeout) triggers a rejection, all subsequent calls will also reject without retry. In practice, transient I/O failures during a single CLI invocation are extremely unlikely, and the loader is scoped per-request, so this is theoretical.

  2. src/domain/workflows/execution_service.ts:1168markerRepo field only used in constructor. The private readonly markerRepo = new RepoMarkerRepository() field is only referenced at line 1199 to pass into createRepoMarkerLoader. It could be a local variable inside the constructor. Not a bug, just a minor cleanup opportunity.

  3. integration/driver_resolution_test.ts:106crypto.randomUUID() for model ID. The test generates a random UUID for each model definition. If the test ever needs to assert on the model ID, this would be non-deterministic. In the current tests, only modelName is asserted, so this is fine.

Verdict

PASS. The PR is well-structured and correct. The core change — deferring driver resolution to the step executor where the evaluated definition is in scope — is the right fix for the dead-code ctx.driver ?? evaluatedDefinition.driver pattern. The promise-memoization primitive is clean and well-tested. The six-tier resolveDriverConfig call in the step executor correctly mirrors the documented priority chain. No logic errors, no regressions, no security issues found.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-structured PR that correctly fixes the three issues (direct-path driver resolution, concurrent-read race, dead definition.driver tier). The approach of deferring resolution into DefaultStepExecutor.executeModelMethod (where evaluatedDefinition is in scope) is the right design — it eliminates the dead-code ctx.driver ?? evaluatedDefinition.driver pair and makes the priority chain honest.

Blocking Issues

None.

Suggestions

  1. DriverTiers interface placement (execution_service.ts:113): This interface is semantically about driver resolution, not workflow execution. Co-locating it with DriverSource and ResolvedDriverConfig in src/domain/drivers/driver_resolution.ts would keep all resolution-related types together. Minor — works fine where it is.

  2. as any in run_test.ts:425-426: The (deps as any).createExecutionService = ... pattern adds a new no-explicit-any lint override. The existing test file already has 9 such overrides so this follows the established pattern, but a typed partial-mock helper could reduce any usage across the file in a future cleanup.

DDD Assessment

  • createRepoMarkerLoader correctly lives in src/infrastructure/persistence/ — it's a persistence-layer concern (promise-memoized file read).
  • DriverTiers is a well-formed value object interface capturing unresolved tiers.
  • modelMethodRun in src/libswamp/models/run.ts acts as an application service orchestrating domain objects — the driver resolution call sits at the right layer.
  • Dependency direction is clean: infrastructure → domain, application → domain + infrastructure. No inversion.

Other Observations

  • All new files include AGPLv3 license headers.
  • The libswamp import boundary is respected — CLI and serve paths import from infrastructure, not from libswamp internals.
  • Test coverage is thorough: unit tests for the concurrency primitive, propagation audit, four direct-path resolution scenarios, updated workflow-path assertions, and an integration smoke test.
  • The buildRunDeps call in http.ts is per-request (inside handleRun), confirming the per-request loader scoping claim.
  • The promise-memoize correctly caches both resolved values and null (absent file), with rejected reads staying cached since malformed YAML is permanent.

@stack72 stack72 merged commit e40c597 into main Apr 24, 2026
10 checks passed
@stack72 stack72 deleted the feat/repo-driver-cleanup-165 branch April 24, 2026 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants