Skip to content

feat(datastore): add --timeout flag and enrich SyncTimeoutError (swamp-club#164)#1222

Merged
stack72 merged 1 commit intomainfrom
datastore-sync-timeout-164
Apr 24, 2026
Merged

feat(datastore): add --timeout flag and enrich SyncTimeoutError (swamp-club#164)#1222
stack72 merged 1 commit intomainfrom
datastore-sync-timeout-164

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented Apr 24, 2026

Summary

Companion PR to the @swamp/s3-datastore fingerprint fast-path fix for lab/164 (minutes-slow swamp datastore sync at 4k-file scale, even on zero-diff). This PR adds the swamp-core ergonomics: a --timeout CLI flag for one-off sync overrides, and a richer SyncTimeoutError message so users get actionable next steps when the timeout does fire.

Either PR may land first — the enriched error uses version-free wording, so there is no ordering dependency.

What changed

--timeout <seconds> flag on swamp datastore sync

  • Per-invocation override; wins over CustomDatastoreConfig.syncTimeoutMs and SWAMP_DATASTORE_SYNC_TIMEOUT_MS.
  • Validated at the CLI boundary: positive integer, rejected ≤0, capped at 21600s (6h) — generous enough for initial seeding from a slow link, bulk recovery after an outage, or 100k-file migrations, while rejecting fat-finger misconfigs. The env var stays uncapped as the shell-session escape hatch.
  • Scoped to swamp datastore sync only. Write commands that trigger implicit sync keep their existing escape hatch (env var). Adding --timeout everywhere would inflate the CLI surface without commensurate benefit.
  • --help clearly describes precedence so users reach for the right knob.

Enriched SyncTimeoutError message

Before:
```
Datastore push to "@swamp/s3-datastore" timed out after 300000ms. If a prior process crashed without releasing the lock, run 'swamp datastore lock release --force' (add --model / for a specific model lock).
```

After:
```
Datastore push to "@swamp/s3-datastore" timed out after 300000ms. Try one of:
• Rerun with --timeout (e.g. --timeout 1800 for large one-off syncs).
• Set SWAMP_DATASTORE_SYNC_TIMEOUT_MS for the duration of your shell session.
• If you are on @swamp/s3-datastore or @swamp/gcs-datastore at scale, update to the latest extension — the fingerprint fast path short-circuits zero-diff syncs.
• If a prior process crashed without releasing the lock, run 'swamp datastore lock release --force' (add --model / for a specific model lock).
```

Version-free wording keeps the message stable across extension releases. No changes to the error's structured fields (label, direction, timeoutMs, cause).

Internal plumbing

  • resolveSyncTimeoutMs(config, overrideMs?) — new optional parameter at highest precedence. Non-positive overrides silently fall through to the next source.
  • createDatastoreSyncDeps(repoDir, resolver, options?) — new CreateDatastoreSyncDepsOptions interface with syncTimeoutMsOverride.

Docs

design/datastores.md — expanded "Sync Timeout" section with the 4-tier precedence chain and reference to the enriched error. New "Zero-Diff Fast Path (Extension Guidance)" subsection with backend-agnostic wording (fingerprint + dirty-flag sidecar pattern), so future extension authors have a breadcrumb. Notes that sync is not a content-integrity tool — use DatastoreVerifier for that.

What this does NOT do

  • Does not fix the 5-minute zero-diff sync itself — that's the @swamp/s3-datastore fingerprint fast-path PR. This PR is the ergonomics layer.
  • Does not change default behavior for anyone who doesn't pass --timeout or hit the timeout.
  • Does not close the AbortSignal plumbing gap in the s3 extension.

Test plan

  • deno check — clean
  • deno lint — clean (1091 files)
  • deno fmt — clean
  • deno run test — 4822 passed / 0 failed (19 new tests)
  • deno run compile — binary built
  • ./swamp datastore sync --help--timeout flag documented correctly
  • Reporter runs swamp datastore sync --timeout 60 against their 4k-file repo and confirms: (a) validation rejects 0/negative/>21600, (b) enriched error message renders all four remedies, (c) flag override wins over env var when both are set. Tracked on lab/164 as the verification gate.

New tests

File What it covers
src/domain/datastore/datastore_sync_service_test.ts Enriched message contents; version-free extension hint (regex guards against calendar-version and semver); structured field preservation; multi-line JSON round-trip.
src/domain/datastore/datastore_config_test.ts resolveSyncTimeoutMs precedence: override wins over config/env/default; non-positive override falls through; undefined preserves existing precedence.
src/cli/commands/datastore_sync_test.ts parseTimeoutFlag validation: seconds→ms conversion, upper-bound acceptance, zero/negative/over-cap/non-integer/non-number rejection.

Links

…p-club#164)

Adds a per-invocation `--timeout <seconds>` flag to `swamp datastore sync`
that overrides `CustomDatastoreConfig.syncTimeoutMs` and
`SWAMP_DATASTORE_SYNC_TIMEOUT_MS` for a single invocation. Capped at 21600s
(6h) to cover legitimate long-haul scenarios (initial seeding from a slow
link, bulk recovery after an outage, 100k-file repo moves) while rejecting
fat-finger misconfigs. The env var stays uncapped as the shell-session
escape hatch.

Enriches `SyncTimeoutError` to list every available remedy inline —
`--timeout`, the env var, extension update, stuck-lock release — so users
get actionable guidance without chasing docs. Wording is version-free
(points at "the latest extension") so the message stays stable across
extension releases.

`resolveSyncTimeoutMs` gains an optional `overrideMs` parameter at
highest precedence; `createDatastoreSyncDeps` threads the CLI flag
through. Implicit-sync paths (write commands via `registerDatastoreSync`)
are unchanged — `--timeout` is explicitly scoped to `swamp datastore
sync`, since the env var already covers the implicit case.

Companion to the `@swamp/s3-datastore` fingerprint fast-path fix; see
lab/164 for the full plan. Either PR may land first.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-scoped PR. The layer separation is solid — CLI validates input, libswamp threads the override, and the domain resolves the timeout with a clear 4-tier precedence chain. 19 new tests cover the happy paths and edge cases thoroughly.

Blocking Issues

None.

Suggestions

  1. CreateDatastoreSyncDepsOptions not re-exported from barrelsrc/libswamp/mod.ts (line 946–953) exports createDatastoreSyncDeps but not the new CreateDatastoreSyncDepsOptions interface. The CLI command sidesteps this by passing an inline object literal, but any future consumer that wants to type an options variable would need to import from the internal path ./datastores/sync.ts, violating the libswamp import boundary. Consider adding type CreateDatastoreSyncDepsOptions to the barrel export.

  2. JSDoc verbosity — The multi-paragraph JSDoc on SYNC_TIMEOUT_CLI_MAX_SECONDS (datastore_sync.ts:42–52) and parseTimeoutFlag (datastore_sync.ts:55–62) go beyond the project's "default to no comments" convention. The first sentence of each captures the essential "why"; the rest could live in the PR description or design doc. Not blocking since the comments are accurate and the constant/function names are already descriptive.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/libswamp/datastores/sync.ts:67-75CreateDatastoreSyncDepsOptions not re-exported from mod.ts
    The createDatastoreSyncDeps function is re-exported from src/libswamp/mod.ts (line 947), but the new CreateDatastoreSyncDepsOptions interface is not. Any consumer outside the CLI that wants to type the options bag (e.g., a programmatic caller or a future test that constructs deps directly) would need to import from the internal path src/libswamp/datastores/sync.ts, which violates the project's barrel-import convention (CLAUDE.md: "CLI commands and presentation renderers must import libswamp types and functions from src/libswamp/mod.ts"). The CLI itself is fine because it passes an object literal that satisfies the type structurally — so no breakage today, just API surface hygiene.

Low

  1. src/cli/commands/datastore_sync.ts:114-116 — Cliffy :integer parser may truncate floats silently.
    parseTimeoutFlag correctly rejects 1.5, but Cliffy's :integer type hint may silently truncate 1.51 before the value reaches the action handler, meaning the !Number.isInteger(raw) guard never fires for float-like CLI input. This is defense-in-depth territory — the code is correct for all values that actually arrive — but worth noting that the test at line 80 (parseTimeoutFlag(1.5)) exercises a path that may be unreachable from actual CLI invocation. No fix needed; just awareness.

  2. src/domain/datastore/datastore_sync_service.ts:96-98cause set conditionally, override readonly declaration may shadow Error.cause for undefined callers.
    When options is omitted, this.cause is never explicitly assigned, relying on undefined as the implicit value of the optional property. This is correct in current TypeScript/V8 semantics but is subtly different from Error's own cause mechanism (which expects { cause } in the options bag passed to super()). No practical issue — UserError doesn't forward options to super() anyway — but a future refactor that tries to use super(msg, { cause }) would need to update this path.

Verdict

PASS — Solid, well-scoped PR. The validation logic is thorough, the precedence chain is correct and backward-compatible (the existing resolveSyncTimeoutMs(datastoreConfig) call in repo_context.ts:338 is unaffected), the error message is actionable, and the test coverage hits the right boundaries. The medium finding (missing barrel re-export) is API hygiene, not a correctness issue.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. SyncTimeoutError remedy 1 is misleading for implicit-sync contexts.
    The SyncTimeoutError fires from flushDatastoreSync() after every command (see src/cli/mod.ts:1014), not just swamp datastore sync. A user running swamp model run @my-model/method who hits the timeout will see • Rerun with --timeout <seconds> as the first remedy — but that flag doesn't exist on their command. They'll try it, get "unknown flag", and have to read further to find the env var.

    The env var remedy (item 2) is the correct escape hatch for implicit syncs, but it's buried after the inapplicable flag suggestion. Consider either swapping the order (env var first, then flag with a note that it applies only to swamp datastore sync) or adding a parenthetical like: • Rerun with swamp datastore sync --timeout before your command (explicit syncs only). Minor but would save real confusion at 3am during a migration.

  2. --timeout flag description is verbose relative to other flags.
    The description is 3 sentences including precedence details: "Override the per-direction sync timeout for this invocation (seconds, max 21600). Wins over SWAMP_DATASTORE_SYNC_TIMEOUT_MS and per-datastore config. Preferred escape hatch for one-off large syncs." Other flags in the codebase (--pull, --push, --version) are single short phrases. The precedence context is valuable but belongs in --help --verbose or the example comment rather than the primary description. Suggested trim: "Per-invocation sync timeout in seconds (max 21600). Overrides config and SWAMP_DATASTORE_SYNC_TIMEOUT_MS."

Verdict

PASS — the --timeout flag is well-designed and discoverable, validation produces clean UserError messages with correct escape-hatch guidance, and the enriched SyncTimeoutError is a meaningful improvement over the old single-line message. The example in --help and the 6h cap with env-var escape hatch are good ergonomic choices.

@stack72 stack72 merged commit 44543d3 into main Apr 24, 2026
10 checks passed
@stack72 stack72 deleted the datastore-sync-timeout-164 branch April 24, 2026 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant