fix(datastore): signal cache writes to sync service fast path#1225
fix(datastore): signal cache writes to sync service fast path#1225
Conversation
The s3/gcs zero-diff fast path in DatastoreSyncService keeps a local-dirty watermark that it flips during its own pushFile-equivalent. Swamp core writes into the cache directly from the persistence repositories (UnifiedDataRepository, YamlOutputRepository, YamlWorkflowRunRepository, YamlEvaluatedDefinitionRepository), which bypasses that hook — the next pushChanged short-circuits past the write and the data lives only in the local cache until eviction. Add a required markDirty() method to DatastoreSyncService. Every public mutation on the four datastore-tier repositories calls it before any cache write begins, so a crash mid-write still leaves the watermark dirty. A single sync service instance threads through requireInitializedRepoUnlocked → acquireModelLocks and through the serve context (OpenServerState, ConnectionContext, WebhookServiceDeps), so an implementation that caches the sidecar flag in memory stays coherent across the markDirty hook and the pull/push fast-path read. Document the contract in design/datastores.md and mirror the interface change in packages/testing/datastore_types.ts. Filesystem datastores wire no service, so notifyDirty is a no-op for them. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
None.
Verdict
PASS — this PR makes no user-visible changes.
Every changed file is internal plumbing: the markDirty() hook is wired through four persistence repositories, threaded via requireInitializedRepoUnlocked / acquireModelLocks into the CLI commands and serve layer, and stubbed in tests. There are no new flags, no changes to help text, no changes to output format (log or JSON), and no changes to error messages. The design doc update (design/datastores.md) is developer-facing only. From the perspective of a CLI user, this fix is invisible — it only affects whether S3/GCS cache writes are correctly flushed on the next push.
There was a problem hiding this comment.
Code Review
Clean, well-structured fix for a real data-loss bug. The design is sound: a thin MarkDirtyHook decouples repositories from the full DatastoreSyncService, and the shared sync-service instance threaded through requireInitializedRepoUnlocked → acquireModelLocks → serve context keeps the in-memory watermark coherent.
Blocking Issues
None.
Suggestions
-
Test gap for
appendandremoveLatestMarker— themutations call markDirty before writingtest inunified_data_repository_test.tscoverssave,allocateVersion/finalizeVersion,rename,delete, andcollectGarbage, butappendandremoveLatestMarkerare omitted despite being listed in thedesign/datastores.mdcontract. The code is wired correctly (lines 826 and 1012), but pinning them in the test would close the coverage gap. -
MarkDirtyHookdrops theoptionsparameter —DatastoreSyncService.markDirtyacceptsoptions?: DatastoreSyncOptions(carrying anAbortSignal), but theMarkDirtyHooktype alias is() => Promise<void>and the wiring callssyncService!.markDirty()without forwarding options. Since markDirty is documented as "idempotent and cheap" this is likely fine, but the asymmetry could confuse extension authors who see the parameter on the interface but never receive a signal. Consider aligning the two signatures (either dropoptionsfrom the interface or thread it through the hook).
There was a problem hiding this comment.
Adversarial Review
Traced every code path across 28 changed files. The fix is structurally sound — single sync-service instance semantics are maintained correctly through all wiring paths (locked, unlocked, serve, open, webhook, connection, retry), and every public mutation method on all four datastore-tier repositories calls notifyDirty() before the first cache write.
Critical / High
None found.
Medium
markDirtyhas no timeout or cancellation path —MarkDirtyHookis() => Promise<void>with noAbortSignal. TheDatastoreSyncService.markDirtyinterface acceptsoptions?: DatastoreSyncOptions(which includes asignal), but the hook wrapper discards it. If a misbehaving extension does I/O inmarkDirty()and hangs, every repository mutation hangs with no timeout — unlikepullChanged/pushChangedwhich getPromise.raceprotection from the coordinator. The contract says "must be idempotent and cheap" but there's no enforcement. Not blocking because: (a) the typical implementation flips a boolean, (b) repository methods don't carry a signal to forward anyway, and (c) the design doc makes the obligation clear. But worth noting for extension SDK hardening.
Low
-
Non-null assertion on
letbinding in closure (src/cli/repo_context.ts:414,:543) —syncService ? () => syncService!.markDirty() : undefinedcaptures theletbinding. Currently safe because nothing reassignssyncServiceafter the closure is created, but fragile under future edits. A localconstcopy (const svc = syncService; markDirty: svc ? () => svc.markDirty() : undefined) would let TypeScript narrow without the!. Cosmetic — no runtime risk today. -
collectGarbagemarks dirty even when GC finds nothing to remove (src/infrastructure/persistence/unified_data_repository.ts:1558) — whendryRunisfalse,notifyDirty()fires before scanning. If the scan finds zero versions to collect, the watermark was dirtied unnecessarily. Harmless per the idempotency contract, but a theoretical extra push cycle for an empty GC run.
Verification performed
- Confirmed all mutation methods (
save,append,delete,rename,allocateVersion,finalizeVersion,removeLatestMarker,collectGarbage,clearAll,deleteAllByWorkflowId) callnotifyDirty()before their first write. - Confirmed all read methods (
findByName,findAllForModel,listVersions,getContent,findById,findAll,findAllGlobal, sync variants) do NOT callnotifyDirty(). - Confirmed
deleteAllByWorkflowIdcorrectly skipsnotifyDirty()when count is 0 (no-op path). - Confirmed
collectGarbageskipsnotifyDirty()for dry-run. - Confirmed the
acquireModelLocksretry path at line 803 propagatescustomSyncService(preserving single-instance semantics across retries). - Confirmed the null/undefined dance in
OpenServerState(syncService: DatastoreSyncService | null) correctly converts toundefinedvia?? undefinedwhen passed toacquireModelLocks. - Confirmed all 12+ test stub sync services are updated with
markDirty: () => Promise.resolve(). - Confirmed the testing package compat test structurally verifies the new method.
Verdict
PASS — Clean, well-structured fix. The markDirty contract is correctly implemented across all four repository types, single-instance semantics are preserved through every wiring path, and tests pin both positive and negative cases. No blocking issues.
swamp core systeminit/swamp#1225 made markDirty() a required third method on DatastoreSyncService. Implementing it as a no-op: pushChanged always walks the cache, so there's no fast-path watermark to invalidate. The hook is in place for a future zero-diff optimization. Bumps root manifest 2026.04.24.1 → 2026.04.24.2 and extension manifest 0.2.0 → 0.2.1. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Summary
The s3/gcs zero-diff fast path in
DatastoreSyncServicemaintains alocal-dirty watermark that it flips during its own
pushFile-equivalent.Swamp core writes into the cache directly from the persistence
repositories (
UnifiedDataRepository,YamlOutputRepository,YamlWorkflowRunRepository,YamlEvaluatedDefinitionRepository), whichbypasses that hook — the next
pushChangedshort-circuits past thewrite and the data lives only in the local cache until eviction. Single
writer setups against a low-traffic bucket are most exposed; high
traffic masks the bug because unrelated remote mutations invalidate the
sidecar.
Fix
markDirty()method toDatastoreSyncService(mirroredin
packages/testing/datastore_types.ts).notifyDirty()before any cache write begins, so a crash mid-writestill leaves the watermark dirty — per-method (not per-atomicWrite)
keeps the repositories clean and the hook is idempotent anyway.
requireInitializedRepoUnlocked→acquireModelLocksand through theserve context (
OpenServerState,ConnectionContext,WebhookServiceDeps), so an implementation that caches the sidecarflag in memory stays coherent across the
markDirtyhook and thepull/push fast-path read. The `acquireModelLocks` param is optional
for backward compatibility; every caller in the repo now passes one.
`design/datastores.md`.
them.
Extension follow-up
The required interface change surfaces as a compile-time signal to
extension authors (`@swamp/s3-datastore`, `@swamp/gcs-datastore`,
swamp-club/166) that they need to implement `markDirty()` — trivial:
it's the same primitive as their internal `pushFile` hook's dirty-flip.
Test Plan
public mutation, including the negative cases (reads, dry-run GC,
no-op delete) where it must not fire.
sync services across tests updated to include the new field.
`@systeminit/swamp-testing` mirror still matches core.
`deno run test` (4854 passed, 0 failed, 1 ignored),
`deno run compile` all clean.
🤖 Generated with Claude Code