fix: scope findBySpec() to current workflow run (#987)#988
Conversation
…s-run data (#987) data.findBySpec() in forEach expressions returned ALL data ever written to a model across all workflow runs. When multiple workflows shared a model (e.g. general-dedup), forEach would iterate over stale episodes from previous runs of other workflows. Two changes fix this: 1. Tag data written during workflow execution with workflowRunId (execution_service.ts) — the run ID was already available in StepExecutionContext but was not propagated to data tags. 2. When workflowRunId is present in the expression context, findBySpec() filters results to only return data tagged with that run ID (model_resolver.ts). Outside a workflow context, findBySpec() continues to return all data globally. data.findByTag() is intentionally left unscoped as a general-purpose escape hatch for cross-run queries. Closes #987 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
- Edge case documentation: When
workflowRunIdis set on the context, data items that lack aworkflowRunIdtag (e.g., written outside any workflow) are silently filtered out byfindBySpec. This is clearly the intended behavior per the PR description and design docs, but it might be worth a brief inline comment noting this implication for future readers — something like "data without a workflowRunId tag is also excluded when scoped to a run."
Overall this is a clean, well-scoped fix. The filtering logic is minimal and correctly placed in the resolver. Tests cover both scoped and unscoped paths, including run switching. Design docs are clear about the intentional asymmetry between findBySpec (run-scoped) and findByTag (global). DDD principles are respected — the orchestration layer sets the scope, the resolver applies it, and no new abstractions are introduced beyond what's needed.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
- Missing test for data without
workflowRunIdtag when context hasworkflowRunIdset (model_resolver_test.ts): The second test ("returns all data when workflowRunId is not set") creates data items both with and without aworkflowRunIdtag, but never setsctx.workflowRunIdon the context. There is no test that verifies what happens whenctx.workflowRunIdis set but a data item lacks theworkflowRunIdtag entirely. In the current implementation (model_resolver.ts:632),versionData.tags["workflowRunId"]would beundefined, soundefined !== "run-1"istrue, and the item gets filtered out. This is likely the desired behavior (data written outside the current run is excluded), but there's no test asserting it explicitly. If someone later changed the filter to e.g. only skip when the tag exists AND mismatches, they wouldn't break any test. Consider adding a case whereworkflowRunIdis set on context and one data item has noworkflowRunIdtag — assert that item is excluded.
Low
findBySpecreadscontext.workflowRunIdvia closure over a mutable object (model_resolver.ts:614): ThefindBySpecclosure readscontext.workflowRunIdfrom the sharedcontextobject reference. This works correctly becausebuildContext()returns the same object, andexecution_service.ts:1146mutates it before execution begins. However, ifbuildContext()were ever refactored to return a shallow copy or frozen object, this mutation pattern would silently break (the context would haveworkflowRunIdset but the closure would read from the original). This is a coupling that's fine today but worth noting.
Verdict
PASS — The implementation is clean and correct. The findBySpec run-scoping logic is simple, the mutation of expressionContext.workflowRunId correctly propagates into the closure, findByTag is intentionally left globally scoped, and the workflowRunId tag is consistently written via StepExecutionContext. The only suggestion is a minor test gap (Medium #1) to lock in the filtering behavior for data items that lack the workflowRunId tag.
Summary
workflowRunIdso each run's output is identifiabledata.findBySpec()to the current workflow run whenworkflowRunIdis present in the expression context, preventing stale data from previous runs leaking intoforEachiterationdata.findByTag()intentionally left globally scoped as an escape hatch for cross-run queriesfindBySpecandfindByTagrun-scoping behavior indesign/expressions.mdCloses #987
Test Plan
findBySpecfilters byworkflowRunIdwhen set and returns all data when unsetswamp workflow run eztv-checkin swamp-media no longer pulls stale episodes from previousdiscover-and-downloadruns