Skip to content

fix: scope findBySpec() to current workflow run (#987)#988

Merged
stack72 merged 1 commit intomainfrom
keeb/987-scope-findbyspec-to-workflow-run
Mar 31, 2026
Merged

fix: scope findBySpec() to current workflow run (#987)#988
stack72 merged 1 commit intomainfrom
keeb/987-scope-findbyspec-to-workflow-run

Conversation

@keeb
Copy link
Copy Markdown
Contributor

@keeb keeb commented Mar 31, 2026

Summary

  • Tag data written during workflow execution with workflowRunId so each run's output is identifiable
  • Auto-scope data.findBySpec() to the current workflow run when workflowRunId is present in the expression context, preventing stale data from previous runs leaking into forEach iteration
  • data.findByTag() intentionally left globally scoped as an escape hatch for cross-run queries
  • Document findBySpec and findByTag run-scoping behavior in design/expressions.md

Closes #987

Test Plan

  • Added unit tests verifying findBySpec filters by workflowRunId when set and returns all data when unset
  • All 3903 existing tests pass
  • Reproduction: swamp workflow run eztv-check in swamp-media no longer pulls stale episodes from previous discover-and-download runs

…s-run data (#987)

data.findBySpec() in forEach expressions returned ALL data ever written to
a model across all workflow runs. When multiple workflows shared a model
(e.g. general-dedup), forEach would iterate over stale episodes from
previous runs of other workflows.

Two changes fix this:

1. Tag data written during workflow execution with workflowRunId
   (execution_service.ts) — the run ID was already available in
   StepExecutionContext but was not propagated to data tags.

2. When workflowRunId is present in the expression context, findBySpec()
   filters results to only return data tagged with that run ID
   (model_resolver.ts). Outside a workflow context, findBySpec() continues
   to return all data globally.

data.findByTag() is intentionally left unscoped as a general-purpose
escape hatch for cross-run queries.

Closes #987

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

None.

Suggestions

  1. Edge case documentation: When workflowRunId is set on the context, data items that lack a workflowRunId tag (e.g., written outside any workflow) are silently filtered out by findBySpec. This is clearly the intended behavior per the PR description and design docs, but it might be worth a brief inline comment noting this implication for future readers — something like "data without a workflowRunId tag is also excluded when scoped to a run."

Overall this is a clean, well-scoped fix. The filtering logic is minimal and correctly placed in the resolver. Tests cover both scoped and unscoped paths, including run switching. Design docs are clear about the intentional asymmetry between findBySpec (run-scoped) and findByTag (global). DDD principles are respected — the orchestration layer sets the scope, the resolver applies it, and no new abstractions are introduced beyond what's needed.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. Missing test for data without workflowRunId tag when context has workflowRunId set (model_resolver_test.ts): The second test ("returns all data when workflowRunId is not set") creates data items both with and without a workflowRunId tag, but never sets ctx.workflowRunId on the context. There is no test that verifies what happens when ctx.workflowRunId is set but a data item lacks the workflowRunId tag entirely. In the current implementation (model_resolver.ts:632), versionData.tags["workflowRunId"] would be undefined, so undefined !== "run-1" is true, and the item gets filtered out. This is likely the desired behavior (data written outside the current run is excluded), but there's no test asserting it explicitly. If someone later changed the filter to e.g. only skip when the tag exists AND mismatches, they wouldn't break any test. Consider adding a case where workflowRunId is set on context and one data item has no workflowRunId tag — assert that item is excluded.

Low

  1. findBySpec reads context.workflowRunId via closure over a mutable object (model_resolver.ts:614): The findBySpec closure reads context.workflowRunId from the shared context object reference. This works correctly because buildContext() returns the same object, and execution_service.ts:1146 mutates it before execution begins. However, if buildContext() were ever refactored to return a shallow copy or frozen object, this mutation pattern would silently break (the context would have workflowRunId set but the closure would read from the original). This is a coupling that's fine today but worth noting.

Verdict

PASS — The implementation is clean and correct. The findBySpec run-scoping logic is simple, the mutation of expressionContext.workflowRunId correctly propagates into the closure, findByTag is intentionally left globally scoped, and the workflowRunId tag is consistently written via StepExecutionContext. The only suggestion is a minor test gap (Medium #1) to lock in the filtering behavior for data items that lack the workflowRunId tag.

@stack72 stack72 merged commit b41e762 into main Mar 31, 2026
10 checks passed
@stack72 stack72 deleted the keeb/987-scope-findbyspec-to-workflow-run branch March 31, 2026 05:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

forEach data.findBySpec returns stale data from other workflow runs

2 participants