Fix tool denial context lost across durable workflow steps#3009
Conversation
In durable execution, each step creates a fresh Agent instance, so taskDenialRedirects accumulated during executeToolStep were lost before the next callLlmStep. This meant the LLM had no awareness that a tool was denied and the generateTaskHandler delegation note from #2500 never fired in durable mode. Carry denial state in the workflow loop, pass it into callLlmStep, and give the LLM an explicit post-approval message when denials occurred. Also improve conversation history format for denied tools.
🦋 Changeset detectedLatest commit: 7ca0d0c The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
TL;DR — Fixes tool denial context being silently lost between durable workflow steps. Each step creates a fresh Key changes
Summary | 4 files | 2 commits | base: Denial context threaded through durable workflow loop
A new
Denial-aware post-approval prompt
The denial summary is sanitized (newlines replaced, truncated to 200 chars per reason) to prevent prompt injection from user-supplied denial reasons. Improved denied tool result formatting
|
There was a problem hiding this comment.
Clean fix that correctly solves the durable-mode denial context loss. The data flow — recordDenial → executeToolStep extracts denial → workflow loop accumulates → callLlmStep rehydrates agent context — is sound. One minor concern below about stale denials accumulating across rounds, and one nit.
Claude Opus | 𝕏
| streamNamespace, | ||
| taskId, | ||
| isPostApproval, | ||
| denialRedirects: denialRedirects.length > 0 ? denialRedirects : undefined, |
There was a problem hiding this comment.
denialRedirects is append-only and never cleared after being passed to callLlmStep. In a multi-round scenario (deny tool A → LLM retries → deny tool B → LLM responds), the second callLlmStep receives [{A}, {B}] and tells the LLM about denial A again even though it already processed it.
This is likely harmless in practice (the LLM context already contains the prior exchange), but consider clearing the array after passing it — or, more precisely, only passing denials collected since the last callLlmStep:
const pendingDenials = denialRedirects.splice(0);
callLlmStep({ ..., denialRedirects: pendingDenials.length > 0 ? pendingDenials : undefined });This keeps the post-approval user message accurate to what just happened rather than replaying the full history. Low severity — the current behavior won't break anything, but it could confuse the LLM with duplicate denial context on repeated rounds.
| logger.info({ toolName, toolCallId }, 'executeToolStep: tool executed successfully'); | ||
| return { type: 'completed' as const }; | ||
| const denials = agent.getTaskDenialRedirects(); | ||
| const denial = denials.length > 0 ? denials[denials.length - 1] : undefined; |
There was a problem hiding this comment.
This grabs the last element, relying on the fact that a fresh agent has at most one denial. The assumption holds today since buildAgentForStep creates a new Agent with an empty taskDenialRedirects per step. Worth noting: if executeToolStep is ever changed to reuse an agent across tool calls within a single step, this would silently drop earlier denials. A find matching on toolCallId would be more robust, but not urgent given the current architecture.
There was a problem hiding this comment.
PR Review Summary
(4) Total Issues | Risk: Medium
🟠⚠️ Major (1) 🟠⚠️
🟠 1) system Missing test coverage for durable denial propagation
Issue: The core fix in this PR — accumulating denialRedirects[] across workflow steps, hydrating them on fresh Agent instances, and generating denial-aware post-approval messages — has zero automated test coverage. The test plan in the PR description is manual testing only.
Why: This is a bug fix for a critical user-facing issue (denial context being lost). Without tests:
- Regressions to this state propagation could go undetected in future refactors
- The exact bug this PR fixes could silently return
- The new conditional branching in
callLlmStep(denial-aware vs generic message) is untested
Key untested paths:
agentExecution.ts:53,68,133-135— denial accumulation in workflow loopagentExecutionSteps.ts:513-515— hydration ofdenialRedirectsintoagent.runContext.taskDenialRedirectsagentExecutionSteps.ts:543-555— post-approval message generation with denial summaryagentExecutionSteps.ts:1037-1041— extraction of denial fromagent.getTaskDenialRedirects()
Fix: Add test coverage at minimum for:
// agentExecution.test.ts
it('should accumulate denial redirects across tool execution steps', async () => {
// Mock executeToolStep to return { type: 'completed', denial: {...} }
// Verify callLlmStep receives denialRedirects array
});
it('should generate denial-aware message when denialRedirects present', async () => {
// Verify userParts contains denial summary, not generic continuation
});Refs:
- Existing non-durable denial tests — pattern to follow
🟡 Minor (3) 🟡
🟡 1) agentExecutionSteps.ts:69 Type duplication: DenialRedirect already exists
Issue: The new DenialRedirect type duplicates the inline type at agent-types.ts:267 for taskDenialRedirects: Array<{ toolName: string; toolCallId: string; reason: string }>.
Why: Type duplication creates maintenance burden. If the denial shape evolves, both locations must be updated.
Fix: Extract a shared type in agent-types.ts:
export type DenialRedirect = { toolName: string; toolCallId: string; reason: string };
// Then in AgentRunContext:
taskDenialRedirects: DenialRedirect[];Refs:
🟡 2) agentExecutionSteps.ts:547-549 Denial message format diverges from non-durable path
See inline comment.
🟡 3) tool-result-for-conversation-history.ts:36 TOOL_CALL_ID header emoji inconsistent
See inline comment.
Inline Comments:
- 🟡 Minor:
agentExecutionSteps.ts:547-549Denial message format diverges from non-durable path - 🟡 Minor:
tool-result-for-conversation-history.ts:36TOOL_CALL_ID header emoji inconsistent
💭 Consider (1) 💭
💭 1) scope Missing changeset
Issue: Per AGENTS.md, bug fixes to agents-api require a changeset for version bumping and changelog generation. The changeset bot has flagged this.
Fix: Run pnpm bump patch --pkg agents-api "Fix tool denial context lost across durable workflow steps"
💡 APPROVE WITH SUGGESTIONS
Summary: This is a well-targeted fix for a real bug in durable execution mode. The implementation correctly propagates denial context across workflow step boundaries and provides meaningful feedback to the LLM. The code changes are sound — the main gaps are (1) missing test coverage for the core fix, and (2) minor consistency issues with the non-durable path that can be addressed with the inline suggestions. Adding at least a basic test for the denial accumulation logic would significantly reduce regression risk.
Discarded (2)
| Location | Issue | Reason Discarded |
|---|---|---|
agentExecutionSteps.ts:72 |
Optional denial? field design could lead to missed handling |
LOW confidence — current workflow structure executes one tool per step, so the denial?: DenialRedirect pattern is pragmatically acceptable. Not a type safety violation. |
agentExecutionSteps.ts:550 |
Post-approval message wording differs from non-durable delegation note | MEDIUM confidence — the messages serve different purposes: LLM prompt vs user-facing artifact. Intentional divergence is reasonable. |
Reviewers (4)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-standards |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-types |
2 | 0 | 0 | 0 | 0 | 0 | 2 |
pr-review-consistency |
4 | 1 | 0 | 0 | 2 | 0 | 1 |
pr-review-tests |
5 | 1 | 1 | 0 | 0 | 0 | 0 |
| Total | 11 | 2 | 1 | 0 | 2 | 0 | 3 |
Note: Type duplication finding raised by both types and consistency reviewers — counted once in Main.
| const redirectSummary = denialRedirects | ||
| .map((d) => `- ${d.toolName}: ${sanitize(d.reason)}`) | ||
| .join('\n'); |
There was a problem hiding this comment.
🟡 Minor: Denial message format diverges from non-durable path
Issue: The non-durable path in generateTaskHandler.ts:599 includes toolCallId in the denial summary: - ${d.toolName} (${d.toolCallId}): ${sanitize(d.reason)}. This durable path omits it.
Why: Including toolCallId provides traceability for debugging. Consistency between execution paths helps maintain predictable behavior.
Fix:
| const redirectSummary = denialRedirects | |
| .map((d) => `- ${d.toolName}: ${sanitize(d.reason)}`) | |
| .join('\n'); | |
| const redirectSummary = denialRedirects | |
| .map((d) => `- ${d.toolName} (${d.toolCallId}): ${sanitize(d.reason)}`) | |
| .join('\n'); |
Refs:
| `## Tool: ${toolName} [DENIED BY USER]`, | ||
| '', | ||
| `### 🔧 TOOL_CALL_ID: ${toolCallId}`, | ||
| `### TOOL_CALL_ID: ${toolCallId}`, |
There was a problem hiding this comment.
🟡 Minor: TOOL_CALL_ID header format inconsistent within file
Issue: The denied branch now uses ### TOOL_CALL_ID without the wrench emoji, while the successful branch at line 74 uses ### 🔧 TOOL_CALL_ID. This creates inconsistent formatting within the same function.
Why: Inconsistent formatting makes output unpredictable and harder to parse.
Fix:
| `### TOOL_CALL_ID: ${toolCallId}`, | |
| `### 🔧 TOOL_CALL_ID: ${toolCallId}`, |
Refs:
There was a problem hiding this comment.
emojis should be removed throughout
Preview URLsUse these stable preview aliases for testing this PR:
These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find. Raw Vercel deployment URLs
|
Ito Test Report ✅5 test cases ran. 5 passed. The unified QA run passed completely with 5 of 5 included test cases successful and 0 failures, and two additional code-first verification passes reported no executable cases. Most importantly, durable approval/denial handling behaved as intended in deterministic non-production setups: denial continuation preserved explicit reasons and deterministic fallback text, denial context accumulated across multiple denied calls, approved=true flows stayed free of denial language, duplicate or conflicting approval submissions remained idempotent and stable, hostile denial payloads were rendered as inert text without injection, and code review found no new code-supported application defect (with one issue attributed to environment/model-output instability rather than logic). ✅ Passed (5)Commit: Tell us how we did: Give Ito Feedback |
a74daba
into
feat/durable-delegated-tool-approval
Ito Test Report ✅6 test cases ran. 6 passed. Overall, the unified QA run passed with 6 of 6 executed test cases successful and 0 failures, and several additional verification passes completed with no includable test cases after report rules were applied. The key validated behaviors were durable run/API stability (idempotent repeated approvals, preserved multi-round denial context, safe fallback for missing denial reasons, support for long/multiline denial text, and execution-stream reconnection by execution ID) plus successful mobile viewport deny-and-follow-up continuity, with all checks performed in local non-production environments using auth/session fallback or mobile bypass paths and no plausible production-code defect identified. ✅ Passed (6)Commit: Tell us how we did: Give Ito Feedback |
* feat: durable delegated tool approval flow When a delegated sub-agent requires tool approval in durable mode, the approval now flows through the workflow hook system instead of the in-memory pub/sub bus. The parent agent detects `durable-approval-required` artifacts from sub-agents, surfaces them as SSE approval events, and forwards the user's decision back via `delegatedToolApproval` context on re-execution. Adds `isPostApproval` flag to prevent the workflow loop from re-sending the original user message after approval. Includes validation of approval artifact fields, SSE error handling for delegated approval streaming, logging for suppressed errors in the durable approval catch path, and type-safe metadata construction for delegation. * style: auto-format with biome * fix: use 'as const' for tool_calls type literal to satisfy typecheck * Fix tool denial context lost across durable workflow steps (#3009) * Address review: extract tool prefix constants, add shared types, remove redundant localhost fallback * style: auto-format with biome --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
taskDenialRedirectsstate through the durable workflow loop so denial context survives across step boundaries (each step creates a fresh Agent instance)[DENIED BY USER]label, input args, andDenial Reasonsection