Skip to content

Fix tool denial context lost across durable workflow steps#3009

Merged
anubra266 merged 2 commits intofeat/durable-delegated-tool-approvalfrom
feat/durable-denial-context
Apr 6, 2026
Merged

Fix tool denial context lost across durable workflow steps#3009
anubra266 merged 2 commits intofeat/durable-delegated-tool-approvalfrom
feat/durable-denial-context

Conversation

@anubra266
Copy link
Copy Markdown
Contributor

@anubra266 anubra266 commented Apr 3, 2026

Summary

  • Carry taskDenialRedirects state through the durable workflow loop so denial context survives across step boundaries (each step creates a fresh Agent instance)
  • Give the LLM an explicit post-approval message listing what was denied and why, instead of a generic "continue the conversation"
  • Format denied tool results in conversation history with [DENIED BY USER] label, input args, and Denial Reason section

In durable execution, each step creates a fresh Agent instance, so
taskDenialRedirects accumulated during executeToolStep were lost before
the next callLlmStep. This meant the LLM had no awareness that a tool
was denied and the generateTaskHandler delegation note from #2500 never
fired in durable mode.

Carry denial state in the workflow loop, pass it into callLlmStep, and
give the LLM an explicit post-approval message when denials occurred.
Also improve conversation history format for denied tools.
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 3, 2026

🦋 Changeset detected

Latest commit: 7ca0d0c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name Type
@inkeep/agents-api Patch
@inkeep/agents-manage-ui Patch
@inkeep/agents-cli Patch
@inkeep/agents-core Patch
@inkeep/agents-email Patch
@inkeep/agents-mcp Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 6, 2026 7:30pm
agents-docs Ready Ready Preview, Comment Apr 6, 2026 7:30pm
agents-manage-ui Ready Ready Preview, Comment Apr 6, 2026 7:30pm

Request Review

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog Bot commented Apr 3, 2026

TL;DR — Fixes tool denial context being silently lost between durable workflow steps. Each step creates a fresh Agent instance, so denial redirects recorded in one step were gone by the next. This PR threads that state through the workflow loop and gives the LLM an explicit denial-aware prompt instead of a generic continuation message.

Key changes

  • Thread DenialRedirect state across the durable workflow loopexecuteToolStep now returns denial info, accumulated in the orchestrator and passed into callLlmStep so context survives agent instance boundaries.
  • Denial-aware post-approval prompt for the LLM — When denials are present, the user-part message explicitly lists what was denied and why, replacing the generic "continue the conversation" text.
  • Improve denied tool result formatting in conversation history — Denied results now display a [DENIED BY USER] header, include the original tool input, and label the output as Denial Reason.
  • Add changeset — Patch bump for @inkeep/agents-api.

Summary | 4 files | 2 commits | base: feat/durable-delegated-tool-approvalfeat/durable-denial-context


Denial context threaded through durable workflow loop

Before: executeToolStep recorded denials on a short-lived Agent instance that was discarded at step completion — the next callLlmStep always saw an empty taskDenialRedirects array.
After: executeToolStep returns a DenialRedirect payload, the orchestrator accumulates these in a denialRedirects[] array across iterations, and callLlmStep hydrates them onto the new agent's runContext.taskDenialRedirects.

A new DenialRedirect type captures toolName, toolCallId, and reason. ExecuteToolResult is extended to optionally carry this when type === 'completed'.

Why not persist denials in the database instead? The durable workflow already manages loop state (iteration count, approval round, post-approval flag) as local variables threaded across steps. Denial redirects follow the same pattern — lightweight in-memory accumulation that doesn't need persistence beyond the current workflow run.

agentExecutionSteps.ts · agentExecution.ts


Denial-aware post-approval prompt

Before: After tool approval/denial, the LLM always received: "Continue the conversation. The tool results above contain the information needed to respond to the user."
After: When denials exist, the LLM receives a structured summary listing each denied tool and its reason, with an instruction to acknowledge the redirect.

The denial summary is sanitized (newlines replaced, truncated to 200 chars per reason) to prevent prompt injection from user-supplied denial reasons.

agentExecutionSteps.ts


Improved denied tool result formatting

Before: Denied tool results in conversation history showed a plain ## Tool: name header with the denial reason under a generic ### Output section.
After: Header is ## Tool: name [DENIED BY USER], includes a ### Input section with the original arguments, and labels the denial output as ### Denial Reason with a fallback message.

tool-result-for-conversation-history.ts

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean fix that correctly solves the durable-mode denial context loss. The data flow — recordDenialexecuteToolStep extracts denial → workflow loop accumulates → callLlmStep rehydrates agent context — is sound. One minor concern below about stale denials accumulating across rounds, and one nit.

Pullfrog  | View workflow run | Using Claude Opus𝕏

streamNamespace,
taskId,
isPostApproval,
denialRedirects: denialRedirects.length > 0 ? denialRedirects : undefined,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

denialRedirects is append-only and never cleared after being passed to callLlmStep. In a multi-round scenario (deny tool A → LLM retries → deny tool B → LLM responds), the second callLlmStep receives [{A}, {B}] and tells the LLM about denial A again even though it already processed it.

This is likely harmless in practice (the LLM context already contains the prior exchange), but consider clearing the array after passing it — or, more precisely, only passing denials collected since the last callLlmStep:

const pendingDenials = denialRedirects.splice(0);
callLlmStep({ ..., denialRedirects: pendingDenials.length > 0 ? pendingDenials : undefined });

This keeps the post-approval user message accurate to what just happened rather than replaying the full history. Low severity — the current behavior won't break anything, but it could confuse the LLM with duplicate denial context on repeated rounds.

logger.info({ toolName, toolCallId }, 'executeToolStep: tool executed successfully');
return { type: 'completed' as const };
const denials = agent.getTaskDenialRedirects();
const denial = denials.length > 0 ? denials[denials.length - 1] : undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This grabs the last element, relying on the fact that a fresh agent has at most one denial. The assumption holds today since buildAgentForStep creates a new Agent with an empty taskDenialRedirects per step. Worth noting: if executeToolStep is ever changed to reuse an agent across tool calls within a single step, this would silently drop earlier denials. A find matching on toolCallId would be more robust, but not urgent given the current architecture.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(4) Total Issues | Risk: Medium

🟠⚠️ Major (1) 🟠⚠️

🟠 1) system Missing test coverage for durable denial propagation

Issue: The core fix in this PR — accumulating denialRedirects[] across workflow steps, hydrating them on fresh Agent instances, and generating denial-aware post-approval messages — has zero automated test coverage. The test plan in the PR description is manual testing only.

Why: This is a bug fix for a critical user-facing issue (denial context being lost). Without tests:

  • Regressions to this state propagation could go undetected in future refactors
  • The exact bug this PR fixes could silently return
  • The new conditional branching in callLlmStep (denial-aware vs generic message) is untested

Key untested paths:

  1. agentExecution.ts:53,68,133-135 — denial accumulation in workflow loop
  2. agentExecutionSteps.ts:513-515 — hydration of denialRedirects into agent.runContext.taskDenialRedirects
  3. agentExecutionSteps.ts:543-555 — post-approval message generation with denial summary
  4. agentExecutionSteps.ts:1037-1041 — extraction of denial from agent.getTaskDenialRedirects()

Fix: Add test coverage at minimum for:

// agentExecution.test.ts
it('should accumulate denial redirects across tool execution steps', async () => {
  // Mock executeToolStep to return { type: 'completed', denial: {...} }
  // Verify callLlmStep receives denialRedirects array
});

it('should generate denial-aware message when denialRedirects present', async () => {
  // Verify userParts contains denial summary, not generic continuation
});

Refs:

🟡 Minor (3) 🟡

🟡 1) agentExecutionSteps.ts:69 Type duplication: DenialRedirect already exists

Issue: The new DenialRedirect type duplicates the inline type at agent-types.ts:267 for taskDenialRedirects: Array<{ toolName: string; toolCallId: string; reason: string }>.

Why: Type duplication creates maintenance burden. If the denial shape evolves, both locations must be updated.

Fix: Extract a shared type in agent-types.ts:

export type DenialRedirect = { toolName: string; toolCallId: string; reason: string };
// Then in AgentRunContext:
taskDenialRedirects: DenialRedirect[];

Refs:

🟡 2) agentExecutionSteps.ts:547-549 Denial message format diverges from non-durable path

See inline comment.

🟡 3) tool-result-for-conversation-history.ts:36 TOOL_CALL_ID header emoji inconsistent

See inline comment.

Inline Comments:

  • 🟡 Minor: agentExecutionSteps.ts:547-549 Denial message format diverges from non-durable path
  • 🟡 Minor: tool-result-for-conversation-history.ts:36 TOOL_CALL_ID header emoji inconsistent

💭 Consider (1) 💭

💭 1) scope Missing changeset

Issue: Per AGENTS.md, bug fixes to agents-api require a changeset for version bumping and changelog generation. The changeset bot has flagged this.

Fix: Run pnpm bump patch --pkg agents-api "Fix tool denial context lost across durable workflow steps"


💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-targeted fix for a real bug in durable execution mode. The implementation correctly propagates denial context across workflow step boundaries and provides meaningful feedback to the LLM. The code changes are sound — the main gaps are (1) missing test coverage for the core fix, and (2) minor consistency issues with the non-durable path that can be addressed with the inline suggestions. Adding at least a basic test for the denial accumulation logic would significantly reduce regression risk.

Discarded (2)
Location Issue Reason Discarded
agentExecutionSteps.ts:72 Optional denial? field design could lead to missed handling LOW confidence — current workflow structure executes one tool per step, so the denial?: DenialRedirect pattern is pragmatically acceptable. Not a type safety violation.
agentExecutionSteps.ts:550 Post-approval message wording differs from non-durable delegation note MEDIUM confidence — the messages serve different purposes: LLM prompt vs user-facing artifact. Intentional divergence is reasonable.
Reviewers (4)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-standards 0 0 0 0 0 0 0
pr-review-types 2 0 0 0 0 0 2
pr-review-consistency 4 1 0 0 2 0 1
pr-review-tests 5 1 1 0 0 0 0
Total 11 2 1 0 2 0 3

Note: Type duplication finding raised by both types and consistency reviewers — counted once in Main.

Comment on lines +547 to +549
const redirectSummary = denialRedirects
.map((d) => `- ${d.toolName}: ${sanitize(d.reason)}`)
.join('\n');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Denial message format diverges from non-durable path

Issue: The non-durable path in generateTaskHandler.ts:599 includes toolCallId in the denial summary: - ${d.toolName} (${d.toolCallId}): ${sanitize(d.reason)}. This durable path omits it.

Why: Including toolCallId provides traceability for debugging. Consistency between execution paths helps maintain predictable behavior.

Fix:

Suggested change
const redirectSummary = denialRedirects
.map((d) => `- ${d.toolName}: ${sanitize(d.reason)}`)
.join('\n');
const redirectSummary = denialRedirects
.map((d) => `- ${d.toolName} (${d.toolCallId}): ${sanitize(d.reason)}`)
.join('\n');

Refs:

`## Tool: ${toolName} [DENIED BY USER]`,
'',
`### 🔧 TOOL_CALL_ID: ${toolCallId}`,
`### TOOL_CALL_ID: ${toolCallId}`,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: TOOL_CALL_ID header format inconsistent within file

Issue: The denied branch now uses ### TOOL_CALL_ID without the wrench emoji, while the successful branch at line 74 uses ### 🔧 TOOL_CALL_ID. This creates inconsistent formatting within the same function.

Why: Inconsistent formatting makes output unpredictable and harder to parse.

Fix:

Suggested change
`### TOOL_CALL_ID: ${toolCallId}`,
`### 🔧 TOOL_CALL_ID: ${toolCallId}`,

Refs:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emojis should be removed throughout

@github-actions github-actions Bot deleted a comment from claude Bot Apr 3, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 3, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

@itoqa
Copy link
Copy Markdown

itoqa Bot commented Apr 4, 2026

Ito Test Report ✅

5 test cases ran. 5 passed.

The unified QA run passed completely with 5 of 5 included test cases successful and 0 failures, and two additional code-first verification passes reported no executable cases. Most importantly, durable approval/denial handling behaved as intended in deterministic non-production setups: denial continuation preserved explicit reasons and deterministic fallback text, denial context accumulated across multiple denied calls, approved=true flows stayed free of denial language, duplicate or conflicting approval submissions remained idempotent and stable, hostile denial payloads were rendered as inert text without injection, and code review found no new code-supported application defect (with one issue attributed to environment/model-output instability rather than logic).

✅ Passed (5)
Category Summary Screenshot
Adversarial Hostile denial payload is preserved as inert text, and denial formatting remains stable without injection behavior. ADV-1
Edge Duplicate denial and conflicting decision replay returned stable success responses with no crash. EDGE-3
Logic Not a real application bug; re-execution indicates an environment/model-output blocker while denial accumulation and formatter logic are implemented in source. LOGIC-1
Logic Approved durable flow completed successfully with no denial text in the final assistant response. LOGIC-3
Happy-path Durable denial continuation emitted approval flow and produced denial-aware assistant output with explicit and fallback reason handling under deterministic non-production setup. ROUTE-1

Commit: 249bbe2

View Full Run


Tell us how we did: Give Ito Feedback

Copy link
Copy Markdown
Collaborator

@amikofalvy amikofalvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs changeset

@anubra266 anubra266 merged commit a74daba into feat/durable-delegated-tool-approval Apr 6, 2026
9 of 14 checks passed
@anubra266 anubra266 deleted the feat/durable-denial-context branch April 6, 2026 19:27
@github-actions github-actions Bot deleted a comment from claude Bot Apr 6, 2026
@itoqa
Copy link
Copy Markdown

itoqa Bot commented Apr 6, 2026

Ito Test Report ✅

6 test cases ran. 6 passed.

Overall, the unified QA run passed with 6 of 6 executed test cases successful and 0 failures, and several additional verification passes completed with no includable test cases after report rules were applied. The key validated behaviors were durable run/API stability (idempotent repeated approvals, preserved multi-round denial context, safe fallback for missing denial reasons, support for long/multiline denial text, and execution-stream reconnection by execution ID) plus successful mobile viewport deny-and-follow-up continuity, with all checks performed in local non-production environments using auth/session fallback or mobile bypass paths and no plausible production-code defect identified.

✅ Passed (6)
Category Summary Screenshot
Edge Multi-round denial context is preserved through continuation flow. EDGE-1
Edge Denial without a reason uses a safe fallback path without null leakage. EDGE-2
Edge Long and multiline denial reasons are accepted as normal string input. EDGE-3
Edge Reconnect endpoint supports resuming a suspended execution stream by execution ID. EDGE-4
Mobile Mobile viewport flow passed: approval controls were actionable, deny submitted with reason, and same-conversation follow-up returned continuity response. MOBILE-1
Happy-path Repeated approval submits are handled idempotently and still return success. ROUTE-5

Commit: 7ca0d0c

View Full Run


Tell us how we did: Give Ito Feedback

github-merge-queue Bot pushed a commit that referenced this pull request Apr 7, 2026
* feat: durable delegated tool approval flow

When a delegated sub-agent requires tool approval in durable mode, the approval
now flows through the workflow hook system instead of the in-memory pub/sub bus.
The parent agent detects `durable-approval-required` artifacts from sub-agents,
surfaces them as SSE approval events, and forwards the user's decision back via
`delegatedToolApproval` context on re-execution. Adds `isPostApproval` flag to
prevent the workflow loop from re-sending the original user message after approval.

Includes validation of approval artifact fields, SSE error handling for delegated
approval streaming, logging for suppressed errors in the durable approval catch
path, and type-safe metadata construction for delegation.

* style: auto-format with biome

* fix: use 'as const' for tool_calls type literal to satisfy typecheck

* Fix tool denial context lost across durable workflow steps (#3009)

* Address review: extract tool prefix constants, add shared types, remove redundant localhost fallback

* style: auto-format with biome

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants