feat: add stale suspended workflow cleanup for durable approvals by anubra266 · Pull Request #3063 · inkeep/agents

anubra266 · 2026-04-07T22:33:11Z

Summary

Adds getStaleWorkflowExecutions data-access function to query suspended workflows older than a configurable threshold
Adds POST /run/api/workflows/cleanup-stale endpoint that finds stale suspended workflows and marks them as failed with approval_timeout metadata
Includes integration tests for the data-access function and status update behavior

Test plan

Unit tests pass for getStaleWorkflowExecutions (filtering by status, staleBefore, limit, ordering)
Unit test verifies updateWorkflowExecutionStatus correctly applies timeout metadata
pnpm typecheck passes
pnpm format passes
Pre-commit hooks pass (biome, openapi snapshot, tests)

Closes PRD-6453

Adds a cleanup endpoint and data-access function to find and fail workflows that have been suspended longer than a configurable timeout (default 30 minutes). Prevents indefinite accumulation of orphaned approval requests. Closes PRD-6453

vercel · 2026-04-07T22:33:17Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Apr 13, 2026 11:28pm
agents-docs	Ready	Preview, Comment	Apr 13, 2026 11:28pm
agents-manage-ui	Ready	Preview, Comment	Apr 13, 2026 11:28pm

changeset-bot · 2026-04-07T22:33:18Z

🦋 Changeset detected

Latest commit: f39bf1a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages

Name	Type
@inkeep/agents-core	Patch
@inkeep/agents-api	Patch
@inkeep/agents-manage-ui	Patch
@inkeep/agents-cli	Patch
@inkeep/agents-sdk	Patch
@inkeep/agents-work-apps	Patch
@inkeep/ai-sdk-provider	Patch
@inkeep/create-agents	Patch
@inkeep/agents-email	Patch
@inkeep/agents-mcp	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

pullfrog · 2026-04-07T22:34:36Z

TL;DR — When an SSE stream drops while a durable tool-approval request is in flight, the workflow now survives the failure. Approval state is persisted into workflow execution metadata and exposed via a new polling endpoint so the UI can recover without relying on the original stream.

Key changes

Persist pendingToolApproval in workflow execution metadata — when suspending for tool approval, the workflow writes toolCallId, toolName, args, and isDelegated into the execution's metadata so the approval request outlives any single SSE connection
Add GET /conversations/{id}/pending-approvals endpoint — a polling route that reads the latest suspended workflow execution for a conversation and returns the pending approval details (or hasPending: false), with end-user scoping to prevent cross-user reads
Gracefully handle SSE write failures in waitForToolApproval — wraps writeToolApprovalRequest in a try/catch so a broken stream logs a warning instead of crashing the workflow
Clear pendingToolApproval on resume — sets pendingToolApproval: null in metadata when the workflow transitions back to running, preventing stale approval data from lingering

_{Summary ｜ 7 files ｜ 4 commits ｜ base: main ← prd-6454}

Durable approval state in workflow metadata

Before: Approval request details existed only in the SSE stream and in-memory context — if the stream dropped, the client had no way to discover the pending approval.
After: markWorkflowSuspendedStep writes a pendingToolApproval object into the workflow execution metadata, and markWorkflowResumingStep clears it on resume.

The pendingToolApproval shape (toolCallId, toolName, args, isDelegated) is written just before the Restate hook is created. In agentExecution.ts, the hookToolCallId computation is hoisted above the suspend call so it can be included in the metadata. On resume, metadata is patched with pendingToolApproval: null to avoid stale reads.

agentExecutionSteps.ts · agentExecution.ts

Pending-approvals polling endpoint

Before: No API existed to query whether a conversation had an outstanding approval request.
After: GET /run/v1/conversations/{conversationId}/pending-approvals returns { hasPending, approval? } by checking the most recent workflow execution's status and metadata.

The handler fetches the conversation (404 if missing), enforces end-user scoping by comparing conversation.userId against the caller's endUserId (following the same pattern as the /stream route), looks up the latest workflow execution via getWorkflowExecutionByConversation, and validates the pendingToolApproval metadata with Zod. The response includes workflowRunId and updatedAt so the client has enough context to submit the approval decision. Seven integration tests cover the matrix of workflow states.

Why updatedAt instead of suspendedAt?
The field maps directly to the workflow execution row's updatedAt column. An earlier iteration used suspendedAt, but that implied a dedicated timestamp that doesn't exist — updatedAt is more honest about the source.

conversations.ts · pending-approvals.test.ts

Graceful SSE failure handling

Before: A failed writeToolApprovalRequest call would propagate the error up through the workflow, potentially aborting the approval flow.
After: The call is wrapped in a try/catch that logs a warning and continues — the approval is already persisted in metadata and recoverable via the polling endpoint.

tool-approval.ts

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog

Severity: High — The core data-access and test code are solid, but the route has a cross-tenant authorization gap that needs to be resolved before merging, and a minor consistency issue with the OpenAPIHono instantiation.

^{｜ Fix all ➔ ｜ Fix 👍s ➔ ｜ View workflow run ｜ Using Claude Opus ｜ 𝕏}

claude

PR Review Summary

(8) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

🔴 1) workflowExecutions.ts:68-83 Cross-tenant data exposure in getStaleWorkflowExecutions

Issue: The getStaleWorkflowExecutions function queries workflow executions across ALL tenants without filtering by tenantId. This is the only data-access function in this file that lacks tenant scoping — all sibling functions (getWorkflowExecution, getWorkflowExecutionByConversation, updateWorkflowExecutionStatus) correctly filter by tenantId and projectId.

Why: In a multi-tenant system, this allows any authenticated API key holder to discover and modify workflow executions belonging to other tenants. The route handler calls this function, gets back workflows from ALL tenants, then iterates and updates them using each execution's own tenantId — effectively allowing Tenant A's API key to fail Tenant B's workflows.

Fix: Add tenantId as a required parameter and include it in the WHERE clause. See inline comment for the exact code change.

Refs:

🔴 2) workflowCleanup.ts:74-82 Handler ignores authenticated execution context

Issue: The route handler never accesses c.get('executionContext') to retrieve the authenticated tenant/project context, despite being protected by runApiKeyAuth middleware. Every other route handler in /run/api/ extracts and uses this context.

Why: The authentication middleware gates access to the endpoint but provides no authorization — the caller can act on any tenant's data. The handler should extract tenantId from the context and pass it to the data-access function.

Fix: Add context extraction at the start of the handler. See inline comment for the pattern used by sibling handlers.

Refs:

Inline Comments:

🔴 Critical: workflowExecutions.ts:68-83 Missing tenant isolation in query (with fix)
🔴 Critical: workflowCleanup.ts:74-82 Handler ignores execution context

🟠⚠️ Major (3) 🟠⚠️

🟠 1) workflowCleanup.ts:87-109 Partial failures silently succeed

Issue: The batch cleanup catches individual update failures (lines 102-107) and logs them, but returns 200 OK with only { cleanedUp } count. Callers have no visibility into failure rates.

Why: A cron job seeing cleanedUp: 10 might assume everything worked when 90% of the operation failed. This hides systemic issues like database problems or permission errors affecting specific tenants.

Fix: Use Promise.allSettled and return { cleanedUp, failed, total }. Update the response schema accordingly.

Refs:

Batch result pattern in datasetRun.ts

🟠 2) workflowCleanup.ts:87 Unbounded concurrent database writes

Issue: Promise.all() fires up to 1000 parallel UPDATE queries without concurrency limiting. This can overwhelm the database connection pool.

Why: At scale, this pattern causes connection pool exhaustion, query timeouts, and degrades service for concurrent requests.

Fix: Use p-limit or process in chunks. See inline comment for example.

🟠 3) workflowCleanup.ts Missing HTTP endpoint integration tests

Issue: The route handler POST /workflows/cleanup-stale has no integration tests. The existing tests only cover the data-access layer functions.

Why: Without endpoint tests, request validation, authorization, and response format could regress undetected. Authorization bypass could allow DoS by repeatedly failing all suspended workflows.

Fix: Add integration tests for: (1) successful cleanup with correct count, (2) unauthenticated requests return 401, (3) invalid body returns 400, (4) empty result returns { cleanedUp: 0 }.

Inline Comments:

🟠 Major: workflowCleanup.ts:87-109 Partial failures return 200 OK without details
🟠 Major: workflowCleanup.ts:87 Unbounded concurrent database writes

🟡 Minor (3) 🟡

🟡 1) workflowExecutions.ts Missing database index for cleanup query

Issue: The query filters on status = 'suspended' and updatedAt <= staleBefore, then orders by updatedAt. No index exists for (status, updatedAt).

Why: Without an index, this query performs a sequential scan. As the table grows, cleanup jobs become slower and may cause lock contention.

Fix: Add a partial index: CREATE INDEX workflow_executions_stale_cleanup_idx ON workflow_executions (updated_at) WHERE status = 'suspended';

🟡 2) workflowCleanup.ts:79-82 Database query error not explicitly handled

Issue: The getStaleWorkflowExecutions call has no try/catch. Database errors bubble up as unhandled exceptions despite a 500 response schema being defined.

Fix: Wrap in try/catch and return proper 500 response, or document that global middleware handles this.

🟡 3) workflowCleanup.ts:102-107 Error logging lacks tenant context

Issue: Error log includes executionId but omits tenantId and projectId, making incident investigation harder.

Fix: Include tenant context in the log object. See inline comment.

Inline Comments:

🟡 Minor: workflowCleanup.ts:102-107 Error logging lacks tenant correlation

🚫 REQUEST CHANGES

Summary: This PR has a critical security issue — the cleanup endpoint queries and modifies workflows across ALL tenants without tenant scoping. The data-access function getStaleWorkflowExecutions is the only one in its file that doesn't filter by tenantId, and the route handler doesn't extract tenant context from authentication. This must be fixed before merging. Additionally, consider adding concurrency limits and surfacing failure counts in the response.

Discarded (4)

Location	Issue	Reason Discarded
`workflowCleanup.test.ts:128-138`	Empty result test doesn't verify anything meaningful	Other tests already cover `staleBefore` filtering adequately
`workflowCleanup.ts:102-107`	Catch block doesn't distinguish error types	Low confidence — depends on what errors `updateWorkflowExecutionStatus` actually throws; may return null for not-found
`workflowCleanup.ts:79-109`	Race condition between concurrent cleanup calls	While technically valid, the operation is inherently idempotent (setting failed status again is harmless) and the UPDATE filters by id
`workflowCleanup.ts`	Missing observability/tracing	Nice-to-have but not blocking for this maintenance endpoint

Reviewers (5)

Reviewer	Returned	Main Findings	Inline Comments	Discarded
`pr-review-standards`	2	0	2	0
`pr-review-security-iam`	4	2	0	2
`pr-review-errors`	3	1	1	1
`pr-review-tests`	3	1	0	2
`pr-review-sre`	5	2	2	1
Total	17	6	5	6

github-actions · 2026-04-07T22:41:01Z

Preview URLs

Use these stable preview aliases for testing this PR:

UI: https://pr-3063-ui.preview.inkeep.com
API: https://pr-3063-api.preview.inkeep.com
API health: https://pr-3063-api.preview.inkeep.com/health

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

UI deployment: https://agents-manage-1k5698t70.preview.inkeep.com
API deployment: https://agents-8dfj4wn20.preview.inkeep.com

Addresses review feedback: - Add tenantId to getStaleWorkflowExecutions query (critical: prevents cross-tenant data exposure) - Extract executionContext in route handler for proper authorization - Use Promise.allSettled to surface failure counts in response - Add concurrency limit to avoid connection pool exhaustion - Include tenant context in error logs - Add tenant isolation test case - Type OpenAPIHono with AppVariables for consistency Closes PRD-6453

pullfrog · 2026-04-07T23:04:27Z

Review feedback status: No action required.

The single review thread on workflowExecutions.ts:88 (tenant isolation concern) is already resolved. The author addressed it in commit 2341408 by adding tenantId as a required parameter to getStaleWorkflowExecutions and including it in the WHERE clause. The fix looks correct.

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-07T23:04:30Z

Verified all review feedback has been addressed in commit 2341408. All 10 threads are resolved — no further action needed.

Fixes confirmed:

getStaleWorkflowExecutions now requires and filters by tenantId
Handler extracts executionContext and passes tenantId to data-access
OpenAPIHono typed with AppVariables
Promise.allSettled with { cleanedUp, failed, total } response
Concurrency limiting (batches of 10)
Error logs include tenantId and projectId
Tenant isolation test with cross-tenant assertions

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-07T23:04:32Z

No action needed. The single review thread on workflowCleanup.ts:9 (typing OpenAPIHono with AppVariables) was already fixed by the PR author in commit 2341408 and the thread is resolved.

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-07T23:04:32Z

All 8 review threads have been addressed by the PR author in commit 2341408 and are already resolved. No further action needed.

Fixes confirmed:

Cross-tenant data exposure: tenantId scoping added to getStaleWorkflowExecutions
Handler context extraction: executionContext now used for tenant scoping
OpenAPIHono typed with AppVariables
Promise.allSettled with { cleanedUp, failed, total } response
Concurrency limiting (batches of 10)
Error logs include tenantId and projectId
Tenant isolation test added

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-07T23:04:37Z

All 5 review threads from the original review are already resolved by the PR author in commit 2341408. The triggering review comment confirmed the test replacement for tenant isolation. No further code changes are needed.

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

claude

PR Review Summary

(1) Total Issues | Risk: Medium

Re-review scoped to delta — Prior critical issues (cross-tenant data exposure, missing failure counts, missing tenant context in logs) were correctly addressed in commit 2341408b9.

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

🟠 Major: workflowCleanup.ts:105-127 Staggered delay doesn't effectively limit concurrent DB connections

💭 Consider (1) 💭

💭 1) workflowCleanup.ts:97-101 No mutex for concurrent cleanup invocations

Issue: The cleanup endpoint has no guard against concurrent invocations from cron job retries or load balancer duplicates.

Why: Both requests would query the same stale workflows and attempt to update them, causing unnecessary database contention. However, since the UPDATE is idempotent (setting status: 'failed' twice has no adverse effect) and limit bounds the blast radius, this is a lower-severity concern.

Fix: Consider documenting that callers should ensure mutual exclusion, or add a distributed lock in a future iteration if concurrent invocations become problematic in production.

💡 APPROVE WITH SUGGESTIONS

Summary: The critical security issues from the prior review were correctly fixed — tenant scoping is now properly enforced at both the DAL and handler layers, and the response includes failure metrics. The remaining issue is the concurrency limiting pattern which doesn't truly bound concurrent database connections. This should be addressed before heavy production use, but doesn't block the PR. Nice work on the comprehensive tenant isolation test!

Reviewers (3)

Reviewer	Returned	Consider	Inline Comments
`pr-review-sre`	2	1	1
`pr-review-standards`	0	0	0
`pr-review-tests`	0	0	0
Total	2	1	1

itoqa · 2026-04-08T00:30:54Z

Ito Test Report ✅

13 test cases ran. 13 passed.

Across 13 executed cases for POST /run/api/workflows/cleanup-stale, all passed (13/13, 0 failures), and code-first review found no plausible production defects while confirming core behavior: successful cleanup with correct aggregate counters, default parameter parity, zero-work handling, invalid-limit 400 validation, oldest-first selection with limit, exact stale-boundary inclusion, immediate rerun idempotency, metadata key preservation plus timeout markers, and strict tenant isolation. Adversarial and UX checks also held—missing/malformed auth and cross-tenant header forgery were rejected without unauthorized mutations (typically 404 Project not found in local non-production fallback), a 20-request concurrent flood produced only 200s with no 5xx and left no residual stale work, and docs/mobile back-forward deep-link navigation triggered cleanup only on intentional execute actions, with results gathered under deterministic local fixtures and test-only non-production bypasses where noted.

✅ Passed (13)

Category	Summary	Screenshot
Adversarial	Local dev fallback returned 404 Project not found for missing and malformed bearer requests, matching non-production auth caveat behavior.
Adversarial	20 concurrent cleanup requests completed with 200 responses and no 5xx, then a follow-up cleanup reported zero remaining work.
Adversarial	Forged tenant/project headers did not mutate unauthorized tenant data; request returned 404 and seeded rows remained suspended.
Adversarial	Docs cleanup flow remained stable across desktop and mobile back/forward navigation, with only intentional execute actions observed in the verification run.
Edge	Stale boundary logic includes records at exactly staleBefore and excludes newer records.
Edge	Empty stale-set scenario returned 200 with cleanedUp=0, failed=0, total=0.
Edge	Invalid limit inputs (0, 1001, and string) were rejected with HTTP 400 validation errors.
Edge	Immediate repeated cleanup calls are idempotent after stale items are already processed.
Logic	Cleanup selects stale suspended executions oldest-first when limit=1 across sequential runs.
Logic	Re-execution confirmed cleanup preserves existing metadata keys and adds timeout markers for stale workflows in the scoped tenant.
Logic	Re-execution confirmed cleanup remains tenant-isolated, with no cross-tenant mutation before reciprocal scoped execution.
Happy-path	Cleanup returned 200 with expected aggregate counters and expected stale/fresh status outcomes after local fixture gating was addressed.
Happy-path	Empty-body request matched explicit default behavior and OpenAPI defaults for timeoutMinutes and limit.

Commit: 2341408

View Full Run

Tell us how we did: Give Ito Feedback

pullfrog · 2026-04-08T00:55:16Z

Review submitted. Leaping into action...

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-08T00:57:09Z

Addressed review feedback from @amikofalvy — updated getStaleWorkflowExecutions to use the scopes: TenantScopeConfig interface pattern, matching the convention used across all other data-access functions.

Changes in 53aedc3:

workflowExecutions.ts: Changed tenantId: string → scopes: TenantScopeConfig, imported TenantScopeConfig from ../../types/utility
workflowCleanup.ts (route): Updated call site to pass scopes: { tenantId }
workflowCleanup.test.ts: Updated all 6 test call sites to use the new scopes parameter shape

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

amikofalvy

I have open question about why we need this endpoint and who calls it?

…ons`

Moves the stale workflow cleanup from a run-domain API endpoint to a proper Vercel cron handler at /api/cron/cleanup-stale-workflows, following the existing cleanupStreamChunks pattern: - noAuth() + CRON_SECRET bearer validation - System-level (cross-tenant) cleanup - Runs every 30 minutes via vercel.json cron schedule - Removes the tenant-scoped run domain route

claude

PR Review Summary

(2) Total Issues | Risk: Medium

Re-review scoped to delta — Changes since commit 2341408b9: Converted to Vercel cron job pattern (addressing @amikofalvy's question about endpoint caller), removed tenant scoping from DAL (now correctly cross-tenant for system cron), used scopes: TenantScopeConfig pattern (though import now unused).

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

🟠 Major: workflowExecutions.ts:5 Unused TenantScopeConfig import left over from refactoring
🟠 Major: cleanupStaleWorkflows.ts:69-89 Staggered delay doesn't effectively limit concurrent DB connections (persists from prior review)

💭 Consider (1) 💭

💭 1) cleanupStaleWorkflows.ts:113-116 Missing correlation ID for cleanup runs

Issue: The completion log lacks a correlation ID, making it harder to trace cleanup operations across logs.

Why: For a cross-tenant cron processing up to 100 workflows every 30 minutes, operators cannot correlate log entries with other system events without a run ID.

Fix: Generate a cleanupRunId at handler start and include in all logs:

const cleanupRunId = crypto.randomUUID();
logger.info({ cleanupRunId, cleanedUp, failed, total }, 'Stale workflow cleanup completed via cron');

Refs: Low priority — follows existing cleanupStreamChunks pattern which also lacks this.

✅ Security Review Passed

The cron auth pattern (noAuth() + CRON_SECRET bearer auth + constantTimeEqual) correctly matches the established cleanupStreamChunks.ts pattern. Cross-tenant query is now intentional and appropriate for a background cleanup job. The updateWorkflowExecutionStatus calls are still scoped to each execution's tenant/project.

💡 APPROVE WITH SUGGESTIONS

Summary: Good progress! The architectural shift to a Vercel cron job correctly resolves the prior cross-tenant security concerns — cross-tenant cleanup is now intentional for a privileged system cron. The cron auth pattern matches the established peer exactly. Two items to address: (1) remove the unused TenantScopeConfig import, and (2) the concurrency limiting pattern still doesn't truly bound concurrent DB connections (same issue from prior review). Neither blocks merge, but the concurrency pattern should be fixed before heavy production use.

Discarded (1)

Location	Issue	Reason Discarded
`workflowCleanup.test.ts`	Removed tenant isolation test	Correctly removed — tenant isolation test no longer applies since cross-tenant is now by design for cron

Reviewers (3)

Reviewer	Returned	Consider	Inline Comments
`pr-review-sre`	2	1	1
`pr-review-security-iam`	0	0	0
`pr-review-standards`	1	0	1
Total	3	1	2

itoqa · 2026-04-08T01:43:08Z

Ito Test Report ✅

14 test cases ran. 14 passed.

The unified run passed completely with 14/14 test cases successful and 0 failures, indicating no code-supported product defects for the cleanup-stale-workflows endpoint under local deterministic test conditions. Key findings were that authorization is consistently fail-closed (missing/malformed/adversarial/session/mobile direct-hit requests return 401/431 without mutation), while valid bearer requests correctly enforce stale-only cleanup logic (inclusive 30-minute boundary, non-suspended exclusion, metadata-preserving timeout merge), process oldest-first in bounded batches of 100 with isolated partial-failure accounting, remain idempotent across repeats, and converge safely under concurrent load without 5xx errors.

✅ Passed (14)

Category	Summary	Screenshot
Adversarial	After reseeding 30 stale rows, 10 parallel authorized cleanup requests produced no 5xx and converged to stable terminal state (all non-fault stale rows failed, no oscillation observed).
Adversarial	Adversarial bearer payloads failed closed (401/431) with no leakage and no unauthorized cleanup mutation.
Adversarial	With an active Manage UI session, direct API navigation without bearer auth still returned 401 Unauthorized.
Adversarial	In mobile viewport, direct-hit/back/reload loops on the cleanup endpoint stayed unauthorized and normal UI navigation remained usable.
Edge	Authorized cleanup endpoint returned HTTP 200 with ok=true and cleanedUp=0 when stale suspended workflow count was zero before and after execution.
Edge	Injected forceFail row remained suspended while other eligible stale rows transitioned to failed; response counters tracked partial failure without aborting the batch.
Edge	Immediate repeat cleanup processed remaining backlog, and subsequent call reached cleanedUp=0 with only the intentionally failing fault row left pending failure attempts.
Logic	Boundary behavior validated as expected after controlled re-check; no product defect confirmed.
Logic	Non-suspended rows remained unchanged after cleanup, confirming status filtering.
Logic	With 121 stale seeded rows, first authorized cleanup processed exactly 100 rows (cleanedUp=99, failed=1 due to injected fault), matching oldest-first bounded batch behavior.
Logic	Existing metadata was preserved while timeout fields were appended on stale-row conversion.
Happy-path	Authorized cleanup on seeded data returned 200 with counters and only stale suspended workflow was transitioned to failed while fresh suspended workflow remained unchanged.
Happy-path	No-header request returned 401 Unauthorized and seeded stale marker row remained suspended (no mutation).
Happy-path	All malformed Authorization formats returned 401 Unauthorized and did not mutate seeded workflow row.

Commit: 5d5ad5a

View Full Run

Tell us how we did: Give Ito Feedback

github-actions · 2026-04-13T23:25:16Z

A matching internal PR is ready in inkeep/agents-private#83 for canonical review and merge.

Original author attribution is preserved as @anubra266
The internal PR is the authoritative merge surface
The public repo will pick up the merged change through the normal mirror sync

This comment will be updated as the bridge state changes.

claude

PR Review Summary

(2) Total Issues | Risk: Low

Re-review scoped to delta — The code has not substantively changed since the last automated review (2d5c46912344). Prior critical issues (cross-tenant data exposure, missing failure counts) were correctly addressed. The remaining prior findings (unused import, concurrency pattern) persist but are acknowledged by the author.

🟡 Minor (1) 🟡

Inline Comments:

🟡 Minor: workflowCleanup.test.ts:61 Incomplete assertion for staleBefore filter test

💭 Consider (1) 💭

💭 1) workflowCleanup.test.ts:139-179 Missing edge case test for update on non-existent workflow

Issue: The tests don't cover the case where updateWorkflowExecutionStatus is called on a workflow that doesn't exist (returns null). The handler counts any non-rejected promise as success.

Why: If a workflow is deleted between getStaleWorkflowExecutions and updateWorkflowExecutionStatus (race condition), the update returns null but is counted as "cleaned up". This could lead to misleading metrics, though not data corruption.

Fix: Add a test verifying updateWorkflowExecutionStatus returns null for non-existent workflow IDs. Consider whether the handler should distinguish between "updated" and "already deleted" cases in its success count.

🕐 Pending Recommendations (2)

🟠 workflowExecutions.ts:5 Unused TenantScopeConfig import left over from refactoring — will fail lint
🟠 cleanupStaleWorkflows.ts:69-89 Staggered delay doesn't effectively limit concurrent DB connections (author acknowledged: sufficient for cron use case)

✅ Security Review Passed

The cron auth pattern (noAuth() + CRON_SECRET bearer auth + constantTimeEqual) correctly matches the established cleanupStreamChunks.ts pattern. Cross-tenant query is intentional and appropriate for a background cleanup job. The updateWorkflowExecutionStatus calls are properly scoped to each execution's tenant/project.

✅ APPROVE

Summary: This PR is ready to merge. The implementation correctly follows established patterns for Vercel cron jobs and addresses all critical issues from prior reviews. The remaining pending recommendations are minor (unused import, concurrency pattern acknowledged as acceptable for this use case). The test coverage gaps identified are minor edge cases that don't block the feature. Nice implementation of the stale workflow cleanup! 🎉

Discarded (0)

No findings discarded in this review cycle.

Reviewers (4)

Reviewer	Returned	Consider	Inline Comments
`pr-review-standards`	0	0	0
`pr-review-sre`	0	0	0
`pr-review-tests`	2	1	1
`pr-review-appsec`	0	0	0
Total	2	1	1

Note: Standards, SRE, and AppSec reviewers returned no new findings — all previously identified issues remain in Pending Recommendations.

claude · 2026-04-13T23:28:06Z

+    });
+
+    const ids = result.map((r) => r.id);
+    expect(ids).toContain(staleExecution.id);


🟡 Minor: Incomplete assertion for staleBefore filter

Issue: This test creates both a stale and recent execution but only asserts that the stale execution IS in the results. It does not assert that the recent execution is NOT in the results.

Why: If the filter logic were broken and returned all suspended workflows regardless of time, this test would still pass. The assertion is incomplete for validating the staleBefore filter behavior.

Fix: Store the return value of the second createExecution call and add:

const recentExecution = await createExecution({ status: 'suspended', updatedAt: recentTime, }); // ... existing assertions ... expect(ids).not.toContain(recentExecution.id);

itoqa · 2026-04-14T00:03:23Z

Ito Test Report ✅

14 test cases ran. 14 passed.

The unified local non-production run passed all 14/14 test cases with zero failures, confirming the cleanup endpoint’s behavior matched code expectations throughout. Key findings were that auth hardening consistently returned 401 for missing/malformed/incorrect credentials and 200 only for the valid bearer token, cleanup logic correctly targeted only stale suspended workflows (adding timeout metadata while preserving existing metadata, respecting 100-item oldest-first limits, and remaining idempotent), and resilience checks showed isolated partial-failure reporting plus safe convergence under repeated, concurrent, method/path confusion, deep-link/history, and desktop/mobile access scenarios without unintended mutations.

✅ Passed (14)

Category	Summary	Screenshot
Adversarial	20 parallel authorized cleanup calls all returned 200 and converged to failed stale rows with no suspended leftovers.
Adversarial	Invalid methods/wrong path stayed non-mutating, unauthorized deep-link stayed JSON 401, and authorized desktop/mobile calls returned valid cleanup JSON.
Adversarial	Deep-link history loops did not duplicate side effects: first authorized call cleaned eligible stale rows and subsequent looped calls reported no additional cleanup.
Edge	Existing metadata keys were preserved while timeout fields were appended.
Edge	Authorized cleanup reported isolated partial failure (`failed=1`) while other stale rows completed and concurrent requests still converged.
Edge	Repeated successful runs converged deterministically (100, then 20, then 0) with no reprocessing after convergence.
Logic	Authorized cleanup mutated only stale suspended rows and appended timeout metadata.
Logic	Stale non-suspended records remained unchanged after cleanup.
Logic	Recently updated suspended record remained suspended and unmodified.
Logic	First cleanup processed the oldest 100 stale suspended rows, second processed the remaining 20, and third was idempotent with zero additional cleanup.
Mobile	Mobile viewport preserved API semantics: unauthorized access stayed JSON unauthorized and authorized access returned parseable cleanup JSON with one-time mutation.
Happy-path	Missing Authorization was rejected (401), while a valid bearer token reached the cleanup handler and returned a successful no-op response.
Happy-path	Malformed and incorrect bearer tokens are securely rejected (401 from route logic; malformed newline payload rejected upstream as bad request).
Happy-path	Authorized cleanup succeeded with a no-op result when no stale workflows were eligible, and unauthorized probes still returned 401.

Commit: fbaff7c

View Full Run

Tell us how we did: Give Ito Feedback

claude

PR Review Summary

(1) Total Issues | Risk: Low

Re-review scoped to delta — The commits since last automated review (fbaff7c00..HEAD) are merge commits from main only. The PR code has not substantively changed since the previous APPROVE. All critical issues from earlier reviews have been addressed.

🟡 Minor (1) 🟡

Inline Comments:

🟡 Minor: workflowExecutions.ts:5 Unused TenantScopeConfig import will fail lint

🕐 Pending Recommendations (2)

🟡 workflowExecutions.ts:5 Unused TenantScopeConfig import — addressed via inline comment above
🟠 cleanupStaleWorkflows.ts:69-89 Staggered delay doesn't effectively limit concurrent DB connections — author acknowledged as acceptable for cron use case

✅ Implementation Review

The implementation correctly follows established patterns:

Aspect	Status
Cron auth pattern	✅ Matches `cleanupStreamChunks.ts` (`noAuth()` + CRON_SECRET + `constantTimeEqual`)
Cross-tenant query	✅ Intentional and appropriate for background cleanup job
Error handling	✅ `Promise.allSettled` with `cleanedUp/failed/total` response
Tenant context in logs	✅ Includes `tenantId`, `projectId` in error logs
OpenAPIHono typing	✅ Correctly typed with `AppVariables`
Cron schedule	✅ Registered in `vercel.json` (every 30 minutes)

✅ APPROVE

Summary: This PR is ready to merge. The stale suspended workflow cleanup implementation correctly follows established patterns for Vercel cron jobs. All critical issues from prior reviews have been addressed. The only remaining item is removing the unused TenantScopeConfig import (minor, will fail lint). Nice work on the durable approvals cleanup! 🎉

Discarded (0)

No findings discarded in this review cycle.

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	1	0	0	0	1	2	0
Total	1	0	0	0	1	2	0

Note: This is a re-review with minimal delta (merge commits only). Sub-reviewers not dispatched as code unchanged.

claude · 2026-04-15T20:10:22Z

 import type { AgentsRunDatabaseClient } from '../../db/runtime/runtime-client';
 import { workflowExecutions } from '../../db/runtime/runtime-schema';
 import type { WorkflowExecutionInsert, WorkflowExecutionSelect } from '../../types/index';
+import type { TenantScopeConfig } from '../../types/utility';


🟡 Minor: Unused import

Issue: TenantScopeConfig is imported but not used anywhere in this file.

Why: This will fail Biome lint checks (noUnusedImports rule).

Fix: (1-click apply)

Suggested change

import type { TenantScopeConfig } from '../../types/utility';

import type { WorkflowExecutionInsert, WorkflowExecutionSelect } from '../../types/index';

Refs:

Prior review raised this

Biome noUnusedImports rule

itoqa · 2026-04-15T21:34:31Z

Ito Test Report ✅

10 test cases ran. 10 passed.

All 10 test cases passed in a local non-production run, confirming the cleanup-stale-workflows cron route behaved correctly under functional, security, edge, and integration scenarios with no observed regressions. Key findings were that only stale suspended workflows (including boundary-time candidates) were transitioned to failed while fresh/completed rows stayed unchanged, metadata merges preserved existing keys while adding timeout markers, auth was strictly enforced with consistent 401s for missing/malformed bearer tokens, processing was capped to the oldest 100 items, parallel calls and partial row-level failures were handled safely with accurate accounting, and downstream execution/pending-approval endpoints reflected the cleanup transitions (including cross-tenant effects from the current global query).

✅ Passed (10)

Category	Summary	Screenshot
Adversarial	Anonymous request to cleanup endpoint returned HTTP 401 with Unauthorized body on localhost non-production target.
Adversarial	All malformed/incorrect Authorization bearer variants were denied with HTTP 401 and no 5xx responses.
Adversarial	Five parallel cleanup calls completed without state corruption; all 20 rows ended in failed state.
Adversarial	Cross-tenant stale executions transitioned from suspended to failed after one authorized cleanup call.
Edge	Cleanup processed exactly 100 oldest stale rows and left rows 101-130 suspended.
Logic	Cleanup preserved existing metadata keys while adding timeout failure metadata on stale suspended execution.
Logic	Injected one row-level failure and confirmed cleanup still updated other rows with failed=1 reported.
Happy-path	Authorized cleanup correctly transitioned stale and boundary suspended workflows while leaving non-target rows unchanged.
Happy-path	Execution endpoint reflected suspended -> failed transition and updatedAt advanced after cleanup.
Happy-path	Pending approvals endpoint changed from hasPending=true to hasPending=false after cleanup.

Commit: f39bf1a

View Full Run

Tell us how we did: Give Ito Feedback

github-actions · 2026-04-23T00:48:24Z

This pull request has been automatically marked as stale because it has not had recent activity.
It will be closed in 7 days if no further activity occurs.

If this PR is still relevant:

Rebase it on the latest main branch
Add a comment explaining its current status
Request a review if it's ready

Thank you for your contributions!

vercel Bot deployed to Preview – agents-docs April 7, 2026 22:35 View deployment

vercel Bot deployed to Preview – agents-manage-ui April 7, 2026 22:36 View deployment

vercel Bot deployed to Preview – agents-api April 7, 2026 22:37 View deployment

pullfrog Bot reviewed Apr 7, 2026

View reviewed changes

claude Bot reviewed Apr 7, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Apr 7, 2026

vercel Bot deployed to Preview – agents-manage-ui April 7, 2026 23:05 View deployment

vercel Bot deployed to Preview – agents-docs April 7, 2026 23:06 View deployment

vercel Bot deployed to Preview – agents-api April 7, 2026 23:08 View deployment

claude Bot reviewed Apr 7, 2026

View reviewed changes

Comment thread agents-api/src/domains/run/routes/workflowCleanup.ts Outdated

github-actions Bot deleted a comment from claude Bot Apr 7, 2026

amikofalvy reviewed Apr 8, 2026

View reviewed changes

Comment thread agents-api/src/domains/run/routes/workflowCleanup.ts Outdated

amikofalvy reviewed Apr 8, 2026

View reviewed changes

refactor: use scopes: TenantScopeConfig in `getStaleWorkflowExecuti…

53aedc3

…ons`

vercel Bot deployed to Preview – agents-manage-ui April 8, 2026 01:02 View deployment

vercel Bot deployed to Preview – agents-docs April 8, 2026 01:03 View deployment

vercel Bot deployed to Preview – agents-api April 8, 2026 01:04 View deployment

anubra266 and others added 2 commits April 7, 2026 18:07

style: auto-format with biome

5d5ad5a

vercel Bot temporarily deployed to Preview – agents-docs April 8, 2026 01:09 Inactive

vercel Bot deployed to Preview – agents-manage-ui April 8, 2026 01:10 View deployment

vercel Bot deployed to Preview – agents-api April 8, 2026 01:12 View deployment

claude Bot reviewed Apr 8, 2026

View reviewed changes

Comment thread packages/agents-core/src/data-access/runtime/workflowExecutions.ts

Comment thread agents-api/src/routes/cleanupStaleWorkflows.ts

github-actions Bot deleted a comment from claude Bot Apr 8, 2026

anubra266 requested a review from amikofalvy April 9, 2026 16:57

Merge branch 'main' into prd-6453

fbaff7c

vercel Bot deployed to Preview – agents-manage-ui April 13, 2026 23:27 View deployment

vercel Bot deployed to Preview – agents-docs April 13, 2026 23:27 View deployment

claude Bot reviewed Apr 13, 2026

View reviewed changes

vercel Bot deployed to Preview – agents-api April 13, 2026 23:28 View deployment

github-actions Bot deleted a comment from claude Bot Apr 13, 2026

Merge branch 'main' into prd-6453

f39bf1a

claude Bot reviewed Apr 15, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Apr 15, 2026

github-actions Bot added the stale label Apr 23, 2026

	import type { TenantScopeConfig } from '../../types/utility';
	import type { WorkflowExecutionInsert, WorkflowExecutionSelect } from '../../types/index';

Conversation

anubra266 commented Apr 7, 2026

Summary

Test plan

Uh oh!

vercel Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key changes

Durable approval state in workflow metadata

Pending-approvals polling endpoint

Graceful SSE failure handling

Uh oh!

pullfrog Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🔴❗ Critical (2) ❗🔴

🟠⚠️ Major (3) 🟠⚠️

🟡 Minor (3) 🟡

🚫 REQUEST CHANGES

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Preview URLs

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pullfrog Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟠⚠️ Major (1) 🟠⚠️

💭 Consider (1) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

Uh oh!

itoqa Bot commented Apr 8, 2026

Ito Test Report ✅

Uh oh!

pullfrog Bot commented Apr 8, 2026

Uh oh!

pullfrog Bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

amikofalvy left a comment

Choose a reason for hiding this comment

Uh oh!

claude Bot left a comment

vercel Bot commented Apr 7, 2026 •

edited

Loading

changeset-bot Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

github-actions Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 7, 2026 •

edited

Loading

pullfrog Bot commented Apr 8, 2026 •

edited

Loading