spec: reactive mid-generation compression + artifact exclusion by tim-inkeep · Pull Request #3129 · inkeep/agents

tim-inkeep · 2026-04-14T17:46:58Z

Summary

Ready-for-implementation spec for refactoring agents-api compression triggers. Spec only — no runtime code changes. Ship decisions are locked; implementation follows.

What changes (per the spec)

Mid-generation compression becomes reactive. Remove the proactive prepareStep token-budget compression branch at ai-sdk-callbacks.ts:11-191. Introduce an AI SDK wrapLanguageModel middleware (compressionRetryMiddleware) that catches provider-signaled context-overflow errors and retries once with compressed input — preserving the multi-step loop.
Oversized tool outputs excluded at their true seam. tool-wrapper.ts:111's toModelOutput gains an oversized check. Oversized raw outputs return a structured error-shaped tool result matching default-tools.ts:214-246's retrieval_blocked shape — no bytes reach the LLM.
Pre-generation conversation-history compression unchanged. Acts as the safety net for non-tool-result context growth.

Spec artifacts

specs/2026-04-14-reactive-compression/SPEC.md — full spec, 14 locked decisions, 8 verifiable requirements, agent constraints.
evidence/ — four evidence files (compression triggers, oversized-artifact handling, provider overflow signals, middleware approach) with confidence-labeled findings and file:line citations.
meta/audit-findings.md + meta/design-challenge.md — cold-reader audit + design-challenge output.
meta/_changelog.md — full audit-resolution trail.

Key design decisions

ID	Decision	Status
DEC-06	`wrapLanguageModel` middleware; NOT outer try/catch (1-way door)	LOCKED
DEC-09	Two-context split: error-shaped stub for initial execution, compact text reference for conversation-history rebuild	LOCKED
DEC-11	Reject AI SDK `transformParams` (proactive hook — contradicts reactive design)	LOCKED
DEC-12	Surgery seam: `tool-wrapper.ts:111` wrapped `toModelOutput`	LOCKED
DEC-13	Middleware constructed per-request via factory closing over run context	LOCKED

Test plan

Review SPEC.md §§1-6 (problem framing, architecture, design details) for alignment with intent
Review §10 Decision Log — confirm all 14 decisions match expectations
Review §16 Agent Constraints — SCOPE / EXCLUDE / STOP_IF / ASK_FIRST before implementation kicks off
Sanity-check evidence file:line citations against main HEAD
Verify Future Work tiers (Explored / Identified / Noted) and retry-budget revisit trigger

🤖 Generated with Claude Code

Ready-for-implementation spec for refactoring agents-api compression triggers: - Mid-gen compression becomes reactive (via AI SDK wrapLanguageModel middleware that catches provider-signaled overflow and retries once with compressed input) - Oversized tool outputs excluded at tool-wrapper.ts toModelOutput seam with structured error-shaped tool result (matching default-tools.ts retrieval_blocked) - Pre-generation conversation-history compression unchanged (safety net) Includes four evidence files (compression triggers, oversized artifact handling, provider overflow signals, middleware approach), audit + challenger findings, and audit-resolution changelog. 14 locked/directed decisions at HIGH confidence. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

vercel · 2026-04-14T17:47:05Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Apr 14, 2026 6:43pm
agents-docs	Ready	Preview, Comment	Apr 14, 2026 6:43pm
agents-manage-ui	Ready	Preview, Comment	Apr 14, 2026 6:43pm

changeset-bot · 2026-04-14T17:47:06Z

⚠️ No Changeset found

Latest commit: f0175d6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

github-actions · 2026-04-14T17:47:31Z

A matching internal PR is ready in inkeep/agents-private#90 for canonical review and merge.

Original author attribution is preserved as @tim-inkeep
The internal PR is the authoritative merge surface
The public repo will pick up the merged change through the normal mirror sync

This comment will be updated as the bridge state changes.

pullfrog · 2026-04-14T17:49:14Z

TL;DR

This PR adds a comprehensive spec (status: ready-for-implementation) for replacing the current proactive, token-budget-based mid-generation compression in agents-api with a reactive approach that compresses only when the LLM provider actually returns a context-overflow error. It also specifies excluding oversized tool outputs at the tool-wrapper seam before they ever reach the LLM, using a structured error-shaped tool result.

Key Changes

Reactive compression via AI SDK middleware: Introduces wrapLanguageModel middleware (compressionRetryMiddleware) that intercepts doGenerate/doStream calls. On a provider context-overflow error (HTTP 400), it compresses the prompt and retries once. Second failures propagate. This replaces the proactive prepareStep token-budget trigger.
Oversized tool output exclusion: Wraps toModelOutput at tool-wrapper.ts:111 to detect oversized raw tool outputs (>30% of context window) and substitute a structured { status: 'oversized', toolInfo, recommendation } error-shaped result matching the existing retrieval_blocked pattern in default-tools.ts.
Provider-aware overflow detection: New isContextOverflowError() helper that distinguishes OpenAI (context_length_exceeded code) and Anthropic (regex on error messages), explicitly excludes 413 byte-limit errors, and emits telemetry on regex hits for drift monitoring.
Removal of _oversizedWarning injection from both BaseCompressor.ts and ArtifactService.ts; metadata.isOversized remains as the durable signal.
14 locked/directed decisions covering retry budget (1), terminal failure behavior (hard fail), middleware vs. outer try/catch (1-way door), stream peek mechanics (async-iterator .next(), no tee()), per-request middleware construction, and explicit rejection of transformParams.
8 requirements with verifiable acceptance criteria, a detailed verification plan (unit, integration, telemetry, regression tests), and a blast-radius analysis.

SPEC.md -- Main Specification

The core spec document (465 lines) structured in 17 sections:

Problem Statement (SCR): Mid-generation compression fires proactively at 75-91% of context based on prediction, causing false-positive compressions that destructively summarize context the LLM would have successfully consumed. Additionally, oversized tool outputs reach the LLM without any check at the initial tool execution path.
Architecture: Adds a per-request compressionRetryMiddleware via wrapLanguageModel that intercepts at the per-step doGenerate/doStream boundary. For streaming, uses a peek-first-chunk pattern via async-iterator .next() to detect pre-commit overflow errors with negligible latency impact.
Oversized exclusion: Surgery at tool-wrapper.ts:111's toModelOutput hook -- detects oversized outputs and returns a structured error-shaped tool result so the LLM can self-correct (narrow queries, different approach).
Decision Log: 14 decisions (DEC-01 through DEC-14) covering all architectural choices, with reversibility ratings and rationale. Notable 1-way door: DEC-06 (middleware, not outer try/catch) with evidence that outer retry at step N>1 would lose completed tool calls.
Scope constraints: Explicit SCOPE, EXCLUDE, STOP_IF, and ASK_FIRST lists for implementers. 2 new files, 7 changed files, associated test updates.

evidence/current-compression-triggers.md -- Compression Trigger Audit

Maps all existing compression trigger points in agents-api. Confirms mid-generation compression is fully proactive (fires on prediction before each step), uses tiered thresholds (75%/83%/91% by model size), and that no provider-error recovery exists around generateText/streamText. Also confirms conversation-history compression is separate and unchanged.

evidence/middleware-approach.md -- Why Middleware Works

Evidence that wrapLanguageModel middleware makes retry invisible to the AI SDK multi-step loop. Explains stream peek mechanics, confirms post-commit retry is unsafe (once deltas are emitted, state is committed), and argues TTFT impact is negligible (microseconds of object property access on the already-awaited first chunk).

evidence/oversized-artifact-handling.md -- Artifact Path Analysis

Traces how oversized artifacts currently flow through the system. Key findings: detection (detectOversizedArtifact) only returns a result object; _oversizedWarning is injected at two downstream sites (BaseCompressor.ts:407, ArtifactService.ts:751); the actual LLM-facing seam is tool-wrapper.ts:111's toModelOutput (which has no oversized check today); and a retrieval-path exclusion already exists at default-tools.ts:214-246 providing the template shape.

evidence/provider-overflow-signals.md -- Provider Error Taxonomy

Documents how Anthropic, OpenAI, and the Vercel AI SDK signal context-overflow errors. OpenAI has a stable context_length_exceeded code; Anthropic uses message-text regex (drift risk). Confirms 413 is NOT overflow (Cloudflare byte limit). Notes max_tokens counts toward Anthropic overflow math, and that AI SDK does not normalize overflow codes.

meta/_changelog.md -- Spec Process Log

Append-only record of the spec authoring process: initial creation, audit + challenger resolution, and finalization. Documents all corrections applied (class name fixes, injection-site corrections, middleware pseudocode rewrite, acceptance criteria rewrites) and decisions made during the audit/challenge cycle.

meta/audit-findings.md -- Spec Audit Report

10 findings (3 High, 4 Medium, 3 Low) from a systematic audit against the codebase at baseline commit. High findings: incorrect class naming (ConversationCompressor vs MidGenerationCompressor), wrong _oversizedWarning injection-site citations, and misidentified artifact-exclusion surgery target (buildInitialMessages doesn't handle artifacts). All were resolved in the spec before finalization. Includes a confirmed-claims section verifying all file citations and external references.

meta/design-challenge.md -- Adversarial Design Review

8 challenge items from a cold-reader stress test. Confirms the core architecture is sound. Key challenges addressed: strengthened DEC-06 rationale with multi-step step-N>1 argument, split artifact exclusion to use error-shaped tool results (not text stubs), added DEC-11 rejecting transformParams, added DEC-13 for per-request middleware construction, and documented retry-budget-ladder revisit trigger. Post-commit hard-fail (C4) was independently dismissed as correct.

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-14T17:50:09Z

TL;DR

Adds a comprehensive spec (ready-for-implementation) for making mid-generation context compression reactive — firing only when the LLM provider actually returns a context-overflow error — instead of the current proactive token-budget prediction that over-eagerly compresses. Also specifies oversized tool-output exclusion at the tool-wrapper.ts toModelOutput seam so large tool results never reach the LLM prompt.

No code changes — this PR is 8 new spec/evidence/meta documents (1,375 lines).

What Changed (latest push)

Commit f0175d6 addresses PR review feedback from two bot reviewers (pullfrog, claude[bot]) — no design changes, only corrections and additions:

Version pin fix: @ai-sdk/[email protected] → 3.0.2 (the version actually declared in package.json; middleware contract verified identical across both)
Typo fix: formatOversizedReason → formatOversizedRetrievalReason (actual function name at artifact-utils.ts:50)
Telemetry clarification: all compression.* / anthropic_overflow_regex_hit / artifact.* attributes are explicitly span attributes (not log fields), emitted on step and tool-call spans
References cleanup: removed local filesystem path from §17; added in-repo pointers to meta/audit-findings.md and meta/design-challenge.md
New risk (MEDIUM): End-User visibility gap when tool results are excluded mid-turn — operators can see it in Jaeger via tool.result.oversized_excluded, but chat UI surfacing is deferred
New future work items: Traces UI surfacing for exclusion/overflow attributes, enriched stub metadata for debugging, retrieval-tool migration note
Verification plan addition: docs-check step to grep agents-docs/ for stale compression references during implementation
Changelog entry: full accounting of all review findings (accepted, partially accepted, none declined)

Key Changes

Introduces a new spec at specs/2026-04-14-reactive-compression/SPEC.md with 14 locked/directed decisions, 8 requirements with acceptance criteria, and a full verification plan
Defines a compressionRetryMiddleware using AI SDK wrapLanguageModel to intercept doGenerate/doStream per-step and retry once on provider overflow errors (Anthropic + OpenAI)
Specifies removal of the proactive token-budget compression branch from the prepareStep callback in ai-sdk-callbacks.ts
Specifies oversized tool-output exclusion at tool-wrapper.ts:111 using a structured error-shaped tool result matching the existing retrieval_blocked contract in default-tools.ts
Specifies removal of _oversizedWarning injection from BaseCompressor.ts and ArtifactService.ts (retaining metadata.isOversized as the durable signal)
4 evidence documents grounding the design in codebase audit findings
2 meta documents recording the audit/challenger process and changelog

Main Spec (`SPEC.md`)

The core document covers the full lifecycle: problem statement (SCR format), goals/non-goals, current vs target state, vertical-slice architecture diagram, 6 detailed design sections, user journeys for 3 personas, blast radius analysis, 8 requirements (R1–R8) with verifiable acceptance criteria, a 14-entry decision log, open questions, assumptions, risks, verification plan, future work, and agent constraints (scope/exclude/stop-if/ask-first).

Two new files are specified: compressionRetryMiddleware.ts (per-request factory returning LanguageModelV2Middleware) and detectContextOverflow.ts (provider-aware overflow discriminator). Seven existing files are modified. Pre-generation conversation-history compression is explicitly unchanged and acts as a safety net.

Evidence: Current Compression Triggers

Maps the existing compression trigger points — confirms mid-gen compression is fully proactive via MidGenerationCompressor.isCompressionNeededFromActualUsage(), fires at 75–91% of context depending on model tier, and that no provider-error recovery exists around generateText/streamText.

Evidence: Middleware Approach

Documents why wrapLanguageModel middleware is the only correct interception layer: retry is invisible to the multi-step loop, stream peek via async-iterator .next() adds negligible latency, and post-commit mid-stream errors cannot be retried.

Evidence: Oversized Artifact Handling

Traces the artifact-to-prompt path: detectOversizedArtifact only detects (returns warning string), _oversizedWarning is injected downstream at BaseCompressor.ts:407 and ArtifactService.ts:751, and the actual LLM-facing seam is tool-wrapper.ts:111's toModelOutput hook — not buildInitialMessages or buildToolResultForModelInput.

Evidence: Provider Overflow Signals

Documents how Anthropic (message-regex-only, no stable code) and OpenAI (context_length_exceeded code) signal overflow, that 413 is a byte-limit not context-overflow, and that the AI SDK normalizes errors but not overflow codes.

Meta: Audit Findings

10 findings (3 High, 4 Medium, 3 Low) from a systematic audit against the baseline commit. High findings corrected class name confusion (MidGenerationCompressor vs ConversationCompressor), _oversizedWarning injection-site misattribution, and the artifact-exclusion surgery target (tool-wrapper.ts:111 not buildInitialMessages). All findings resolved in the final spec.

Meta: Design Challenge

8-item challenger review stress-testing locked decisions. Confirms the middleware architecture is sound. Key outcomes: DEC-06 rationale strengthened (multi-step step-N>1 argument), DEC-11 added (explicitly rejecting transformParams), DEC-13 added (per-request middleware construction), and a future-work trigger for retry-budget ladder based on second_overflow rate.

Meta: Changelog

Append-only record of the spec process: initial creation, audit + challenger resolution (all corrections applied), finalization to ready-for-implementation, and PR review-cycle corrections with full finding-by-finding accounting.

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog

Verdict: Approve. This is an exceptionally thorough spec with rigorous evidence tracing, a self-contained audit/challenge cycle, and verified file:line citations (9/9 checked against main HEAD are accurate; the 10th — @ai-sdk/provider types — is unverifiable without installed deps but the spec's own audit resolved that concern via C6/C8).

The spec covers a well-scoped refactor (reactive mid-gen compression + artifact exclusion) with 14 locked decisions, 8 verifiable requirements, clear SCOPE/EXCLUDE/STOP_IF/ASK_FIRST boundaries, and a changelog showing how audit findings were resolved. The design is sound — middleware-based reactive retry preserves the AI SDK multi-step loop, and the tool-wrapper.ts:111 toModelOutput seam is the correct surgery site for oversized exclusion.

Two minor issues noted inline — neither blocks implementation.

^{｜ View workflow run ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-14T17:50:42Z

+
+## 17. References
+
+- Plan file: `/Users/timothycardona/.claude/plans/rustling-prancing-beaver.md`


This references a local filesystem path (/Users/timothycardona/.claude/plans/rustling-prancing-beaver.md) which is not accessible to other contributors or the implementation agent. Consider removing or replacing with a relative path to an in-repo artifact if the plan content is valuable, or simply delete the line — the spec is self-contained.

pullfrog · 2026-04-14T17:50:42Z

+- `compression.detector = 'openai_code' | 'anthropic_regex' | 'heuristic_400'`
+- `compression.retry_number = 0 | 1`
+- `compression.outcome = 'success' | 'second_overflow' | 'other_error'`
+- `artifact.excluded = true` with `artifact.id`, `artifact.original_tokens`, `artifact.context_window`


anthropic_overflow_regex_hit = true is listed as a separate OTel attribute in §6.5, but R8's acceptance criteria in §9 list it as anthropic_overflow_regex_hit without specifying whether it's a span attribute or log field. The telemetry section says "span attributes / log fields" (ambiguously). For implementation clarity, pick one — span attribute is more queryable and consistent with the other compression.* attributes.

pullfrog

Approving — spec is thorough, well-structured, and ready for implementation. Two minor nits left as inline comments on the prior review.

^{｜ View workflow run ｜ Using Claude Opus ｜ 𝕏}

claude

PR Review Summary

(4) Total Issues | Risk: Medium

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

🟠 Major: SPEC.md:156 Version mismatch — spec cites @ai-sdk/[email protected] but codebase has 3.0.2
🟠 Major: SPEC.md:227 Oversized stub messaging is LLM-centric, not Agent Builder-debuggable

🟡 Minor (2) 🟡

Inline Comments:

🟡 Minor: SPEC.md:253 Telemetry attributes lack corresponding Traces UI treatment
🟡 Minor: SPEC.md:412 Future artifact retrieval tool would require stub wording migration

💭 Consider (2) 💭

💭 1) SPEC.md:261 End User visibility when tool results are excluded

Issue: User Journey 7.1 describes LLM "correcting on the next step" but doesn't specify observable feedback to End Users when oversized tool results are excluded.

Why: End Users have no visibility that a tool result was excluded. The agent may produce confusing output like "I was unable to retrieve that data" without context. This is a potential coherence gap.

Fix: Consider specifying an observable signal for End Users (status update, streaming indicator) or document this as a known UX gap in the Risks section.

Refs: SPEC.md §7.1

💭 2) SPEC.md:47 Docs accompaniment for behavioral change

Issue: Spec changes observable agent behavior (compression triggers reactively, oversized tool results excluded) but doesn't mention docs updates.

Why: Agent Builders reading current docs won't learn about the new oversized exclusion behavior or changed compression timing. The Blast Radius section notes "Public SDK / API: No change" but docs surface isn't addressed.

Fix: Add to Verification Plan: "Review and update agents-docs content if it references compression behavior. Add changelog entry for Agent Builders."

Refs: SPEC.md §8

💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-structured, comprehensive spec with thorough evidence files and a complete audit trail. The core architecture (reactive middleware + oversized exclusion at tool-wrapper seam) is sound and well-reasoned.

The main actionable item is the version mismatch — the spec cites @ai-sdk/[email protected] but the codebase has 3.0.2. This should be reconciled before implementation begins to ensure the middleware contract is valid.

The other findings are product polish suggestions (debugging experience, UI treatment, docs) that can be addressed in the implementation PR or follow-ups.

Discarded (4)

Location	Issue	Reason Discarded
`evidence/provider-overflow-signals.md:30`	OpenAI error code documentation	Confirmed accurate — matches existing codebase usage at `distill-utils.ts:102`
`evidence/provider-overflow-signals.md:32-36`	Anthropic overflow message patterns from community evidence	Already addressed — spec appropriately flags as Assumption A1 with MEDIUM confidence and includes telemetry for drift detection
`SPEC.md:345`	DEC-11 rejection of transformParams	Confirmed correct — transformParams is proactive, conflicts with reactive design
Internal spec docs scope	Review against write-docs standards	Not applicable — `specs/` directory contains internal engineering specs, not customer-facing docs

Reviewers (3)

Reviewer	Returned	Consider	Inline Comments	Discarded
`pr-review-product`	5	2	2	1
`pr-review-3p-specs`	6	0	2	4
`pr-review-docs`	1	0	0	1
Total	12	2	4	6

claude · 2026-04-14T17:51:42Z

+
+### 6.2 Middleware (`compressionRetryMiddleware.ts`)
+
+**Critical shape correction.** Per `@ai-sdk/[email protected]` (`index.d.ts:2066-2088`), `wrapGenerate` / `wrapStream` receive `doGenerate` / `doStream` as **nullary** functions — they bind the already-transformed params. To retry with modified params, call `options.model.doGenerate(modifiedOptions)` (or `options.model.doStream(modifiedOptions)`) directly.


🟠 MAJOR: Version mismatch with installed @ai-sdk/provider

Issue: Spec cites @ai-sdk/[email protected] but the codebase has @ai-sdk/[email protected] installed.

Why: The middleware contract (wrapGenerate/wrapStream signature with options.model for retry) must be verified against the actual installed version. If 3.0.4 introduced API changes not present in 3.0.2, the implementation will fail.

Fix: Update the spec to cite @ai-sdk/[email protected] after verifying the middleware interface matches, OR update the package to 3.0.4 if that version is required for the proposed pattern.

Refs:

packages/agents-core/package.json:177 — "@ai-sdk/provider": "3.0.2"

packages/ai-sdk-provider/package.json:39 — same

claude · 2026-04-14T17:51:50Z

+        toolCallId: currentToolCallId,
+        toolName,
+        warning: '⚠️ Tool produced an oversized result that cannot be included in the conversation.',
+        reason: formatOversizedReason(detection.originalTokenSize, detection.contextWindowSize),


🟠 MAJOR: Oversized stub messaging is LLM-centric, not Agent Builder-debuggable

Issue: The structured error stub includes reason: formatOversizedReason(...) but the format isn't specified. Agent Builders debugging in Traces UI will see raw token counts without context on the 30% threshold or how to resolve.

Why: First-contact legibility for Agent Builders matters. The existing retrieval_blocked response at default-tools.ts:225-245 has the same gap. Since this spec aims to improve coherence, the stub should help humans debug, not just LLMs retry.

Fix: Consider specifying:

Human-readable token context: ~42K tokens (30% of 140K context limit)

A docsUrl field pointing to troubleshooting docs

Truncation for toolArgs to avoid multi-KB dumps in Traces UI

Refs:

default-tools.ts:225-245 — existing retrieval_blocked pattern

claude · 2026-04-14T17:52:00Z

+- `compression.detector = 'openai_code' | 'anthropic_regex' | 'heuristic_400'`
+- `compression.retry_number = 0 | 1`
+- `compression.outcome = 'success' | 'second_overflow' | 'other_error'`
+- `artifact.excluded = true` with `artifact.id`, `artifact.original_tokens`, `artifact.context_window`


🟡 Minor: Telemetry attributes lack corresponding Traces UI treatment

Issue: New telemetry attributes (artifact.excluded, artifact.id, artifact.original_tokens, etc.) are specified, but there's no mention of whether the Traces Dashboard will surface them.

Why: Agent Builders need to see which tool calls had results excluded. Without UI treatment, these attributes are only discoverable via raw Jaeger queries.

Fix: Add to §8 (Blast Radius) or Future Work: "Traces UI should display artifact.excluded attribute on tool-call spans; consider a visual indicator (warning icon) in Tool Calls View."

Refs:

SPEC.md §6.5 — telemetry spec

claude · 2026-04-14T17:52:05Z

+
+| Item | Tier | Why not now | Triggers to revisit |
+|---|---|---|---|
+| Agent-callable artifact retrieval tool (e.g., `get_artifact_full(id)`) | **Explored** | Out of this spec's scope; stub informs the LLM content is unavailable without advertising a retrieval path. Designing a safe retrieval API (permissions, token budget of retrieved content, recursion prevention) is its own scope. | Agent builders report repeated frustration with "content unavailable"; product decision to support in-turn full-artifact access. |


🟡 Minor: Future artifact retrieval tool would require stub wording migration

Issue: DEC-03 locks "stub advertises no retrieval path," but if an artifact retrieval tool is added later, the recommendation field wording becomes outdated. This creates mild product debt.

Why: The stub wording is a de-facto API contract. The spec notes this as "Explored" but doesn't specify the migration story.

Fix: Add a note: "If an artifact retrieval tool is added, the oversized stub recommendation field should be updated to include it (e.g., 'retrieve the full artifact using get_artifact_full(id)'). This is a backward-compatible addition."

github-actions · 2026-04-14T17:56:41Z

Preview URLs

Use these stable preview aliases for testing this PR:

UI: https://pr-3129-ui.preview.inkeep.com
API: https://pr-3129-api.preview.inkeep.com
API health: https://pr-3129-api.preview.inkeep.com/health

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

UI deployment: https://agents-manage-ouz7q1hya.preview.inkeep.com
API deployment: https://agents-ibhxuoqvy.preview.inkeep.com

- Correct @ai-sdk/provider version citation: 3.0.4 → 3.0.2 (verified middleware contract identical across both via dist/index.d.ts diff) - Fix formatOversizedReason → formatOversizedRetrievalReason (actual function exported from artifact-utils.ts:50) - Clarify R8 telemetry: all attributes are span attributes, not log fields - Remove local filesystem path from §17 References - Add End-User visibility gap to Risks (MEDIUM) with operator mitigation - Add Traces UI surfacing, enriched stub metadata, and retrieval-tool migration note to Future Work - Add docs-check step to Verification Plan No design changes. All review findings addressed. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review Scope

This is a delta-scoped re-review of commit f0175d6a2 ("spec: address PR review feedback (version pin, typos, telemetry clarity)").

Prior Feedback Status

All 9 findings from the previous review cycle have been properly addressed:

Prior Finding	Resolution
🟠 Version mismatch (`@ai-sdk/[email protected]` → `3.0.2`)	✅ §6.2 updated to cite `3.0.2` with verification note that 3.0.4 has identical middleware contract
🟠 Function name typo (`formatOversizedReason`)	✅ Corrected to `formatOversizedRetrievalReason` globally
🟡 Telemetry ambiguity (span vs log fields)	✅ §6.5 clarified: all are span attributes (not log fields)
🟡 Future artifact retrieval migration story	✅ Migration note added to §15 Future Work entry
💭 End-User visibility gap	✅ New row added to §13 Risks
💭 Docs accompaniment	✅ Docs check added to §14 Verification Plan
pullfrog: Local filesystem path in References	✅ Removed; replaced with in-repo pointers
pullfrog: R8 telemetry ambiguity	✅ Same as above — clarified as span attributes
(secondary) Enriched stub metadata	✅ Added to §15 Future Work (Noted tier)

Changelog Quality

The meta/_changelog.md update provides excellent traceability:

Each finding is categorized (Accepted / Partially accepted / Declined)
Verification evidence is documented (e.g., diff of dist/index.d.ts between versions)
Rationale for partial acceptance is clear (e.g., "added to Risks / Future Work rather than design changes")

✅ APPROVE

Summary: The spec is ready for implementation. All prior review feedback has been thoughtfully addressed — corrections were applied where warranted, and items outside the spec's scope were properly logged in Risks or Future Work sections. The changelog audit trail is exemplary.

Note: GitHub App cannot submit formal approvals. This is a recommendation to approve.

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	0	0	0	0	0	0	0
Total	0	0	0	0	0	0	0

Note: Delta-scoped review — no sub-reviewers dispatched as prior feedback was fully addressed.

vercel Bot deployed to Preview – agents-docs April 14, 2026 17:49 View deployment

vercel Bot deployed to Preview – agents-manage-ui April 14, 2026 17:49 View deployment

vercel Bot deployed to Preview – agents-api April 14, 2026 17:49 View deployment

pullfrog Bot reviewed Apr 14, 2026

View reviewed changes

claude Bot reviewed Apr 14, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Apr 14, 2026

vercel Bot deployed to Preview – agents-manage-ui April 14, 2026 18:41 View deployment

vercel Bot deployed to Preview – agents-docs April 14, 2026 18:41 View deployment

claude Bot reviewed Apr 14, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Apr 14, 2026

vercel Bot deployed to Preview – agents-api April 14, 2026 18:43 View deployment

tim-inkeep mentioned this pull request Apr 14, 2026

feat: reactive mid-generation compression + artifact exclusion #3133

Open

7 tasks

shagun-singh-inkeep enabled auto-merge (squash) April 15, 2026 19:33

shagun-singh-inkeep disabled auto-merge April 15, 2026 19:33


		## 17. References

		- Plan file: `/Users/timothycardona/.claude/plans/rustling-prancing-beaver.md`


		### 6.2 Middleware (`compressionRetryMiddleware.ts`)

		Critical shape correction. Per `@ai-sdk/[email protected]` (`index.d.ts:2066-2088`), `wrapGenerate` / `wrapStream` receive `doGenerate` / `doStream` as nullary functions — they bind the already-transformed params. To retry with modified params, call `options.model.doGenerate(modifiedOptions)` (or `options.model.doStream(modifiedOptions)`) directly.

Conversation

tim-inkeep commented Apr 14, 2026

Summary

What changes (per the spec)

Spec artifacts

Key design decisions

Test plan

Uh oh!

vercel Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

github-actions Bot commented Apr 14, 2026

Uh oh!

pullfrog Bot commented Apr 14, 2026

TL;DR

Key Changes

SPEC.md -- Main Specification

evidence/current-compression-triggers.md -- Compression Trigger Audit

evidence/middleware-approach.md -- Why Middleware Works

evidence/oversized-artifact-handling.md -- Artifact Path Analysis

evidence/provider-overflow-signals.md -- Provider Error Taxonomy

meta/_changelog.md -- Spec Process Log

meta/audit-findings.md -- Spec Audit Report

meta/design-challenge.md -- Adversarial Design Review

Uh oh!

pullfrog Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

TL;DR

What Changed (latest push)

Key Changes

Main Spec (SPEC.md)

Evidence: Current Compression Triggers

Evidence: Middleware Approach

Evidence: Oversized Artifact Handling

Evidence: Provider Overflow Signals

Meta: Audit Findings

Meta: Design Challenge

Meta: Changelog

Uh oh!

pullfrog Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot left a comment

Choose a reason for hiding this comment

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟠⚠️ Major (2) 🟠⚠️

🟡 Minor (2) 🟡

💭 Consider (2) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

claude Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Preview URLs

Uh oh!

claude Bot left a comment

vercel Bot commented Apr 14, 2026 •

edited

Loading

changeset-bot Bot commented Apr 14, 2026 •

edited

Loading

pullfrog Bot commented Apr 14, 2026 •

edited

Loading

Main Spec (`SPEC.md`)

github-actions Bot commented Apr 14, 2026 •

edited

Loading