Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Claude finished @nick-inkeep's task —— View job Code Review: PR #198 - Docs Simplify AuthI'll provide a comprehensive TypeScript Staff Engineer and System Architect review of this authentication simplification. Todo List:
Analysis SummaryThis PR appears to be the initial commit of the Inkeep Agent Framework with documentation focused on "simplifying auth". The codebase includes sophisticated multi-API authentication patterns across two main services (Run API and Manage API) with comprehensive documentation. Comprehensive Code Review🏗️ System Architecture: Authentication DesignStrengths:
Architectural Concerns: 1. Authentication Complexity vs "Simplification"
|
…e + preview paths (inkeep#198) (inkeep#3172) * fix(ci): close remaining silent strandings in release cascade + bridge + preview paths Bundle of 10 code + 1 doc changes, each closing a distinct silent-failure mode where CI would pass green (or sit red invisibly) while production work stranded. Complement to inkeep#194's release-cascade hardening — each fix here addresses a gap that pass missed. Release cascade (5) - release-handler.yml: pin GitHub Release to client_payload.commit_sha, not origin/main. Auto-format landing between sync and handler used to create a SHA drift where Vercel deployed a newer commit than npm got. - release-handler.yml: 3-attempt bash retry around gh release view/edit/ create. Mirrors the dispatch-side retry from inkeep#174 — one transient 5xx used to leave npm published with no Release (so no Vercel deploy) and no alert. - release-handler.yml: notify-on-failure job. If the success job itself errors (app-token, checkout, or handler logic), the dispatch was already delivered and public-side retries don't re-fire. Without this notifier, npm publishes but the cascade stops silently. - public-mirror-sync.yml: 3-attempt retry on sync PR approval. One dropped GH API call used to leave the sync PR open with no approval; now exhaustion exits non-zero so the workflow turns red and the janitor's sync-PR sweep (below) catches long-open PRs as a safety net. - scripts/check-monorepo-traps.mjs: flip check:override-masks-bump from warn to hard-fail. Current tree is clean of masks, so flipping is safe and prevents future PR inkeep#170-class silent divergence. Preview envs (2) - deploy-vercel-preview.sh: validate captured deployment URL matches *.vercel.app pattern before alias-set. The log-grep fallback could silently capture a docs/inspection URL and alias api.preview.inkeep.com to the wrong place. - public-agents-preview-environments.yml: add teardown-failure-notify job. Railway and Vercel teardowns run in parallel by design; this closes the observability gap when either silently fails — Railway continues billing, Vercel env vars pollute, and the next 6h janitor sweep re-attempts cleanup. Dependency sync + bridge (2) - dependabot-sync-root-lockfile.yml: on push failure, full-reset onto origin/$HEAD_REF and re-regenerate both lockfiles (root + public/agents) from the new baseline. Previous simple rebase would replay our stale lockfile on top of a different package.json after a Dependabot force-push, yielding inconsistent committed state that either broke --frozen-lockfile or silently shipped drift into main. - bridge-public-pr-to-monorepo.mjs (both copies): truncate bridged PR body to fit GitHub's 65,536-char limit, with a link to the original PR. Was failing 422 "body is too long" on Dependabot mega-bumps, stranding those PRs outside the agents-private review surface. Janitor (1) - public-agents-preview-janitor.yml: new sweep-stale-sync-prs job. Every 6h, identify open copybara/sync PRs on inkeep/agents >4h old, open (idempotent) tracking issue on agents-private, re-dispatch public-mirror-sync.yml. Does NOT auto-close the PR — destructive actions stay human-driven. Docs (1) - CI_RUNBOOK.md: partial-npm-publish recovery path. Documents what to do when pnpm changeset publish fails on package N of 10 (some live, some not): confirm via npm view, re-run the failed job (publish is idempotent), verify all 10 present before unblocking cascade. Each change verified in isolation with pnpm check:structural; YAML validated with safe_load; bridge truncation logic tested at boundary. * fix(ci): add checkout step and remove heredoc indentation in release-handler notify Addresses review feedback on PR inkeep#198: - notify-success-failure now does a Checkout before gh calls, matching the failure job pattern (without it, gh CLI lacks git context and issue-create would silently fail with 'fatal: not a git repository'). - Body is now built with echo into a temp file instead of a heredoc. The original 10-space indent inside run: | would have been preserved literally by bash, making the tracking issue body unreadable. * fix(ci): address review feedback on release-handler retry + docs drift Review feedback from inkeep#198 (Claude + Pullfrog): release-handler.yml: - Capture stderr in gh_with_retry so on-call sees the actual API error (rate-limit, 5xx, auth) instead of a generic 'command failed' line. Was making incident triage 15-30m slower. (Claude Major) - Stop wrapping 'gh release view' in gh_with_retry. It returns non-zero for the legitimate 'release doesn't exist' case (the normal new-release happy path); retrying burned 15s of backoff before falling through to create. Only mutating calls (edit/create) need retry. (Pullfrog + Claude) - notify-success-failure now fails loudly (exit 1) if gh issue create fails, matching the failure job's pattern. Silent swallow would mean the success handler failed AND the notifier failed with no visible signal beyond a red check — exactly the class this PR closes elsewhere. (Claude Major) Docs drift (Pullfrog): - AGENTS.md: remove '(soft today)' / 'non-blocking today' annotations from override-masks-bump in both the cheatsheet and the trap-list. - AGENTS.md trap inkeep#4: flip 'Currently a warning, not a hard fail' to 'Hard-fail (flipped...)' to match the code. - CI_RUNBOOK.md: rename the override-masks-bump section header from 'warns (non-blocking today)' to 'fails — root override masks a workspace bump'. Preview janitor (Claude Consider): - Document that sweep-stale-sync-prs runs unconditionally regardless of PREVIEW_ENVIRONMENTS_ENABLED. Pr body said so; workflow didn't. Bridge script (Claude Consider): - console.log when PR body is truncated, with original + new sizes. Helps CI log debugging when a bridged PR looks shorter than expected. Applied to both copies (public/agents + public/agents-optional-local-dev). Intentionally not addressed: - dependabot-sync $MSG staleness after reset: Pullfrog flagged as intentional (message describes the operation, not the specific baseline), Claude flagged as cosmetic minor. Siding with Pullfrog. - Preview teardown notify swallowed error: lower severity given the janitor re-attempts cleanup every 6h; accepting current pattern. * fix(ci): pre-fetch PR base into agents-private before git apply --3way Closes the bridge-failure class seen on inkeep#3171 (and eventually on any bridged PR with a conflicting hunk). Root cause: `git apply --index --3way` resolves the patch's `index <old>..<new>` blob SHAs against the local repo's object store. The SHAs come from inkeep/agents' object graph; agents-private has never fetched from inkeep/agents, so those SHAs aren't resolvable. Clean patches work (no --3way fallback needed); any conflict triggers --3way, which fails with 'repository lacks the necessary blob to perform 3-way merge' — bridge stops with no useful diagnostic, PR strands outside the canonical review surface. Fix: Before calling git apply, shallow-fetch the PR's base commit from the public repo into agents-private's object store. GitHub permits fetching by SHA when the SHA is reachable from a ref (PR base is on main, always reachable), and shallow=1 keeps it cheap. After fetch, --3way can find the `<old>` blob and resolve conflicts into merge markers instead of a hard error. Why not drop --3way: Dropping --3way makes every conflicting PR fail hard with no recovery path. Pre-fetching preserves the --3way fallback so conflicting hunks produce visible conflict markers in the bridged PR, which is reviewable rather than opaque. Failure handling: If the fetch itself fails (rate-limit, network flake), we log and proceed. The subsequent git apply will hit the same blob error as before the fix — no regression, just no improvement for that run. Applied to both script copies: public/agents and public/agents-optional-local-dev. Bundled into PR inkeep#198 since it's the same release/sync hardening scope. * Revert "fix(ci): pre-fetch PR base into agents-private before git apply --3way" This reverts commit e7e0640e2345bbd7cdea1185eb3abc666f2c60fa. * fix(ci): switch bridge patch fetch from .patch to .diff media type Root cause of bridge failures on multi-commit PRs (eg inkeep#3171, 9 commits): `gh api .../pulls/N Accept: application/vnd.github.patch` returns a mailbox-format patch series — one patch per PR commit, each with `index <old>..<new>` lines referencing INTERMEDIATE blob SHAs created during the PR's history on inkeep/agents. Those intermediates exist only in inkeep/agents' object store; agents-private has never seen them. The first patch applies against agents-private's file (blob content matches the PR base even if SHAs differ), git's index advances to the intermediate state (only referenced by SHA, never fetched as an object). Any subsequent conflicting hunk triggers `git apply --3way` fallback, which needs the intermediate blob → 'repository lacks the necessary blob' → stranded PR. Single-commit PRs don't hit this (no intermediates). That's why the bridge used to work for most Dependabot bumps but started failing on longer branches like inkeep#3171. Fix: switch `Accept: application/vnd.github.patch` → `Accept: application/vnd.github.diff` in both bridge-public-pr-to- monorepo.mjs copies. .diff returns a single consolidated base-vs-head unified diff with only the PR's base blob SHAs referenced. Those are content-identical to agents-private's copies (Copybara mirrors blobs 1:1 without content rewrites for .ts files), so --3way can resolve them locally from agents-private's own object store. Also updated githubRequest's text-response check to include .diff alongside .patch (both are text, not JSON). Previous pre-fetch commit (`e7e0640e2`) was reverted in a prior commit — it was addressing the wrong layer (only PR-base blob, not intermediates). This commit supersedes that approach entirely. * chore(ci): biome format fix on bridge script console.log GitOrigin-RevId: a26d5dac74ed9938fc512703af21a25d82616504 Co-authored-by: Varun Varahabhotla <[email protected]>
No description provided.