changes to instrumentation#150
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Claude finished @shagun-singh-inkeep's task —— View job PR Review: Instrumentation ChangesTodo List:
SummaryThis PR refactors the OpenTelemetry instrumentation setup in Key Changes Analyzed✅ Positive Changes
|
bab6262 to
17cb307
Compare
|
Claude finished @shagun-singh-inkeep's task —— View job Claude finished @shagun-singh-inkeep's task —— View job PR Review: Instrumentation ChangesTodo List:
SummaryThis PR refactors the OpenTelemetry instrumentation setup in Key Changes Analyzed✅ Positive Changes
|
* Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs
Problem observed today: every push to main for the last ~18 hours failed
the Public Mirror Sync workflow with
Failed to push to inkeep/agents [HEAD:refs/heads/copybara/sync],
because local/origin history is behind destination
Root cause is two-part:
1. inkeep/agents `main` branch protection was tightened to require one
approval on every PR (require_last_push_approval=true). Sync PRs
authored by the inkeep-oss-sync app are never approved, so they sit
BLOCKED with auto-merge enabled but no approver. Bypass actor status
on the ruleset does NOT propagate through GitHub's auto-merge path.
2. Once one sync PR is stuck open, copybara on every subsequent agents-
private push tries to push new commits to `copybara/sync` but the
branch has diverging history from our local generation, so the push
is rejected. The existing cleanup step explicitly leaves the branch
alone while a sync PR is open, so the workflow wedges indefinitely.
Fix:
- Add `public/agents/.github/workflows/auto-approve-sync.yml`. Copybara
mirrors files under `public/agents/**` into inkeep/agents, so this
workflow ends up at `.github/workflows/auto-approve-sync.yml` on the
public repo. It listens on pull_request_target for PRs authored by
inkeep-oss-sync[bot] against the copybara/sync head ref and posts an
approving review from github-actions[bot]. Handles the synchronize
event too so the approval is refreshed after every copybara push
(require_last_push_approval dismisses prior approvals on new commits).
- Rework the "Clean up stale copybara/sync branch" step into a proper
reconciliation step. Three cases now handled:
(a) previous sync PR merged (squashed) — delete the stale branch
(b) open sync PR older than STALE_PR_HOURS (default 2h) — close
the PR and delete the branch so this run opens a fresh one
with all accumulated changes
(c) open sync PR recent — leave as-is; copybara appends commits
Staff-engineer framing: (1) fixes the approval gap at the source
(inkeep/agents) so no human is in the loop for routine syncs, while
(2) ensures the agents-private sync workflow self-heals if the approval
path ever breaks again (bot outage, policy tightening, etc.) instead
of silently wedging.
Bootstrap: PR #3150 (the currently-stuck sync PR) was manually approved
and enqueued to unblock today. Once this PR merges to agents-private
and the next sync lands the auto-approve workflow on inkeep/agents,
subsequent syncs will no longer need manual approval.
* Address review: harden reconcile error handling, elevate warnings, add timeout
Pullfrog + Claude bot review on #150:
- Separate "gh pr list failed" from "no open PR found" in the reconcile
step. A silent API failure previously fell through to the branch-
existence block and could delete the branch of a live PR. Now we exit
early with ::warning:: and skip reconciliation for this run (Copybara
may still succeed if the branch state is clean).
- Remove the `exit 0` after closing a stale PR so execution falls through
to the branch-existence fallback. If the DELETE of copybara/sync fails
on the primary path, the fallback block gets a second attempt.
- Elevate all cleanup/failure `|| echo` fallbacks to ::warning::
annotations so operators see them in the workflow summary instead of
buried in stdout.
- Replace stale comment on the `gh pr merge` call. The oss-sync app's
bypass-actor status does NOT propagate through auto-merge (that's the
whole reason auto-approve-sync.yml now exists). Document the real
mechanism.
- Add `timeout-minutes: 5` to the approve job in auto-approve-sync.yml.
GitOrigin-RevId: 9dcc4f2f123f531a6c303cb8f43a4d096384c736
* Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs
Problem observed today: every push to main for the last ~18 hours failed
the Public Mirror Sync workflow with
Failed to push to inkeep/agents [HEAD:refs/heads/copybara/sync],
because local/origin history is behind destination
Root cause is two-part:
1. inkeep/agents `main` branch protection was tightened to require one
approval on every PR (require_last_push_approval=true). Sync PRs
authored by the inkeep-oss-sync app are never approved, so they sit
BLOCKED with auto-merge enabled but no approver. Bypass actor status
on the ruleset does NOT propagate through GitHub's auto-merge path.
2. Once one sync PR is stuck open, copybara on every subsequent agents-
private push tries to push new commits to `copybara/sync` but the
branch has diverging history from our local generation, so the push
is rejected. The existing cleanup step explicitly leaves the branch
alone while a sync PR is open, so the workflow wedges indefinitely.
Fix:
- Add `public/agents/.github/workflows/auto-approve-sync.yml`. Copybara
mirrors files under `public/agents/**` into inkeep/agents, so this
workflow ends up at `.github/workflows/auto-approve-sync.yml` on the
public repo. It listens on pull_request_target for PRs authored by
inkeep-oss-sync[bot] against the copybara/sync head ref and posts an
approving review from github-actions[bot]. Handles the synchronize
event too so the approval is refreshed after every copybara push
(require_last_push_approval dismisses prior approvals on new commits).
- Rework the "Clean up stale copybara/sync branch" step into a proper
reconciliation step. Three cases now handled:
(a) previous sync PR merged (squashed) — delete the stale branch
(b) open sync PR older than STALE_PR_HOURS (default 2h) — close
the PR and delete the branch so this run opens a fresh one
with all accumulated changes
(c) open sync PR recent — leave as-is; copybara appends commits
Staff-engineer framing: (1) fixes the approval gap at the source
(inkeep/agents) so no human is in the loop for routine syncs, while
(2) ensures the agents-private sync workflow self-heals if the approval
path ever breaks again (bot outage, policy tightening, etc.) instead
of silently wedging.
Bootstrap: PR #3150 (the currently-stuck sync PR) was manually approved
and enqueued to unblock today. Once this PR merges to agents-private
and the next sync lands the auto-approve workflow on inkeep/agents,
subsequent syncs will no longer need manual approval.
* Address review: harden reconcile error handling, elevate warnings, add timeout
Pullfrog + Claude bot review on #150:
- Separate "gh pr list failed" from "no open PR found" in the reconcile
step. A silent API failure previously fell through to the branch-
existence block and could delete the branch of a live PR. Now we exit
early with ::warning:: and skip reconciliation for this run (Copybara
may still succeed if the branch state is clean).
- Remove the `exit 0` after closing a stale PR so execution falls through
to the branch-existence fallback. If the DELETE of copybara/sync fails
on the primary path, the fallback block gets a second attempt.
- Elevate all cleanup/failure `|| echo` fallbacks to ::warning::
annotations so operators see them in the workflow summary instead of
buried in stdout.
- Replace stale comment on the `gh pr merge` call. The oss-sync app's
bypass-actor status does NOT propagate through auto-merge (that's the
whole reason auto-approve-sync.yml now exists). Document the real
mechanism.
- Add `timeout-minutes: 5` to the approve job in auto-approve-sync.yml.
GitOrigin-RevId: 9dcc4f2f123f531a6c303cb8f43a4d096384c736
* Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs
Problem observed today: every push to main for the last ~18 hours failed
the Public Mirror Sync workflow with
Failed to push to inkeep/agents [HEAD:refs/heads/copybara/sync],
because local/origin history is behind destination
Root cause is two-part:
1. inkeep/agents `main` branch protection was tightened to require one
approval on every PR (require_last_push_approval=true). Sync PRs
authored by the inkeep-oss-sync app are never approved, so they sit
BLOCKED with auto-merge enabled but no approver. Bypass actor status
on the ruleset does NOT propagate through GitHub's auto-merge path.
2. Once one sync PR is stuck open, copybara on every subsequent agents-
private push tries to push new commits to `copybara/sync` but the
branch has diverging history from our local generation, so the push
is rejected. The existing cleanup step explicitly leaves the branch
alone while a sync PR is open, so the workflow wedges indefinitely.
Fix:
- Add `public/agents/.github/workflows/auto-approve-sync.yml`. Copybara
mirrors files under `public/agents/**` into inkeep/agents, so this
workflow ends up at `.github/workflows/auto-approve-sync.yml` on the
public repo. It listens on pull_request_target for PRs authored by
inkeep-oss-sync[bot] against the copybara/sync head ref and posts an
approving review from github-actions[bot]. Handles the synchronize
event too so the approval is refreshed after every copybara push
(require_last_push_approval dismisses prior approvals on new commits).
- Rework the "Clean up stale copybara/sync branch" step into a proper
reconciliation step. Three cases now handled:
(a) previous sync PR merged (squashed) — delete the stale branch
(b) open sync PR older than STALE_PR_HOURS (default 2h) — close
the PR and delete the branch so this run opens a fresh one
with all accumulated changes
(c) open sync PR recent — leave as-is; copybara appends commits
Staff-engineer framing: (1) fixes the approval gap at the source
(inkeep/agents) so no human is in the loop for routine syncs, while
(2) ensures the agents-private sync workflow self-heals if the approval
path ever breaks again (bot outage, policy tightening, etc.) instead
of silently wedging.
Bootstrap: PR #3150 (the currently-stuck sync PR) was manually approved
and enqueued to unblock today. Once this PR merges to agents-private
and the next sync lands the auto-approve workflow on inkeep/agents,
subsequent syncs will no longer need manual approval.
* Address review: harden reconcile error handling, elevate warnings, add timeout
Pullfrog + Claude bot review on #150:
- Separate "gh pr list failed" from "no open PR found" in the reconcile
step. A silent API failure previously fell through to the branch-
existence block and could delete the branch of a live PR. Now we exit
early with ::warning:: and skip reconciliation for this run (Copybara
may still succeed if the branch state is clean).
- Remove the `exit 0` after closing a stale PR so execution falls through
to the branch-existence fallback. If the DELETE of copybara/sync fails
on the primary path, the fallback block gets a second attempt.
- Elevate all cleanup/failure `|| echo` fallbacks to ::warning::
annotations so operators see them in the workflow summary instead of
buried in stdout.
- Replace stale comment on the `gh pr merge` call. The oss-sync app's
bypass-actor status does NOT propagate through auto-merge (that's the
whole reason auto-approve-sync.yml now exists). Document the real
mechanism.
- Add `timeout-minutes: 5` to the approve job in auto-approve-sync.yml.
GitOrigin-RevId: 9dcc4f2f123f531a6c303cb8f43a4d096384c736
* Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs
Problem observed today: every push to main for the last ~18 hours failed
the Public Mirror Sync workflow with
Failed to push to inkeep/agents [HEAD:refs/heads/copybara/sync],
because local/origin history is behind destination
Root cause is two-part:
1. inkeep/agents `main` branch protection was tightened to require one
approval on every PR (require_last_push_approval=true). Sync PRs
authored by the inkeep-oss-sync app are never approved, so they sit
BLOCKED with auto-merge enabled but no approver. Bypass actor status
on the ruleset does NOT propagate through GitHub's auto-merge path.
2. Once one sync PR is stuck open, copybara on every subsequent agents-
private push tries to push new commits to `copybara/sync` but the
branch has diverging history from our local generation, so the push
is rejected. The existing cleanup step explicitly leaves the branch
alone while a sync PR is open, so the workflow wedges indefinitely.
Fix:
- Add `public/agents/.github/workflows/auto-approve-sync.yml`. Copybara
mirrors files under `public/agents/**` into inkeep/agents, so this
workflow ends up at `.github/workflows/auto-approve-sync.yml` on the
public repo. It listens on pull_request_target for PRs authored by
inkeep-oss-sync[bot] against the copybara/sync head ref and posts an
approving review from github-actions[bot]. Handles the synchronize
event too so the approval is refreshed after every copybara push
(require_last_push_approval dismisses prior approvals on new commits).
- Rework the "Clean up stale copybara/sync branch" step into a proper
reconciliation step. Three cases now handled:
(a) previous sync PR merged (squashed) — delete the stale branch
(b) open sync PR older than STALE_PR_HOURS (default 2h) — close
the PR and delete the branch so this run opens a fresh one
with all accumulated changes
(c) open sync PR recent — leave as-is; copybara appends commits
Staff-engineer framing: (1) fixes the approval gap at the source
(inkeep/agents) so no human is in the loop for routine syncs, while
(2) ensures the agents-private sync workflow self-heals if the approval
path ever breaks again (bot outage, policy tightening, etc.) instead
of silently wedging.
Bootstrap: PR #3150 (the currently-stuck sync PR) was manually approved
and enqueued to unblock today. Once this PR merges to agents-private
and the next sync lands the auto-approve workflow on inkeep/agents,
subsequent syncs will no longer need manual approval.
* Address review: harden reconcile error handling, elevate warnings, add timeout
Pullfrog + Claude bot review on #150:
- Separate "gh pr list failed" from "no open PR found" in the reconcile
step. A silent API failure previously fell through to the branch-
existence block and could delete the branch of a live PR. Now we exit
early with ::warning:: and skip reconciliation for this run (Copybara
may still succeed if the branch state is clean).
- Remove the `exit 0` after closing a stale PR so execution falls through
to the branch-existence fallback. If the DELETE of copybara/sync fails
on the primary path, the fallback block gets a second attempt.
- Elevate all cleanup/failure `|| echo` fallbacks to ::warning::
annotations so operators see them in the workflow summary instead of
buried in stdout.
- Replace stale comment on the `gh pr merge` call. The oss-sync app's
bypass-actor status does NOT propagate through auto-merge (that's the
whole reason auto-approve-sync.yml now exists). Document the real
mechanism.
- Add `timeout-minutes: 5` to the approve job in auto-approve-sync.yml.
GitOrigin-RevId: 9dcc4f2f123f531a6c303cb8f43a4d096384c736
…#171) * fix: approve sync PRs from source side, drop target-repo auto-approve Moves copybara sync PR approval into public-mirror-sync.yml on agents-private using the INTERNAL_CI_APP token. Deletes the target-repo auto-approve-sync.yml workflow it replaces. Background PR #150 introduced auto-approve-sync.yml on the public tree to satisfy the required-approval rule that blocks copybara sync PRs on inkeep/agents. That design has a bootstrap problem: pull_request_target evaluates workflows from the target branch, so the very PR that first mirrors auto-approve-sync.yml to inkeep/agents/main cannot auto-approve itself. inkeep/agents PR #3157 is stuck in exactly that state, and because copybara/sync is blocked by that stuck PR, subsequent mirror runs on agents-private are failing at "Run agents mirror". Fix Approve from the source side instead. After public-mirror-sync.yml creates or updates the sync PR and enables auto-merge, a new step mints a second app token from INTERNAL_CI_APP (a different identity from inkeep-oss-sync, since GitHub blocks self-approval) and posts an approving review. The approval is scoped to the current head SHA and re-runs on every sync, which satisfies require_last_push_approval=true. - .github/workflows/public-mirror-sync.yml Adds Generate approver token step and Approve sync PR step. Skips re-approval when an approval already stands on the current head SHA. - public/agents/.github/workflows/auto-approve-sync.yml Deleted; replaced by the source-side step above. Prereq to verify once: INTERNAL_CI_APP is installed on inkeep/agents with Pull requests: Read and write. The app already powers Version Packages and release-handler paths that write to inkeep/agents, so the install is almost certainly already in place. * docs: update CI.md and runbook to reflect source-side sync approval Review feedback on #171: CI.md and CI_RUNBOOK.md still pointed at the deleted public/agents/.github/workflows/auto-approve-sync.yml. - CI.md: drop the row for auto-approve-sync.yml in the public workflows table; in the public/private parity table, replace the old row with one explaining approval now comes from public-mirror-sync.yml on the source side; update the OSS_SYNC_APP secrets row to drop "auto-approve sync" and add Copybara sync PR approval to INTERNAL_CI_APP usage. - CI_RUNBOOK.md: rewrite the "Sync PR on public repo is stuck open" entry to point operators at the Approve sync PR step and INTERNAL_CI_APP app installation instead of the deleted workflow. GitOrigin-RevId: 51edfa6a3a1e0863ba99d8498efad88c5dec1f06
* Delete duplicate Vercel workflows from public/agents (inkeep#152) These three files duplicated agents-private's root-level workflows: public/agents/.github/workflows/vercel-production.yml public/agents/.github/workflows/preview-environments.yml public/agents/.github/workflows/preview-janitor.yml Vercel projects (agents-api, agents-manage-ui, agents-docs) are owned and deployed from agents-private. Running duplicate workflows on the copybara-mirrored inkeep/agents repo means every sync fires BOTH repos' workflows against the same Vercel projects — wasted CI minutes at best, race-condition deploys at worst (both repos carry VERCEL_PROJECT_ID secrets targeting the same projects). The agents-private root-level replacements are: .github/workflows/public-agents-vercel-production.yml .github/workflows/public-agents-preview-environments.yml .github/workflows/public-agents-preview-janitor.yml Under the new release-split architecture (PR inkeep#144), the public mirror's only CI responsibility is the npm publish step in release.yml. Deleting these three keeps that scope clean. Copybara will propagate the deletions to inkeep/agents on next sync — the origin_files glob covers public/agents/** with no exclude for these filenames, so their disappearance from the source will delete them from the destination. GitOrigin-RevId: 2fa1f228c728dadf17bedbae88d776a11a8e127b * Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs (inkeep#150) * Fix Public Mirror Sync: auto-approve sync PRs + handle stuck PRs Problem observed today: every push to main for the last ~18 hours failed the Public Mirror Sync workflow with Failed to push to inkeep/agents [HEAD:refs/heads/copybara/sync], because local/origin history is behind destination Root cause is two-part: 1. inkeep/agents `main` branch protection was tightened to require one approval on every PR (require_last_push_approval=true). Sync PRs authored by the inkeep-oss-sync app are never approved, so they sit BLOCKED with auto-merge enabled but no approver. Bypass actor status on the ruleset does NOT propagate through GitHub's auto-merge path. 2. Once one sync PR is stuck open, copybara on every subsequent agents- private push tries to push new commits to `copybara/sync` but the branch has diverging history from our local generation, so the push is rejected. The existing cleanup step explicitly leaves the branch alone while a sync PR is open, so the workflow wedges indefinitely. Fix: - Add `public/agents/.github/workflows/auto-approve-sync.yml`. Copybara mirrors files under `public/agents/**` into inkeep/agents, so this workflow ends up at `.github/workflows/auto-approve-sync.yml` on the public repo. It listens on pull_request_target for PRs authored by inkeep-oss-sync[bot] against the copybara/sync head ref and posts an approving review from github-actions[bot]. Handles the synchronize event too so the approval is refreshed after every copybara push (require_last_push_approval dismisses prior approvals on new commits). - Rework the "Clean up stale copybara/sync branch" step into a proper reconciliation step. Three cases now handled: (a) previous sync PR merged (squashed) — delete the stale branch (b) open sync PR older than STALE_PR_HOURS (default 2h) — close the PR and delete the branch so this run opens a fresh one with all accumulated changes (c) open sync PR recent — leave as-is; copybara appends commits Staff-engineer framing: (1) fixes the approval gap at the source (inkeep/agents) so no human is in the loop for routine syncs, while (2) ensures the agents-private sync workflow self-heals if the approval path ever breaks again (bot outage, policy tightening, etc.) instead of silently wedging. Bootstrap: PR inkeep#3150 (the currently-stuck sync PR) was manually approved and enqueued to unblock today. Once this PR merges to agents-private and the next sync lands the auto-approve workflow on inkeep/agents, subsequent syncs will no longer need manual approval. * Address review: harden reconcile error handling, elevate warnings, add timeout Pullfrog + Claude bot review on inkeep#150: - Separate "gh pr list failed" from "no open PR found" in the reconcile step. A silent API failure previously fell through to the branch- existence block and could delete the branch of a live PR. Now we exit early with ::warning:: and skip reconciliation for this run (Copybara may still succeed if the branch state is clean). - Remove the `exit 0` after closing a stale PR so execution falls through to the branch-existence fallback. If the DELETE of copybara/sync fails on the primary path, the fallback block gets a second attempt. - Elevate all cleanup/failure `|| echo` fallbacks to ::warning:: annotations so operators see them in the workflow summary instead of buried in stdout. - Replace stale comment on the `gh pr merge` call. The oss-sync app's bypass-actor status does NOT propagate through auto-merge (that's the whole reason auto-approve-sync.yml now exists). Document the real mechanism. - Add `timeout-minutes: 5` to the approve job in auto-approve-sync.yml. GitOrigin-RevId: 9dcc4f2f123f531a6c303cb8f43a4d096384c736 * fix(create-agents-template): narrow postinstall VERCEL guard to opt-in env var (inkeep#140) Same class of bug as inkeep#137 (chat-to-edit postinstall), but shipped in the customer-facing template. Every downstream user who scaffolds this template and deploys to Vercel hits the same ambient postinstall landmine. Background: create-agents-template/scripts/postinstall.ts ran if (process.env.VERCEL === '1' && !skip) { execSync('inkeep dev --export --output-dir ./apps/manage-ui', ...) execSync('pnpm -C apps/manage-ui install ...', ...) } The gate `VERCEL === '1'` is true on every Vercel build anywhere. Today the template only has one Vercel target (apps/agents-api), so the gate happened to correlate with the intended context. But: 1. Any customer who adds a second Vercel project targeting a different app (e.g. apps/mcp) silently fires the export, which assumes ./apps/manage-ui exists and `inkeep` is on PATH - neither is guaranteed for non-agents-api deploys. 2. Any Vercel deploy that installs with --production skips devDeps. `@inkeep/agents-cli` (which provides the `inkeep` CLI) is a devDep, so the postinstall would crash. 3. Replicates the exact failure mode that broke inkeep-cloud-mcp production in the main monorepo (inkeep#137). Fix: narrow to an explicit opt-in env var INKEEP_QUICKSTART_EXPORT=1 set only by the apps/agents-api vercel.json installCommand. The postinstall requires BOTH VERCEL=1 AND INKEEP_QUICKSTART_EXPORT=1 before running. Files: - scripts/postinstall.ts: require INKEEP_QUICKSTART_EXPORT=1 gate - apps/agents-api/vercel.json: prepend INKEEP_QUICKSTART_EXPORT=1 to existing installCommand Backward-compat impact: existing Vercel projects customers created from prior template versions will skip the export on their next deploy until they update their vercel.json (or re-scaffold from the template). They have two paths: - Pull in this template change (agents-api/vercel.json now carries the var) - Manually add INKEEP_QUICKSTART_EXPORT=1 to their installCommand Both recover cleanly. Existing apps/manage-ui exports from prior deploys remain on disk; the only loss is the refresh on the next deploy, which is repaired by either of the above actions. GitOrigin-RevId: a5fc5646e7fb2ae2c6f81cdc34453ed89d8191f9 * chore(ci): clean up stale monorepo-migration artifacts (inkeep#157) * chore(ci): clean up stale monorepo-migration artifacts - Delete public/agents/.github/workflows/coverage.yml.disabled. The workflow references package paths (execution-api, management-api, agent-builder) that no longer exist post-monorepo-migration, so it couldn't be re-enabled as-is. Also delete the companion doc public/agents/agents-docs/CI_SETUP.md, which was framed entirely around this workflow. Drop the stale row in public/agents/.agents/skills/internal-surface-areas/SKILL.md. - Add *.disabled to the copybara manifest exclude list so any future dead workflow files don't silently leak into the public mirror. Regenerated public-agents.bara.sky from the manifest. - Remove the dead chat-to-edit branch trigger from both public/agents/.github/workflows/release.yml and .github/workflows/public-agents-version-packages.yml. The branch returns 404 on both inkeep/agents-private and inkeep/agents. Out of scope (investigated, confirmed load-bearing): the agents-cli/** and agents-api/** path filters on public release.yml stay - both exist as top-level dirs on the public inkeep/agents repo (hoisted layout) and removing them would skip npm publishes for CLI/API-only changes. * Revert chat-to-edit branch removal (load-bearing) chat-to-edit branch on public inkeep/agents drives a dedicated snapshot publish step at public/agents/.github/workflows/release.yml lines 109-117 (pnpm changeset version --snapshot chat-to-edit). The branch returning 404 on both repos just means nobody has pushed to it right now; it's created on-demand for dev snapshot publishes. Keep the trigger in both release.yml and public-agents-version-packages.yml. GitOrigin-RevId: 5de85ee7359c4033fa2d3b7cd69910c57e1070ac * fix(preview): wire SigNoz to PR preview environments (inkeep#158) * fix(preview): wire SigNoz to PR preview environments Preview PR deployments had broken tracing: the manage-ui /api/traces proxy dropped the bypass secret so SigNoz health checks 401'd, and neither the Vercel projects nor agents-api had SIGNOZ_URL / otel exporter config. Adds a shared-observability Railway stack (signoz + otel-collector + clickhouse) and injects the endpoints + JWT-format API key at preview env provisioning time. - upsert-vercel-preview-env.sh: accept PREVIEW_SIGNOZ_URL / PREVIEW_SIGNOZ_API_KEY / PREVIEW_SIGNOZ_INGESTION_KEY / PREVIEW_OTEL_EXPORTER_OTLP_TRACES_ENDPOINT and upsert into both Vercel projects when set. - public-agents-preview-environments.yml: forward the new secrets into the upsert step. - agents-api signoz.ts: support JWT-format SIGNOZ_API_KEY via Authorization: Bearer (the shared preview instance exposes a rotating refresh token, not a PAT). Production PATs still go through the SIGNOZ-API-KEY header. - agents-manage-ui /api/traces: forward the manage-api bypass secret so SigNoz proxy calls authenticate through to agents-api. * fix(preview): export OTLP traces without requiring ingestion key Self-hosted SigNoz collector accepts unauthenticated OTLP, so gating OTEL_EXPORTER_OTLP_TRACES_ENDPOINT on PREVIEW_SIGNOZ_INGESTION_KEY was skipping trace export entirely. Split the guards, also wire OTLP env vars on the manage-ui project (which emits traces too) and add OTEL_RESOURCE_ATTRIBUTES so preview spans are filterable by PR branch. * refactor: address PR review polish items - signoz.ts: document why the 'eyJ' prefix safely discriminates JWT refresh tokens from SigNoz PATs (preview-stack workaround for broken v0.119 enterprise PAT endpoint). - traces/route.ts GET handler: reuse extractRequestContext instead of duplicating cookie + bypass-secret header construction. GitOrigin-RevId: 942b6298a47afca1c49c3ae06fb10bce04f4f27f * Move entitlement lock query to DAL layer (inkeep#113) * Move entitlement lock query to DAL, remove /auth/ exclusion from boundary check * style: auto-format with biome --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> GitOrigin-RevId: 6afc2828e2b2bd136280daece631469d44fafe39 * feat(apps): add quick actions editor to support_copilot config (inkeep#164) * feat(apps): add quick actions editor to support_copilot config Adds quickActions schema to agents-core, persists the field through the apps PATCH route, and introduces a drag-and-drop quick actions section in the support_copilot app form. * style: auto-format with biome * fix(apps): address review feedback on support_copilot quick actions - Add max length constraints (label: 100, prompt: 4000, group: 100) in both the agents-core schema and the manage-ui form validation. - Disable the "Add group" button when the last group has no name, so users can't stack empty groups. - Add screen-reader announcements to both the group-level and action-level DndContexts. - Fix pre-existing a11y lint errors (noLabelWithoutControl, noStaticElementInteractions) by switching the action popover to FormLabel and moving keyboard shortcuts onto the inputs themselves. * chore: sync public/agents lockfile for @dnd-kit deps The initial commit only regenerated the root pnpm-lock.yaml. Vercel builds from public/agents/ which uses its own lockfile, so the @dnd-kit/* deps added to agents-manage-ui/package.json must be present there too or Vercel fails with ERR_PNPM_OUTDATED_LOCKFILE. * chore(apps): drop unused SupportCopilotQuickActionFormInput type Only the group-level type is consumed (app-update-form.tsx). The single-action form type was exported but never imported, flagged by knip in CI. * test(api): regenerate openapi snapshot for SupportCopilotQuickAction schemas Adds the new SupportCopilotQuickAction and SupportCopilotQuickActionGroup schema entries (with min/max length constraints) and the quickActions field on SupportCopilotConfig that were introduced for the quick-actions editor in agents-manage-ui. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> GitOrigin-RevId: 32bce4fd36e8dd4ae12a8ab2ac3607ea29fdd2f1 * Agent conversation self-reference ({{$conversation.id}}) (inkeep#141) * spec: agent conversation self-reference Add SPEC.md with β-pure design (template-variable-only) for exposing {{$conversation.id}} as a prompt template variable. Includes evidence files (render-site inventory, A2A propagation trace, prior-art analysis, downstream-surfaces analysis, template-engine details), audit + design-challenge findings, and full changelog capturing the β→α→β-pure pivot history with rationale. Implementation follows in subsequent commits per tmp/ship/spec.json (US-001 through US-009). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]> * [US-001] Add TemplateEngine.renderPrompt() with PromptRenderOptions New static method resolves $-prefixed paths from runtime-provided builtins via dotted-path walk, falling through to the existing built-in dispatch on miss. render() signature and behavior are preserved byte-exact — only renderPrompt() accepts runtimeBuiltins, structurally enforcing the scope invariant that non-prompt render sites cannot opt into runtime builtins. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-002] Wire {{$conversation.id}} resolution in buildSystemPrompt Switch both agent-prompt render sites in buildSystemPrompt from TemplateEngine.render() to TemplateEngine.renderPrompt() and narrow the contextConfigId guard so agents without contextConfig and without {{$conversation.*}} in their prompt see zero behavior change (G4). - Extend runtimeContext type literal to include metadata.taskId - Compute runtimeBuiltins from runtimeContext.metadata.conversationId, treating the 'default' sentinel and undefined as empty - Narrow !resolvedContext path: only render when prompt contains {{$conversation., with preserveUnresolved=true so non-$conversation literals are preserved - Add 6 integration tests covering contextConfig present/absent, G4 zero-drift regression, and 'default'/undefined sentinel cases Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-003] A2A delegation propagation test — child resolves {{$conversation.id}} to parent's conversationId Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-004] Negative-path tests — {{$conversation.id}} does not resolve at non-prompt render sites (D6) Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-005] Recognize {{\$conversation.id}} in Monaco prompt editor lint + autocomplete Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-006] Preserve \$-prefixed template variables through pull-v4 round-trip Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-007] Customer-support cookbook demonstrates {{\$conversation.id}} in Zendesk write-back Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * [US-008] Docs: central template-variable reference page Adds public/agents/agents-docs/content/typescript-sdk/prompt-template-variables.mdx documenting all four prompt template variables: contextVariable.*, headers.*, $env.*, and $conversation.id. Registers the new page in typescript-sdk/meta.json navigation. Per SPEC §6 (Must) and §13 Deployment considerations — first centralization of this vocabulary. Promotes $env.* (currently undocumented) alongside the new $conversation.id variable so D3's "mirrors $env.* convention" rationale is grounded in published docs rather than invisible convention. Spec: public/agents/specs/2026-04-16-agent-conversation-self-reference/SPEC.md Co-Authored-By: Claude Opus 4.6 <[email protected]> * [US-009] Changesets: minor bumps for {{\$conversation.id}} Adds four changesets per SPEC §13 / §16 SCOPE: - agents-core: new TemplateEngine.renderPrompt() + PromptRenderOptions - agents-api: resolves {{\$conversation.id}} in agent prompts; A2A propagation - agents-cli: preserves \$-prefixed template variables through pull-v4 round-trip - agents-manage-ui: Monaco prompt editor lint + autocomplete recognition All four bump minor. Changeset config version-locks @inkeep/* packages together via "fixed" rule. agents-docs and cookbook-templates are in the ignore list; no changeset needed for them. Spec: public/agents/specs/2026-04-16-agent-conversation-self-reference/SPEC.md Co-Authored-By: Claude Opus 4.6 <[email protected]> * docs: trim prompt-template-variables.mdx to customer-facing variables only Drops the {{\$env.*}} section from the new reference page and shortens the overall page from 150 → ~65 lines. Page now covers three variables: contextVariable.*, headers.*, and \$conversation.*, linking out to the existing context-fetchers.mdx and headers.mdx pages for the first two and fully documenting {{\$conversation.id}} inline. Rationale — \$env.* has no traceable spec, PRD, or introducing PR: it shipped in the root commit of inkeep/agents (c39fdd0, 2025-09-05, initial squash) as pre-public scaffolding and was preserved without explanation by inkeep#818 (2025-10-24) when the other builtins were removed. Zero cookbook templates use it; zero customer-facing docs ever mentioned it. The draft docs page would have been the first act of promoting it to a customer feature — without the security/privacy review that warrants. Rendered env values are visible to the LLM and can flow through to tool arguments, assistant output, and traces. Keep the undocumented-but-functional status quo; address \$env.* product surface in a separate spec if warranted. Spec updates: - §6 Functional — "Must: docs page" rewritten: three variables, justification restated as single discovery surface + primary home for {{\$conversation.id}} - §9 Proposed solution — Docs bullet rewritten to match - §10 Decision log — adds D8 (LOCKED): intentional omission of \$env.* with full git-archaeology rationale - §16 SCOPE — agents-docs entry names exact file path and meta.json nav step - meta/_changelog.md — appends post-approval pivot entry explaining the trigger (PR inkeep#141 review), evidence, decision, and surgical edits D3 remains LOCKED on its own merits — dollar-prefix still prevents syntactic collision with user-defined contextVariable names regardless of whether \$env.* is a documented convention. Co-Authored-By: Claude Opus 4.6 <[email protected]> --------- Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]> GitOrigin-RevId: 52d0831b4f6e0c7790597c979f1711a5b1a4cd9c * Version Packages (agents) (inkeep#138) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> GitOrigin-RevId: 967594b0c065f561e566bc905512e45385cf7080 * fix: approve sync PRs from source side, drop target-repo auto-approve (inkeep#171) * fix: approve sync PRs from source side, drop target-repo auto-approve Moves copybara sync PR approval into public-mirror-sync.yml on agents-private using the INTERNAL_CI_APP token. Deletes the target-repo auto-approve-sync.yml workflow it replaces. Background PR inkeep#150 introduced auto-approve-sync.yml on the public tree to satisfy the required-approval rule that blocks copybara sync PRs on inkeep/agents. That design has a bootstrap problem: pull_request_target evaluates workflows from the target branch, so the very PR that first mirrors auto-approve-sync.yml to inkeep/agents/main cannot auto-approve itself. inkeep/agents PR inkeep#3157 is stuck in exactly that state, and because copybara/sync is blocked by that stuck PR, subsequent mirror runs on agents-private are failing at "Run agents mirror". Fix Approve from the source side instead. After public-mirror-sync.yml creates or updates the sync PR and enables auto-merge, a new step mints a second app token from INTERNAL_CI_APP (a different identity from inkeep-oss-sync, since GitHub blocks self-approval) and posts an approving review. The approval is scoped to the current head SHA and re-runs on every sync, which satisfies require_last_push_approval=true. - .github/workflows/public-mirror-sync.yml Adds Generate approver token step and Approve sync PR step. Skips re-approval when an approval already stands on the current head SHA. - public/agents/.github/workflows/auto-approve-sync.yml Deleted; replaced by the source-side step above. Prereq to verify once: INTERNAL_CI_APP is installed on inkeep/agents with Pull requests: Read and write. The app already powers Version Packages and release-handler paths that write to inkeep/agents, so the install is almost certainly already in place. * docs: update CI.md and runbook to reflect source-side sync approval Review feedback on inkeep#171: CI.md and CI_RUNBOOK.md still pointed at the deleted public/agents/.github/workflows/auto-approve-sync.yml. - CI.md: drop the row for auto-approve-sync.yml in the public workflows table; in the public/private parity table, replace the old row with one explaining approval now comes from public-mirror-sync.yml on the source side; update the OSS_SYNC_APP secrets row to drop "auto-approve sync" and add Copybara sync PR approval to INTERNAL_CI_APP usage. - CI_RUNBOOK.md: rewrite the "Sync PR on public repo is stuck open" entry to point operators at the Approve sync PR step and INTERNAL_CI_APP app installation instead of the deleted workflow. GitOrigin-RevId: 51edfa6a3a1e0863ba99d8498efad88c5dec1f06 --------- Co-authored-by: Varun Varahabhotla <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: omar-inkeep <[email protected]> Co-authored-by: tim-inkeep <[email protected]> Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]> Co-authored-by: inkeep-internal-ci[bot] <259778081+inkeep-internal-ci[bot]@users.noreply.github.com>
No description provided.