Skip to content

updated slighlty#2556

Merged
tim-inkeep merged 4 commits intomainfrom
bugfix/github-workflow-prs-minor
Mar 6, 2026
Merged

updated slighlty#2556
tim-inkeep merged 4 commits intomainfrom
bugfix/github-workflow-prs-minor

Conversation

@tim-inkeep
Copy link
Copy Markdown
Contributor

No description provided.

@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments
Project Deployment Actions Updated (UTC)
agents-api Ignored Ignored Preview Mar 6, 2026 0:57am
agents-docs Ignored Ignored Preview Mar 6, 2026 0:57am
agents-manage-ui Ignored Ignored Preview Mar 6, 2026 0:57am

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Mar 5, 2026

⚠️ No Changeset found

Latest commit: 43d3d58

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog Bot commented Mar 5, 2026

Hardens the model-sync workflow's gateway fetch logic and fixes a stale description.

  • .github/scripts/detect-model-changes.ts: Add explicit AbortError handling in fetchWithTimeout to surface timeout errors with a clear message; validate that json.data is an array before processing instead of silently falling back to []
  • .github/workflows/model-sync.yml: Correct the days input description to match the actual default (2, not 90); add a clarifying comment that AI_GATEWAY_API_KEY is optional

Pullfrog  | View workflow run | Using Claude Code | Triggered by Pullfrogpullfrog.com𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, low-risk improvements. The AbortError catch in fetchWithTimeout correctly re-throws non-abort errors and preserves the finally cleanup. The Array.isArray guard in fetchGatewayModels makes the ?? [] fallback redundant — good to remove it. Workflow description fix aligns with the actual default.

Pullfrog  | View workflow runpullfrog.com𝕏

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

This PR makes well-scoped improvements to internal CI/CD infrastructure:

Changes Reviewed

.github/scripts/detect-model-changes.ts

  1. Improved timeout error handling (lines 46-50): Converts generic AbortError to a descriptive message including the URL and timeout duration — makes debugging easier when the Vercel AI Gateway is slow/unreachable.

  2. Response format validation (lines 67-69): Validates json.data is an array before processing. This replaces the previous json.data ?? [] pattern that would silently treat malformed responses as "no models found." Now fails fast with a clear error.

  3. Code cleanup (line 72): Removed redundant nullish coalescing since validation now guarantees the array exists.

.github/workflows/model-sync.yml

  1. Documentation fix (line 9): Corrected description from "default: 90" to "default: 2" to match actual default.

  2. Clarifying comment (line 49): Added note that AI_GATEWAY_API_KEY is optional.

Security & Best Practices ✓

  • All GitHub Actions pinned to full SHAs
  • Appropriate permissions (contents: write, pull-requests: write)
  • Secrets properly passed via env:, not echoed
  • No supply chain or secret exposure concerns

✅ APPROVE

Summary: Clean improvements to error handling and documentation in internal CI tooling. The changes convert potential silent failure modes into explicit, actionable errors. No issues found by any reviewer. 🎉

Reviewers (3)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-standards 0 0 0 0 0 0 0
pr-review-devops 0 0 0 0 0 0 0
pr-review-errors 0 0 0 0 0 0 0
Total 0 0 0 0 0 0 0

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review (1 commit since last review)

New change: Added filtering guidance to the Claude Code prompt (line 191):

- Only add models consistent with the existing list — skip specialized variants (realtime, audio, embedding, search-augmented, instruct, fine-tuning base models)

This is a sensible improvement to the model-sync workflow's instructions. It explicitly tells the Claude Code agent to exclude specialized model variants that shouldn't appear in the UI dropdowns — preventing clutter from non-chat models being added to the model selection lists.

The guidance aligns with the existing behavior in model-options.tsx where only general-purpose chat models are listed, and makes the prompt instructions more explicit to reduce false positives.


✅ APPROVE

Summary: Clean prompt improvement. The delta adds explicit filtering guidance to prevent specialized model variants from being added to the codebase — a sensible clarification that makes the automated model sync more accurate. Approval from prior review stands. 🚀

@github-actions github-actions Bot deleted a comment from claude Bot Mar 5, 2026
@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 6, 2026

Ito Test Report ❌

14 test cases ran. 13 passed, 1 failed.

✅ Most verified behaviors passed, including DAYS parsing, timeout/error handling, malformed-payload validation, duplicate-check fail-safe behavior, and authorization-boundary checks. 🔍 One confirmed application bug remains: untrusted model IDs are inserted into the generated automation prompt without sanitization, allowing markdown/control-sequence prompt framing injection.

✅ Passed (13)
Test Case Summary Timestamp Screenshot
ROUTE-1 Direct workflow dispatch was blocked (HTTP 403), but local fallback confirmed default lookback behavior of last 2 days. 3:22 ROUTE-1_3-22.png
ROUTE-2 Direct workflow dispatch was blocked (HTTP 403), but local fallback confirmed custom DAYS=7 behavior. 3:23 ROUTE-2_3-23.png
LOGIC-1 Timeout failures are rewritten to an explicit endpoint-specific timeout message with 30000ms. 16:01 LOGIC-1_16-01.png
LOGIC-2 Non-timeout fetch failures preserve the original network/DNS error instead of being masked. 16:01 LOGIC-2_16-01.png
LOGIC-3 Invalid gateway payload shape fails with explicit "data is not an array" error. 16:01 LOGIC-3_16-01.png
LOGIC-4 Generated prompt contains explicit instruction to exclude specialized variants. 16:01 LOGIC-4_16-01.png
EDGE-1 DAYS=0 correctly falls back to default 2-day lookback without crashing. 18:58 EDGE-1_18-58.png
EDGE-2 Non-numeric DAYS input correctly falls back to default 2-day lookback. 18:58 EDGE-2_18-58.png
EDGE-4 Malformed model IDs without valid provider/id format are filtered out. 16:01 EDGE-4_16-01.png
EDGE-5 Models with invalid released timestamps are excluded from candidates. 16:01 EDGE-5_16-01.png
ADV-2 Abusive DAYS values were handled deterministically with no runtime crashes. 18:59 ADV-2_18-59.png
ADV-4 Duplicate-check API failures trigger warning and safe short-circuit behavior. 16:01 ADV-4_16-01.png
ADV-5 Low-privilege/unauthenticated context correctly cannot dispatch workflows or access full logs. 23:45 ADV-5_23-45.png
❌ Failed (1)
Test Case Summary Timestamp Screenshot
ADV-3 Malicious model IDs are interpolated into multiline prompt content without escaping, breaking prompt framing integrity. 16:01 ADV-3_16-01.png
Prompt/output injection via model ID is neutralized in workflow artifacts – Failed
  • Where: Model Sync detection script prompt generation (detect-model-changes.ts) before handing prompt text to downstream automation.

  • Steps to reproduce: Return a gateway model ID containing markdown/newlines/control text; run detect script; inspect emitted prompt output.

  • What failed: The raw model ID is inserted directly into the prompt body and summary, so injected content breaks list/code-block framing instead of being safely escaped.

  • Code analysis: Reviewed .github/scripts/detect-model-changes.ts filtering and prompt construction plus workflow wiring in .github/workflows/model-sync.yml. The code validates provider/type/release but never sanitizes model IDs before concatenating into multiline markdown prompt sections and fenced-block content.

  • Relevant code:

    .github/scripts/detect-model-changes.ts (lines 72-87)

    return json.data
      .filter((m) => {
        const slashIndex = m.id.indexOf('/');
        if (slashIndex === -1) return false;
        if (!TRACKED_PROVIDERS.has(m.id.slice(0, slashIndex))) return false;
        if (m.type !== 'language') return false;
        if (typeof m.released !== 'number' || m.released < cutoff) return false;
        return true;
      })
      .map((m) => {
        const slashIndex = m.id.indexOf('/');
        const provider = m.id.slice(0, slashIndex);
        const rawId = m.id.slice(slashIndex + 1);
        const id = provider === 'anthropic' ? normalizeAnthropicId(rawId) : rawId;
        return { provider, id };
      });

    .github/scripts/detect-model-changes.ts (lines 173-181)

    const modelList = newModels.map((m) => `- ${m.provider}/${m.id}`).join('\n');
    const modelSummary = newModels.map((m) => `${m.provider}/${m.id}`).join(', ');
    const today = new Date().toISOString().split('T')[0];
    
    const prompt = `New AI models have been detected from Vercel AI Gateway that are not yet in our static model list.
    
    New models to add:
    ${modelList}
    
    Please update the following 3 files to include these models:

    .github/workflows/model-sync.yml (lines 52-59)

    - name: Update model list via Claude Code
      if: steps.detect.outputs.has_changes == 'true'
      uses: anthropics/claude-code-action@01e756b34ef7a1447e9508f674143b07d20c2631 # v1
      with:
        anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
        github_token: ${{ steps.app-token.outputs.token }}
        claude_args: '--model opus'
        prompt: ${{ steps.detect.outputs.prompt }}
  • Why this is likely a bug: Untrusted external model identifiers are embedded into structured markdown instructions without escaping, allowing malformed or injected IDs to alter prompt structure and downstream automation intent.

  • Introduced by this PR: No – pre-existing bug (code not changed in this PR).

  • Timestamp: 16:01

📋 View Recording

Screen Recording

@tim-inkeep tim-inkeep added this pull request to the merge queue Mar 6, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review (since commit 5e0249344745)

Changes since last review:

  1. show_full_output: 'true' (line 60 in model-sync.yml) — Enables full output display in claude-code-action logs for better debugging visibility. No behavioral impact on the model sync itself.

  2. Merge from main — Incorporates changes from PR #2521 (OpenAPI auto-snapshot CI). These changes were already reviewed and approved in that PR.

Assessment

The delta is minimal and low-risk:

  • Single configuration flag for debugging/observability
  • No code logic changes
  • No security implications
  • No architectural decisions
  • Merged changes from main were separately reviewed

Prior reviews (2x ✅ APPROVE) covered the substantive changes in this PR:

  • AbortError handling improvements
  • Array.isArray validation
  • Documentation fixes
  • Prompt filtering guidance

✅ APPROVE

Summary: Trivial delta — one debugging flag + merge from main. All substantive changes were already approved in prior reviews. Ship it! 🚀

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
None dispatched

Note: Delta was too minimal to warrant subagent dispatch (1 line config change + main merge).

@github-actions github-actions Bot deleted a comment from claude Bot Mar 6, 2026
Merged via the queue into main with commit afc665e Mar 6, 2026
10 of 11 checks passed
@tim-inkeep tim-inkeep deleted the bugfix/github-workflow-prs-minor branch March 6, 2026 01:02
@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 6, 2026

Ito Test Report ✅

20 test cases ran. 20 passed.

🔍 Verification reviewed workflow dispatch behavior, input hardening, idempotency safeguards, deterministic error handling, model filtering, and log-safety outcomes. All executed cases met expected behavior, and no code-backed production defect was identified.

✅ Passed (20)
Test Case Summary Timestamp Screenshot
ROUTE-1 Verified default days=2 from workflow_dispatch config and confirmed executed runs used DAYS=2 in detect logs. 0:00 ROUTE-1_0-00.png
ROUTE-2 Direct dispatch with days=7 was denied by token scope, and fallback evidence from a manual run confirmed custom input propagation with DAYS=90 in Detect model changes. 0:00 ROUTE-2_0-00.png
ROUTE-3 Confirmed no-change branch behavior from run metadata where Detect model changes succeeded and the Claude update step did not execute. 0:00 ROUTE-3_0-00.png
ROUTE-4 Verified run metadata showed the Claude update step executing after detected changes, and workflow logs included show_full_output=true. 0:00 ROUTE-4_0-00.png
EDGE-1 Local harness confirmed parse logic maps DAYS=abc to fallback value 2 without error. 1:11 EDGE-1_1-11.png
EDGE-2 Local harness verified DAYS=0 and DAYS=-5 both resolve to fallback value 2 and remain stable. 1:11 EDGE-2_1-11.png
ADV-1 Local harness confirmed injection-like payload is treated as input text and parsed as days=2 with no execution side effects. 1:11 ADV-1_1-11.png
ADV-6 Local harness validated oversized days value parsing and downstream cutoff arithmetic complete immediately and deterministically. 1:11 ADV-6_1-11.png
ADV-3 Code-level safeguards were verified: serialized workflow concurrency and duplicate-PR fail-safe checks protect rapid-submit behavior. 1:11 ADV-3_1-11.png
ADV-2 Rapid repeated trigger interaction produced a single actionable dialog state, indicating idempotent handling of repeated submit-like actions. 6:47 ADV-2_6-47.png
EDGE-6 Refresh and history navigation completed successfully and returned to a stable page state without fatal rendering or interaction failures. 6:47 EDGE-6_6-47.png
EDGE-5 At 390x844 viewport, key navigation controls remained usable and project creation flow could be launched, confirming mobile interaction viability. 6:47 EDGE-5_6-47.png
LOGIC-1 Forced abort path produced deterministic timeout message with 30000ms and non-zero failure semantics. 10:22 LOGIC-1_10-22.png
LOGIC-2 Malformed payload path failed loudly with explicit format error and deterministic failure output. 10:22 LOGIC-2_10-22.png
LOGIC-3 Filtering retained only allowlisted language models in provider/model format and dropped invalid fixture entries. 10:22 LOGIC-3_10-22.png
LOGIC-4 Generated prompt explicitly instructs skipping realtime/audio/embedding/search-augmented/instruct/fine-tuning variants. 10:22 LOGIC-4_10-22.png
ADV-4 Adversarial fixture containing spoofed providers was filtered prior to prompt generation, leaving only valid provider entries. 10:22 ADV-4_10-22.png
EDGE-4 Forced GitHub search failure produced warning and conservative duplicate-prevention behavior. 10:22 EDGE-4_10-22.png
EDGE-3 Unauthenticated gateway execution remained diagnosable with explicit output and captured response artifacts. 10:22 EDGE-3_10-22.png
ADV-5 Captured and reviewed full GitHub Actions logs for a has_changes=true run; Claude step output was visible and sensitive values were masked, with targeted secret-pattern scan returning no actionable credential matches. 12:30 ADV-5_12-30.png
📋 View Recording

Screen Recording

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant