Skip to content

fix loading .svg files as React components in tests #2472

Merged
dimaMachina merged 43 commits intomainfrom
feat/dockerize-visual-tests-2
Apr 2, 2026
Merged

fix loading .svg files as React components in tests #2472
dimaMachina merged 43 commits intomainfrom
feat/dockerize-visual-tests-2

Conversation

@dimaMachina
Copy link
Copy Markdown
Collaborator

No description provided.

vnv-varun and others added 16 commits February 27, 2026 13:46
…istency

Run Playwright browser inside a Docker container so visual screenshot
tests produce identical results on macOS (local dev) and Linux (CI).

- Add docker-compose.visual.yml with Playwright server container
- Update vitest config to connect to Docker browser via websocket
  when PW_TEST_CONNECT_WS_ENDPOINT env var is set
- Add test:visual and test:visual:update npm scripts
- Update CI workflow to use Docker Playwright server instead of
  bare Playwright install
- Regenerate screenshot baselines from Linux container

Closes PRD-6191

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add explicit failure handling if Playwright server doesn't start
- Bind Docker port to 127.0.0.1 only (don't expose to network)
- Align npx playwright version with Docker image (both 1.58.0)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The @vitest/browser-playwright package requires a local Playwright
install to initialize, even when the actual browser runs in Docker
via connectOptions. Keep the install step alongside the Docker server.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
… mode

Turbo v2 strict mode filters env vars not listed in turbo.json from
child processes. The Playwright WebSocket endpoint was being silently
dropped, causing vitest to fall back to local Chromium instead of the
Docker server — producing mismatched screenshots in CI.

Also pins docker-compose.visual.yml to linux/amd64 so local baselines
match CI regardless of host architecture (see
microsoft/playwright#13873), and fixes the
Playwright cache restore-keys prefix mismatch.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ests

The previous refactor removed the onUnhandledError handler, causing CI
to fail with exit code 1 due to 3 known, unfixable Monaco Editor errors
in Vitest browser mode:

1. "Cannot use import statement outside a module" - Monaco web workers
   cannot load ESM in the Vitest browser sandbox
2. "InvalidCharacterError" / "is not a valid name" - Monaco attempts
   createElement with an SVG data URI as the tag name
3. "Closing rpc while" - Vitest worker RPC shutdown race condition

These errors were originally suppressed by Nick in #2046 and #2078 after
investigation confirmed they are unfixable Monaco/Vitest internals that
do not affect test correctness.

Refs: #2046, #2078
The pixelmatch package is no longer imported after the vitest.config.ts
refactor removed the custom tolerantPixelmatch comparator. Knip
correctly flags it as unused.
@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 2, 2026 11:44am
agents-manage-ui Ready Ready Preview, Comment Apr 2, 2026 11:44am
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
agents-docs Skipped Skipped Apr 2, 2026 11:44am

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Mar 2, 2026

🦋 Changeset detected

Latest commit: 465cbb8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name Type
@inkeep/agents-manage-ui Patch
@inkeep/agents-api Patch
@inkeep/agents-cli Patch
@inkeep/agents-core Patch
@inkeep/agents-email Patch
@inkeep/agents-mcp Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 2, 2026

Ito Test Report ✅

17 test cases ran. 17 passed.

This verification run tested the agents-manage-ui visual test infrastructure changes from PR #2472, focusing on Dockerized visual regression tests with the vite-plugin-svgr migration. All verified functionality passed successfully: Docker image builds with the updated Dockerfile and agents-core dependency, SVG icons render correctly using the new ?react suffix via vite-plugin-svgr, Monaco worker mocks initialize properly in the browser test environment, non-browser tests remain unaffected, and the error handling changes (onUnhandledError removal) did not cause regressions. The infrastructure correctly detects missing Docker Playwright servers with clear error messages and handles font loading failures gracefully.

✅ Passed (17)
Test Case Summary Timestamp Screenshot
ROUTE-1 Docker build with Dockerfile.manage-ui-test completed successfully. All COPY steps including packages/agents-core/package.json succeeded. pnpm install --frozen-lockfile resolved 1950 packages. Image tagged as agents-ui-test:local (1.26GB). 2:23 ROUTE-1_2-23.png
LOGIC-1 All 9 SVG exports in icons/index.ts use ?react suffix. env.d.ts declares module *.svg?react with FC<SVGProps>. No remaining ?svgr references found. 10:56 LOGIC-1_10-56.png
LOGIC-2 vi.mock at lines 24-36 sets globalThis.MonacoEnvironment with getWorker function. EditorWorker and JsonWorker imported via Vite ?worker syntax. Switch/case on label: 'json' returns new JsonWorker(), default returns new EditorWorker(). 11:16 LOGIC-2_11-16.png
LOGIC-3 All 21 non-browser test files (214 tests) pass successfully. vite-plugin-svgr plugin addition and onUnhandledError removal caused no regressions in JSDOM-based tests. 12:33 LOGIC-3_12-33.png
LOGIC-4 No act() warnings found in either test run. React state cleanup via act() wrapper is working correctly with no timing-related warnings or errors. 8:25 LOGIC-4_8-25.png
EDGE-1 Docker container visual tests confirm SVG rendering parity: nodes test passed with screenshots matching @svgr/webpack baselines. Form error state test also passed. vite-plugin-svgr produces identical rendering to @svgr/webpack baseline screenshots. 14:47 EDGE-1_14-47.png
EDGE-2 Searched test output for 'Closing rpc while', 'Cannot use import statement outside a module', 'is not a valid name', 'InvalidCharacterError' - none found. The onUnhandledError removal did not cause previously-suppressed errors to resurface as fatal issues. 8:24 EDGE-2_8-24.png
EDGE-3 Dockerfile.manage-ui-test line 26 copies packages/agents-core/package.json. Build step 7/9 completed without errors. The agents-core package.json exists with valid JSON (@inkeep/agents-core v0.53.12). 2:26 EDGE-3_2-26.png
EDGE-4 Verified CMD pattern .browser.test. in Dockerfile matches only 2 intended files (form.browser.test.tsx and nodes.browser.test.tsx). No unintended .browser.test.ts, .browser.test.js, or .browser.test.jsx files exist. 15:50 EDGE-4_15-50.png
EDGE-5 Mock getWorker in form.browser.test.tsx uses switch/case: 'json' returns JsonWorker, default returns EditorWorker. This correctly mirrors production setup-monaco-workers.ts fallback pattern. 11:30 EDGE-5_11-30.png
EDGE-6 All 3 screenshot baselines are present and non-empty in agents-manage-ui/src/screenshots/: Form error-state 37KB, nested-error-state 11KB, nodes-long-names 74KB. Each browser test has a corresponding baseline image. 5:10 EDGE-6_5-10.png
EDGE-7 Second test run produced identical results to first run - same behavior with exact same pixel differences. No flakiness detected. Failures are deterministic screenshot mismatches, not timing-related act() issues. 9:07 EDGE-7_9-07.png
ADV-1 Verified that running visual tests with PW_TEST_CONNECT_WS_ENDPOINT=ws://127.0.0.1:3100 when no Playwright server is running produces a clear ECONNREFUSED error. No silent fallback to local Chromium occurred. 17:09 ADV-1_17-09.png
ADV-2 Test environment is x86_64 (verified via uname -m and docker info). Docker compose pins platform: linux/amd64 as expected. Screenshot mismatches are present even on x86_64 due to font rendering environment differences. 9:24 ADV-2_9-24.png
ADV-3 Blocked raw.githubusercontent.com via /etc/hosts to prevent TestFont loading. Ran visual browser tests with blocked font. All tests completed in ~7s with screenshot mismatch failures but did NOT hang indefinitely. The 15s screenshot timeout works correctly. 19:44 ADV-3_19-44.png
ADV-4 Monaco worker mock logs show both json and editorWorkerService workers initialized correctly in both test runs without conflicts. No duplicate worker errors, no global state issues. Monaco editor element found in DOM. 9:44 ADV-4_9-44.png
ADV-5 pnpm-lock.yaml contains [email protected] with resolution and integrity hash. pnpm install --frozen-lockfile succeeded in Docker build (1950 packages), confirming lockfile is consistent with package.json dependencies. 2:27 ADV-5_2-27.png
📋 View Recording

Screen Recording

@github-actions github-actions Bot deleted the branch main March 17, 2026 00:35
@github-actions github-actions Bot closed this Mar 17, 2026
@dimaMachina dimaMachina reopened this Mar 17, 2026
Base automatically changed from feat/dockerize-visual-tests to main April 1, 2026 16:18
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(1) Total Issues | Risk: Medium

Delta Review Scope

This is a re-review scoped to changes since the previous automated review at commit 834084b3a.

The delta consists of 1 commit (76a5ec2f2 — "polish"):

File Change Assessment
AGENTS.md Changed code fence from ``` to ```dotenv ✅ Correct — syntax highlighting improvement
docker-compose.visual.yml Typo fix: etc).etc.). ✅ Correct — punctuation fix
docker-compose.visual.yml Changed platform: linux/amd64platform: linux Regression — breaks cross-architecture screenshot consistency

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

  • 🟠 Major: docker-compose.visual.yml:11 Platform change breaks cross-architecture screenshot consistency

Analysis

The change from platform: linux/amd64 to platform: linux is a regression that undermines the core purpose of this visual testing infrastructure. The prior review feedback specifically requested restoring the documentation comment explaining why linux/amd64 is required — this delta instead removes the architecture pin entirely.

Impact:

  • On Apple Silicon Macs, platform: linux will use linux/arm64
  • ARM Chromium produces different subpixel rendering than x86 Chromium
  • Local screenshots will no longer match CI screenshots
  • The AGENTS.md documentation explicitly states Docker is required to ensure screenshots match CI

This contradicts the Ito test report's ADV-2 finding: "Screenshot mismatches are present even on x86_64 due to font rendering environment differences" — if there are already differences on the same architecture due to fonts, removing the architecture pin will make this significantly worse.


🚫 REQUEST CHANGES

Summary: The platform: linux/amd64platform: linux change in docker-compose.visual.yml is a regression that breaks the cross-platform screenshot consistency this PR was designed to achieve. Please restore platform: linux/amd64 along with the explanatory comment (see 1-click suggestion in the inline comment). The AGENTS.md change (adding dotenv language identifier) is fine.

Discarded (1)
Location Issue Reason Discarded
AGENTS.md:265 Code fence language change Valid improvement — dotenv syntax highlighting is more accurate than generic
Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
Total 0 0 0 0 1 0 1

Note: Delta review — no sub-reviewers dispatched as the delta was limited to 2 files with a single substantive change.

Comment thread docker-compose.visual.yml
# ARM and x86 Chromium produce different subpixel rendering — see
# https://github.com/microsoft/playwright/issues/13873
platform: linux/amd64
platform: linux
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Platform change breaks cross-architecture screenshot consistency

Issue: Changing from platform: linux/amd64 to platform: linux removes the architecture pin that ensures screenshot consistency between local development and CI.

Why: ARM and x86 Chromium produce different subpixel rendering. On Apple Silicon Macs, platform: linux will pull the ARM image, causing local screenshots to differ from CI screenshots (which run on x86 Linux). This defeats the entire purpose of running the browser in a Docker container — the AGENTS.md documentation even states: "Docker is required — without it, tests fall back to local Chromium and screenshots won't match CI."

Fix: (1-click apply)

Suggested change
platform: linux
# Pin to amd64 so screenshots match CI (ubuntu-x86) regardless of host arch.
# ARM and x86 Chromium produce different subpixel rendering — see
# https://github.com/microsoft/playwright/issues/13873
platform: linux/amd64

Refs:

@github-actions github-actions Bot deleted a comment from claude Bot Apr 2, 2026
@vercel vercel Bot temporarily deployed to Preview – agents-docs April 2, 2026 10:55 Inactive
@socket-security
Copy link
Copy Markdown

socket-security Bot commented Apr 2, 2026

All alerts resolved. Learn more about Socket for GitHub.

This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.

View full report

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Medium

Delta Review Scope

This is a re-review scoped to changes since the previous automated review at commit 76a5ec2f2.

The delta consists of 1 commit (645abc052 — "fix"):

File Change Assessment
__mocks__/monaco-editor.ts:25 Reverted from 'monaco-editor/esm/vs/editor/editor.main.js' back to 'monaco-editor' ✅ Correct — reverting the specific ESM path that may have caused compatibility issues; simpler import works better with the test bundler
package.json:46 Added --project '@inkeep/agents-manage-ui' filter and chained && pnpm test:visual ✅ Correct — CI already starts the Playwright Docker server and sets PW_TEST_CONNECT_WS_ENDPOINT, so visual tests will run correctly

Assessment: Both delta changes are valid fixes. No new issues introduced.

🕐 Pending Recommendations (1)

Prior review feedback that remains unaddressed (unchanged in this delta, but still blocking):

  • 🟠 docker-compose.visual.yml:11 Platform change breaks cross-architecture screenshot consistency — The platform: linux/amd64 was changed to platform: linux in commit 76a5ec2f2. On Apple Silicon Macs, this causes the Playwright container to use ARM instead of x86, producing different subpixel rendering. Local screenshots will no longer match CI screenshots. This was flagged as a MAJOR issue in the previous review and remains unresolved.

🚫 REQUEST CHANGES

Summary: The delta changes in this commit are fine — reverting the Monaco mock to a simpler import and chaining visual tests into the main test script are both valid fixes. However, the blocking issue from the previous review remains unresolved: the platform: linux/amd64platform: linux change in docker-compose.visual.yml will cause screenshot mismatches between local development (ARM on Apple Silicon) and CI (x86 Linux). Please restore platform: linux/amd64 with the explanatory comment as suggested in the inline comment.

Discarded (0)

No discarded findings.

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
Total 0 0 0 0 0 1 0

Note: Delta review — no sub-reviewers dispatched as the delta was limited to 2 files with minor fixes.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Medium

Delta Review Scope

This is a re-review scoped to changes since the previous automated review at commit 645abc052f39.

The delta consists of 3 commits:

Commit Change Assessment
e841d144a Merge branch 'main' into feat/dockerize-visual-tests-2 ✅ Merge commit — brings in main branch updates
c900173e8 run pnpm dedupe (#2953) ✅ Lockfile optimization from main
1b339c076 pnpm i ✅ Lockfile update

Delta Changes Analyzed

File Change Assessment
agents-docs/package.json React version: canary → ^19.2.4 ✅ Stabilization from canary to release
agents-manage-ui/package.json React version: canary → ^19.2.4 ✅ Stabilization from canary to release
packages/agents-email/package.json React peerDep/devDep version normalization ✅ Cleanup
pnpm-lock.yaml Deduplication and version updates ✅ Expected lockfile changes

Assessment: All delta changes are benign — merge commits, lockfile deduplication, and React version stabilization from canary to stable release. No new code changes requiring review.

🕐 Pending Recommendations (1)

Prior review feedback that remains unaddressed and is still blocking:

  • 🟠 docker-compose.visual.yml:11 Platform change breaks cross-architecture screenshot consistency — The platform: linux/amd64 was changed to platform: linux in commit 76a5ec2f2. On Apple Silicon Macs, this causes the Playwright container to use ARM instead of x86, producing different subpixel rendering. Local screenshots will no longer match CI screenshots (which run on x86 Linux). This was flagged as a MAJOR issue in the previous review and remains unresolved.

The fix is straightforward — restore the architecture pin with the explanatory comment:

  playwright:
    # Pin to amd64 so screenshots match CI (ubuntu-x86) regardless of host arch.
    # ARM and x86 Chromium produce different subpixel rendering — see
    # https://github.com/microsoft/playwright/issues/13873
    platform: linux/amd64

🚫 REQUEST CHANGES

Summary: The delta since last review contains only merge commits from main and lockfile updates — no issues in the new code. However, the blocking issue from the previous review remains unresolved: the platform: linux/amd64platform: linux change in docker-compose.visual.yml will cause screenshot mismatches between local development (ARM on Apple Silicon) and CI (x86 Linux). Please restore platform: linux/amd64 with the explanatory comment as suggested in the inline comment. This is a 1-line fix that preserves the core purpose of this visual testing infrastructure.

Discarded (0)

No discarded findings.

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
Total 0 0 0 0 0 1 0

Note: Delta review — no sub-reviewers dispatched as the delta contained only merge commits and lockfile changes with no new code to review.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Medium

Delta Review Scope

This is a re-review scoped to changes since the previous automated review at commit 1b339c076.

The delta consists of 1 commit (465cbb8b9 — "upd"):

File Change Assessment
diff-viewer.test.tsx Added cleanup import and afterEach(cleanup) ✅ Correct — explicit cleanup prevents test pollution
nodes.browser.test.tsx Added cleanup import, afterEach(cleanup), reduced timeout 30s→10s ✅ Correct — cleanup and faster timeouts after worker mocking stabilized
form.browser.test.tsx Added cleanup import/call, reduced timeouts to 10s ✅ Correct — consistent cleanup pattern
generic-prompt-editor.test.tsx Added cleanup import and afterEach(cleanup) ✅ Correct — consistent cleanup pattern
Screenshot baselines (2 files) Deleted ✅ Expected — baselines regenerated with Docker/svgr changes

Assessment: All delta changes are valid test hygiene improvements — adding explicit cleanup() calls ensures component unmounting between tests, and reduced timeouts reflect improved test stability now that Monaco worker mocking is properly configured. The deleted screenshot baselines will be regenerated on next visual test run.

🕐 Pending Recommendations (1)

Prior review feedback that remains unaddressed and is still blocking:

  • 🟠 docker-compose.visual.yml:11 Platform change breaks cross-architecture screenshot consistency — The platform: linux/amd64 was changed to platform: linux in commit 76a5ec2f2. On Apple Silicon Macs, this causes the Playwright container to use ARM instead of x86, producing different subpixel rendering. Local screenshots will no longer match CI screenshots (which run on x86 Linux). This was flagged as a MAJOR issue in the previous review and remains unresolved.

The fix is straightforward — restore the architecture pin with the explanatory comment:

  playwright:
    # Pin to amd64 so screenshots match CI (ubuntu-x86) regardless of host arch.
    # ARM and x86 Chromium produce different subpixel rendering — see
    # https://github.com/microsoft/playwright/issues/13873
    platform: linux/amd64

🚫 REQUEST CHANGES

Summary: The delta since last review contains valid test hygiene improvements — adding explicit cleanup() calls, reducing timeouts, and removing old screenshot baselines. However, the blocking issue from prior reviews remains unresolved: the platform: linux/amd64platform: linux change in docker-compose.visual.yml will cause screenshot mismatches between local development (ARM on Apple Silicon) and CI (x86 Linux). Please restore platform: linux/amd64 with the explanatory comment as suggested in the inline comment. This is a 1-line fix that preserves the core purpose of this visual testing infrastructure.

Discarded (0)

No discarded findings.

Reviewers (0)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
Total 0 0 0 0 0 1 0

Note: Delta review — no sub-reviewers dispatched as the delta contained only test cleanup changes (adding explicit cleanup() calls, reducing timeouts, deleting old baselines) with no new logic to review.

@github-actions github-actions Bot deleted a comment from claude Bot Apr 2, 2026
@dimaMachina dimaMachina added this pull request to the merge queue Apr 2, 2026
Merged via the queue into main with commit f0081f7 Apr 2, 2026
26 checks passed
@dimaMachina dimaMachina deleted the feat/dockerize-visual-tests-2 branch April 2, 2026 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants