Skip to content

fix: Windows support across fs/browse, agent spawn, env, and signals#123

Merged
aorumbayev merged 8 commits intomainfrom
fixes/windows-path-browse
Apr 23, 2026
Merged

fix: Windows support across fs/browse, agent spawn, env, and signals#123
aorumbayev merged 8 commits intomainfrom
fixes/windows-path-browse

Conversation

@aorumbayev
Copy link
Copy Markdown
Member

Summary

Five Windows-specific bugs fixed across the server, web client, and core spawn/runtime code. All pre-existing tests still pass (1295 non-snapshot); adds 4 Windows-gated tests (skipped on POSIX) and 142 new non-gated tests.

Commits

Commit Fix
a09e5ac fix(server): windows-safe /api/fs/browse. Drop Path.home() boundary; return parent/separator/roots/is_link so clients don't parse paths. Use FILE_ATTRIBUTE_HIDDEN on Windows; include junctions/symlinks. Mirror fix in plugin GitHub repo detection.
fc262e5 fix(web): windows-safe PathPicker. Consume server-provided fields instead of splitting on /. Home breadcrumb → ~ (server-expanded). Drive-root picker when multiple roots. Link2 badge for junctions.
64e1e4f fix(core): resolve .cmd/.bat/.ps1 shims via cmd.exe/powershell at spawn time. Windows CreateProcess doesn't honour PATHEXT and can't exec batch files directly — this is why claude, codex, copilot, code, cursor etc. fail to invoke despite shutil.which finding them. New helper kagan.core._subprocess.resolve_spawn_command; all agent/IDE/ACP spawn sites routed through it.
0004779 fix(runtime_env): platform-aware essential env for Windows. Adds SYSTEMROOT, TEMP, TMP, APPDATA, LOCALAPPDATA, USERPROFILE, PATHEXT, COMSPEC, WINDIR, HOMEDRIVE/HOMEPATH, PROGRAMFILES(X86), PROGRAMDATA, USERNAME, COMPUTERNAME, USERDOMAIN to the subprocess env on Windows. POSIX unchanged.
0deeb68 fix(core): terminate agent processes via Process handle, not os.kill(SIGKILL). signal.SIGKILL doesn't exist on Windows (AttributeError at timeout); os.kill(SIGTERM) is a no-op for detached processes. Now stores Process alongside the timer and calls proc.terminate()/proc.kill().

Why stacked

The three backend fixes interact on Windows:

  • Shim resolution lets us invoke claude.cmd.
  • Essential-env lets the spawned claude start (needs APPDATA/SYSTEMROOT).
  • Process-handle termination lets the timeout path work when an agent hangs.

Shipping any one without the others leaves a different symptom on Windows.

Test plan

  • uv run poe typecheck — 0 errors
  • uv run pytest tests/ -m "not snapshot" — 1295 passed, 4 skipped (Windows-gated)
  • packages/web: pnpm exec vitest run path-picker — 14/14; pnpm exec tsc --noEmit clean; pnpm run build clean
  • CI Windows matrix (test-windows job) — needs a real Windows runner to exercise the windows_ci-marked tests
  • Manual smoke: spawn a claude-code task from kagan web on Windows

🤖 Generated with Claude Code

aorumbayev and others added 5 commits April 23, 2026 12:51
- Drop Path.home() boundary (loopback + auth-gated; core add_repo already unrestricted).
- Return parent, separator, roots so clients don't parse paths.
- Include junctions and symlinks; use FILE_ATTRIBUTE_HIDDEN on Windows.
- Mirror fix in plugin GitHub repo detection.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
- Stop parsing paths on the client; use server-provided parent/separator/roots.
- Home breadcrumb navigates to "~" so server expands correctly on all platforms.
- Windows: drive-root picker when multiple roots; correct C:\ breadcrumb target.
- Add is_link flag to FsEntry; show junction/symlink badge.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Windows CreateProcess doesn't honour PATHEXT and can't execute .cmd/.bat
directly. Add resolve_spawn_command() and route all agent/IDE/ACP spawns
through it so Node-based backends (claude, codex, gemini, copilot, ...) and
IDE launchers (code.cmd, cursor.cmd) invoke correctly. No-op on POSIX.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Windows spawned processes need SYSTEMROOT, TEMP, APPDATA, etc.
to load DLLs and find config. Select the allowlist per platform so
POSIX behavior is unchanged while Windows gets the vars its binaries
actually require.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
signal.SIGKILL doesn't exist on Windows and os.kill(SIGTERM) is a no-op
for detached processes there. Store the Process handle alongside the
timeout timer and call proc.terminate()/kill() so the timeout path works
on both POSIX and Windows.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR fixes five distinct Windows compatibility bugs: .cmd/.bat/.ps1 shim resolution via cmd.exe/powershell, platform-aware essential env vars, process termination via proc.terminate()/proc.kill() (replacing os.kill(SIGKILL) which doesn't exist on Windows), a Windows-aware filesystem browser endpoint that returns parent/separator/roots, and a matching PathPicker client that stops splitting on /. The changes are well-scoped with good test coverage (142 new tests + 4 Windows-gated).

  • The Path.home() access-control boundary was removed from both the browse endpoint and the GitHub plugin route without a replacement, opening arbitrary path access on all platforms.
  • cmd.exe /c <script.cmd> <args…> leaves args unquoted against cmd.exe metacharacters — safe today with programmer-controlled args, but a forward injection risk if user input ever reaches those call sites.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style/forward-looking concerns that don't break current behaviour.

No P0 or P1 defects found. The three comments are: a forward-looking shell injection risk (only affects user-controlled args not currently present), a security boundary removal that is intentional and low-risk on a local server, and a minor behavioural change in PermissionError handling that is actually an improvement. Test coverage is thorough (142 new tests + 4 Windows-gated).

src/kagan/server/_plugin_routes.py (removed access-control check), src/kagan/core/_subprocess.py (cmd.exe /c quoting)

Security Review

  • Path traversal (_plugin_routes.py): The Path.home() boundary check for ?repo_path= was removed without replacement, allowing any filesystem path to reach the GitHub plugin route. Risk is limited to the local server trust model but worth hardening.
  • Shell injection (_subprocess.py): cmd.exe /c <script.cmd> <args> leaves args unquoted against cmd.exe metacharacters. Current callers use only programmer-controlled strings, so no active exploit exists, but the pattern is unsafe for any future user-controlled argument.
  • No secrets or credentials are introduced; the new Windows env allowlist (APPDATA, USERPROFILE, etc.) is intentional and does not expose new sensitive data.

Important Files Changed

Filename Overview
src/kagan/core/_subprocess.py New helper resolving .cmd/.bat/.ps1 shims via cmd.exe/powershell on Windows; logic is correct, but cmd.exe /c leaves args subject to shell metacharacter interpretation
src/kagan/core/_agent.py Replaces os.kill(SIGKILL) with proc.terminate()/proc.kill() via _AgentTimeout dataclass; correctly fixes Windows incompatibility and also cancels stale timers on unregister
src/kagan/server/_system_routes.py browse_filesystem now returns parent/separator/roots and uses FILE_ATTRIBUTE_HIDDEN; PermissionError during scandir now surfaces as 403 rather than returning an empty entries list
src/kagan/server/_plugin_routes.py Removes Path.home() boundary for repo_path without any replacement access-control check
src/kagan/runtime_env.py Adds Windows-specific essential env set (SYSTEMROOT, APPDATA, PATHEXT, etc.) selected at call time via _essential_env(); POSIX behaviour unchanged via backwards-compat alias
packages/web/src/components/shared/path-picker.tsx Consumes server-provided parent/separator/roots instead of splitting on '/'; buildBreadcrumbs correctly handles Windows drive roots and POSIX paths; home button navigates to '~'
src/kagan/server/responses.py Adds FsEntryResponse and FsBrowseResponse Pydantic models with all required fields; registered in RESPONSE_MODELS dict
packages/web/src/lib/api/types.ts FsEntry gains is_link; FsBrowseResponse gains parent, separator, roots — matches the new server response shape exactly
tests/unit/test_agent_kill.py Comprehensive tests for _kill_agent/_force_kill using a FakeProcess stub; covers terminate, already-dead, timer scheduling, unregister cancellation, and a real subprocess integration test
tests/core/unit/test_subprocess_resolve.py Good coverage of POSIX pass-through, Windows .cmd/.bat/.ps1/.exe wrapping, absolute paths, mixed-case suffixes, and which() fallback

Sequence Diagram

sequenceDiagram
    participant Client as Web Client (PathPicker)
    participant Server as /api/fs/browse
    participant OS as OS (scandir)

    Client->>Server: GET /api/fs/browse?path=~
    Server->>OS: _list_dir("~") in thread
    OS-->>Server: entries + parent + separator + roots
    Server-->>Client: {path, parent, separator, roots, entries}
    Note over Client: buildBreadcrumbs(path, separator, roots)

    participant Spawner as Agent Spawn
    participant Sub as resolve_spawn_command

    Spawner->>Sub: resolve_spawn_command("claude", "--flag")
    Sub->>Sub: shutil.which("claude") -> claude.cmd
    Sub-->>Spawner: ["cmd.exe", "/c", "claude.cmd", "--flag"]
    Spawner->>OS: asyncio.create_subprocess_exec(*resolved)

    participant Timer as Timeout Timer
    participant Proc as Agent Process

    Spawner->>Timer: loop.call_later(timeout, _kill_agent, proc)
    Timer->>Proc: proc.terminate()
    Timer->>Timer: loop.call_later(grace, _force_kill, proc)
    Timer->>Proc: proc.kill()
Loading

Comments Outside Diff (1)

  1. src/kagan/core/_subprocess.py, line 958-960 (link)

    P2 cmd.exe /c interprets shell metacharacters in args

    cmd.exe /c parses the rest of its command line as a shell string, so any arg containing &, |, >, <, ^, or % will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command.

    Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/kagan/core/_subprocess.py
    Line: 958-960
    
    Comment:
    **`cmd.exe /c` interprets shell metacharacters in args**
    
    `cmd.exe /c` parses the rest of its command line as a shell string, so any arg containing `&`, `|`, `>`, `<`, `^`, or `%` will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command.
    
    Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/kagan/core/_subprocess.py
Line: 958-960

Comment:
**`cmd.exe /c` interprets shell metacharacters in args**

`cmd.exe /c` parses the rest of its command line as a shell string, so any arg containing `&`, `|`, `>`, `<`, `^`, or `%` will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command.

Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/kagan/server/_plugin_routes.py
Line: 90-93

Comment:
**`repo_path` boundary check removed with no replacement**

The removed `is_relative_to(Path.home())` guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. `/etc`, `C:\Windows`) via `?repo_path=`. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the `Path.home()` check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/kagan/server/_system_routes.py
Line: 197-198

Comment:
**`PermissionError` during scan now surfaces as 403 (behavioural change)**

Previously a `PermissionError` raised by `os.scandir()` was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to `_error_response` as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The `except PermissionError as exc: raise PermissionError(str(exc)) from exc` pattern adds no context over a bare `raise` and loses the original traceback.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(core): terminate agent processes via..." | Re-trigger Greptile

Comment on lines 90 to 93
repo_path_param = request.query_params.get("repo_path")
if repo_path_param:
repo_path = Path(repo_path_param).resolve()
if not repo_path.is_relative_to(Path.home()):
return _err("Path outside allowed boundaries", status=403)
else:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 security repo_path boundary check removed with no replacement

The removed is_relative_to(Path.home()) guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. /etc, C:\Windows) via ?repo_path=. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the Path.home() check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/kagan/server/_plugin_routes.py
Line: 90-93

Comment:
**`repo_path` boundary check removed with no replacement**

The removed `is_relative_to(Path.home())` guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. `/etc`, `C:\Windows`) via `?repo_path=`. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the `Path.home()` check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +197 to +198
except PermissionError as exc:
raise PermissionError(str(exc)) from exc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 PermissionError during scan now surfaces as 403 (behavioural change)

Previously a PermissionError raised by os.scandir() was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to _error_response as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The except PermissionError as exc: raise PermissionError(str(exc)) from exc pattern adds no context over a bare raise and loses the original traceback.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/kagan/server/_system_routes.py
Line: 197-198

Comment:
**`PermissionError` during scan now surfaces as 403 (behavioural change)**

Previously a `PermissionError` raised by `os.scandir()` was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to `_error_response` as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The `except PermissionError as exc: raise PermissionError(str(exc)) from exc` pattern adds no context over a bare `raise` and loses the original traceback.

How can I resolve this? If you propose a fix, please make it concise.

Server commit a09e5ac added FsEntryResponse/FsBrowseResponse to
RESPONSE_MODELS but the generated TS wasn't refreshed; wire-drift
check caught it. Dedup the hand-written FsEntry/FsBrowseResponse in
types.ts by re-exporting from the generated module.
Ubuntu / contains /proc, /sys, /lost+found entries whose .resolve() or
.exists() can raise OSError subclasses that the narrower catch missed,
failing the whole listing on Linux CI. Widen the per-entry guard so a
single bad entry skips, not aborts.
@aorumbayev aorumbayev merged commit da963d0 into main Apr 23, 2026
18 checks passed
@aorumbayev aorumbayev deleted the fixes/windows-path-browse branch April 23, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants