fix: Windows support across fs/browse, agent spawn, env, and signals#123
fix: Windows support across fs/browse, agent spawn, env, and signals#123aorumbayev merged 8 commits intomainfrom
Conversation
- Drop Path.home() boundary (loopback + auth-gated; core add_repo already unrestricted). - Return parent, separator, roots so clients don't parse paths. - Include junctions and symlinks; use FILE_ATTRIBUTE_HIDDEN on Windows. - Mirror fix in plugin GitHub repo detection. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
- Stop parsing paths on the client; use server-provided parent/separator/roots. - Home breadcrumb navigates to "~" so server expands correctly on all platforms. - Windows: drive-root picker when multiple roots; correct C:\ breadcrumb target. - Add is_link flag to FsEntry; show junction/symlink badge. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Windows CreateProcess doesn't honour PATHEXT and can't execute .cmd/.bat directly. Add resolve_spawn_command() and route all agent/IDE/ACP spawns through it so Node-based backends (claude, codex, gemini, copilot, ...) and IDE launchers (code.cmd, cursor.cmd) invoke correctly. No-op on POSIX. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Windows spawned processes need SYSTEMROOT, TEMP, APPDATA, etc. to load DLLs and find config. Select the allowlist per platform so POSIX behavior is unchanged while Windows gets the vars its binaries actually require. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
signal.SIGKILL doesn't exist on Windows and os.kill(SIGTERM) is a no-op for detached processes there. Store the Process handle alongside the timeout timer and call proc.terminate()/kill() so the timeout path works on both POSIX and Windows. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Greptile SummaryThis PR fixes five distinct Windows compatibility bugs:
Confidence Score: 5/5Safe to merge; all remaining findings are P2 style/forward-looking concerns that don't break current behaviour. No P0 or P1 defects found. The three comments are: a forward-looking shell injection risk (only affects user-controlled args not currently present), a security boundary removal that is intentional and low-risk on a local server, and a minor behavioural change in PermissionError handling that is actually an improvement. Test coverage is thorough (142 new tests + 4 Windows-gated). src/kagan/server/_plugin_routes.py (removed access-control check), src/kagan/core/_subprocess.py (cmd.exe /c quoting)
|
| Filename | Overview |
|---|---|
| src/kagan/core/_subprocess.py | New helper resolving .cmd/.bat/.ps1 shims via cmd.exe/powershell on Windows; logic is correct, but cmd.exe /c leaves args subject to shell metacharacter interpretation |
| src/kagan/core/_agent.py | Replaces os.kill(SIGKILL) with proc.terminate()/proc.kill() via _AgentTimeout dataclass; correctly fixes Windows incompatibility and also cancels stale timers on unregister |
| src/kagan/server/_system_routes.py | browse_filesystem now returns parent/separator/roots and uses FILE_ATTRIBUTE_HIDDEN; PermissionError during scandir now surfaces as 403 rather than returning an empty entries list |
| src/kagan/server/_plugin_routes.py | Removes Path.home() boundary for repo_path without any replacement access-control check |
| src/kagan/runtime_env.py | Adds Windows-specific essential env set (SYSTEMROOT, APPDATA, PATHEXT, etc.) selected at call time via _essential_env(); POSIX behaviour unchanged via backwards-compat alias |
| packages/web/src/components/shared/path-picker.tsx | Consumes server-provided parent/separator/roots instead of splitting on '/'; buildBreadcrumbs correctly handles Windows drive roots and POSIX paths; home button navigates to '~' |
| src/kagan/server/responses.py | Adds FsEntryResponse and FsBrowseResponse Pydantic models with all required fields; registered in RESPONSE_MODELS dict |
| packages/web/src/lib/api/types.ts | FsEntry gains is_link; FsBrowseResponse gains parent, separator, roots — matches the new server response shape exactly |
| tests/unit/test_agent_kill.py | Comprehensive tests for _kill_agent/_force_kill using a FakeProcess stub; covers terminate, already-dead, timer scheduling, unregister cancellation, and a real subprocess integration test |
| tests/core/unit/test_subprocess_resolve.py | Good coverage of POSIX pass-through, Windows .cmd/.bat/.ps1/.exe wrapping, absolute paths, mixed-case suffixes, and which() fallback |
Sequence Diagram
sequenceDiagram
participant Client as Web Client (PathPicker)
participant Server as /api/fs/browse
participant OS as OS (scandir)
Client->>Server: GET /api/fs/browse?path=~
Server->>OS: _list_dir("~") in thread
OS-->>Server: entries + parent + separator + roots
Server-->>Client: {path, parent, separator, roots, entries}
Note over Client: buildBreadcrumbs(path, separator, roots)
participant Spawner as Agent Spawn
participant Sub as resolve_spawn_command
Spawner->>Sub: resolve_spawn_command("claude", "--flag")
Sub->>Sub: shutil.which("claude") -> claude.cmd
Sub-->>Spawner: ["cmd.exe", "/c", "claude.cmd", "--flag"]
Spawner->>OS: asyncio.create_subprocess_exec(*resolved)
participant Timer as Timeout Timer
participant Proc as Agent Process
Spawner->>Timer: loop.call_later(timeout, _kill_agent, proc)
Timer->>Proc: proc.terminate()
Timer->>Timer: loop.call_later(grace, _force_kill, proc)
Timer->>Proc: proc.kill()
Comments Outside Diff (1)
-
src/kagan/core/_subprocess.py, line 958-960 (link)cmd.exe /cinterprets shell metacharacters in argscmd.exe /cparses the rest of its command line as a shell string, so any arg containing&,|,>,<,^, or%will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command.Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data.
Prompt To Fix With AI
This is a comment left during a code review. Path: src/kagan/core/_subprocess.py Line: 958-960 Comment: **`cmd.exe /c` interprets shell metacharacters in args** `cmd.exe /c` parses the rest of its command line as a shell string, so any arg containing `&`, `|`, `>`, `<`, `^`, or `%` will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command. Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/kagan/core/_subprocess.py
Line: 958-960
Comment:
**`cmd.exe /c` interprets shell metacharacters in args**
`cmd.exe /c` parses the rest of its command line as a shell string, so any arg containing `&`, `|`, `>`, `<`, `^`, or `%` will be interpreted as cmd.exe syntax rather than passed literally to the script. All current call sites use programmer-controlled strings (flags, subcommands), so there is no active injection risk today. But if any argument ever originates from user input — e.g. a task title, branch name, or path fragment passed as a CLI arg — those characters could corrupt or redirect the command.
Consider quoting the arguments when building the cmd.exe invocation, or document clearly at the call sites that args must not contain untrusted data.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/kagan/server/_plugin_routes.py
Line: 90-93
Comment:
**`repo_path` boundary check removed with no replacement**
The removed `is_relative_to(Path.home())` guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. `/etc`, `C:\Windows`) via `?repo_path=`. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the `Path.home()` check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/kagan/server/_system_routes.py
Line: 197-198
Comment:
**`PermissionError` during scan now surfaces as 403 (behavioural change)**
Previously a `PermissionError` raised by `os.scandir()` was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to `_error_response` as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The `except PermissionError as exc: raise PermissionError(str(exc)) from exc` pattern adds no context over a bare `raise` and loses the original traceback.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(core): terminate agent processes via..." | Re-trigger Greptile
| repo_path_param = request.query_params.get("repo_path") | ||
| if repo_path_param: | ||
| repo_path = Path(repo_path_param).resolve() | ||
| if not repo_path.is_relative_to(Path.home()): | ||
| return _err("Path outside allowed boundaries", status=403) | ||
| else: |
There was a problem hiding this comment.
repo_path boundary check removed with no replacement
The removed is_relative_to(Path.home()) guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. /etc, C:\Windows) via ?repo_path=. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the Path.home() check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/kagan/server/_plugin_routes.py
Line: 90-93
Comment:
**`repo_path` boundary check removed with no replacement**
The removed `is_relative_to(Path.home())` guard was the only access-control on this route. Any caller can now supply an arbitrary resolved path (e.g. `/etc`, `C:\Windows`) via `?repo_path=`. On a purely local server the risk is low, but if the server is ever reachable over the network or from an untrusted web origin, this becomes a path-traversal vector for the GitHub repo detection logic. Consider replacing the `Path.home()` check with a configurable allowlist of workspace roots, or at minimum validating that the resolved path corresponds to a registered project/workspace.
How can I resolve this? If you propose a fix, please make it concise.| except PermissionError as exc: | ||
| raise PermissionError(str(exc)) from exc |
There was a problem hiding this comment.
PermissionError during scan now surfaces as 403 (behavioural change)
Previously a PermissionError raised by os.scandir() was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to _error_response as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The except PermissionError as exc: raise PermissionError(str(exc)) from exc pattern adds no context over a bare raise and loses the original traceback.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/kagan/server/_system_routes.py
Line: 197-198
Comment:
**`PermissionError` during scan now surfaces as 403 (behavioural change)**
Previously a `PermissionError` raised by `os.scandir()` was silently swallowed and the endpoint returned an empty entry list. With this change it now propagates to `_error_response` as a 403 — semantically more correct, but a contract change for callers that relied on "permission denied → empty list". The `except PermissionError as exc: raise PermissionError(str(exc)) from exc` pattern adds no context over a bare `raise` and loses the original traceback.
How can I resolve this? If you propose a fix, please make it concise.Server commit a09e5ac added FsEntryResponse/FsBrowseResponse to RESPONSE_MODELS but the generated TS wasn't refreshed; wire-drift check caught it. Dedup the hand-written FsEntry/FsBrowseResponse in types.ts by re-exporting from the generated module.
Ubuntu / contains /proc, /sys, /lost+found entries whose .resolve() or .exists() can raise OSError subclasses that the narrower catch missed, failing the whole listing on Linux CI. Widen the per-entry guard so a single bad entry skips, not aborts.
Summary
Five Windows-specific bugs fixed across the server, web client, and core spawn/runtime code. All pre-existing tests still pass (1295 non-snapshot); adds 4 Windows-gated tests (skipped on POSIX) and 142 new non-gated tests.
Commits
a09e5ac/api/fs/browse. DropPath.home()boundary; returnparent/separator/roots/is_linkso clients don't parse paths. UseFILE_ATTRIBUTE_HIDDENon Windows; include junctions/symlinks. Mirror fix in plugin GitHub repo detection.fc262e5PathPicker. Consume server-provided fields instead of splitting on/. Home breadcrumb →~(server-expanded). Drive-root picker when multiple roots.Link2badge for junctions.64e1e4f.cmd/.bat/.ps1shims viacmd.exe/powershellat spawn time. WindowsCreateProcessdoesn't honourPATHEXTand can't exec batch files directly — this is whyclaude,codex,copilot,code,cursoretc. fail to invoke despiteshutil.whichfinding them. New helperkagan.core._subprocess.resolve_spawn_command; all agent/IDE/ACP spawn sites routed through it.0004779SYSTEMROOT,TEMP,TMP,APPDATA,LOCALAPPDATA,USERPROFILE,PATHEXT,COMSPEC,WINDIR,HOMEDRIVE/HOMEPATH,PROGRAMFILES(X86),PROGRAMDATA,USERNAME,COMPUTERNAME,USERDOMAINto the subprocess env on Windows. POSIX unchanged.0deeb68Processhandle, notos.kill(SIGKILL).signal.SIGKILLdoesn't exist on Windows (AttributeErrorat timeout);os.kill(SIGTERM)is a no-op for detached processes. Now storesProcessalongside the timer and callsproc.terminate()/proc.kill().Why stacked
The three backend fixes interact on Windows:
claude.cmd.claudestart (needsAPPDATA/SYSTEMROOT).Shipping any one without the others leaves a different symptom on Windows.
Test plan
uv run poe typecheck— 0 errorsuv run pytest tests/ -m "not snapshot"— 1295 passed, 4 skipped (Windows-gated)packages/web:pnpm exec vitest run path-picker— 14/14;pnpm exec tsc --noEmitclean;pnpm run buildcleantest-windowsjob) — needs a real Windows runner to exercise thewindows_ci-marked tests🤖 Generated with Claude Code