Skip to content

Make mcp serve non-blocking under concurrent clients#52

Merged
avelino merged 2 commits intomainfrom
avelino/issue-51
Apr 9, 2026
Merged

Make mcp serve non-blocking under concurrent clients#52
avelino merged 2 commits intomainfrom
avelino/issue-51

Conversation

@avelino
Copy link
Copy Markdown
Owner

@avelino avelino commented Apr 9, 2026

fixed: #51

The proxy used to serialize everything through a single ProxyServer mutex held across backend I/O, so one slow backend or dead client could freeze every other session. Under opencode it also leaked child processes (13 chroma-mcp, 4 roam-tui, etc) because shutdown timeouts would silently give up and Command wasn't reaped on drop.

Rewrote the hot path around Arc and a three-phase dispatch that never holds the proxy lock across await: resolve routing under a brief lock, connect (if needed) with the lock released, then call_tool fully outside the lock. Transport trait is now &self + Sync so a single client instance is shared across concurrent tasks; StdioTransport runs a writer task + a reader task with a pending-id map, so multiple in-flight requests on the same backend multiplex through one pipe. Discovery moved out of the proxy lock entirely with its own discovery_lock to dedupe concurrent batches without blocking request handlers for already-discovered backends. Reaper runs in parallel via JoinSet, respects a warm-up grace so fresh backends aren't killed before first use, and force-kills via kill_on_drop when shutdown stalls. Every tokio Command has kill_on_drop(true) so orphans are impossible by construction. SSE legacy path bounds tx.send with a 5s timeout and evicts wedged sessions instead of pinning handlers forever; ping task uses try_send. HTTP listener binds with TCP keepalive 30s/10s so dead clients are detected in ~60s instead of 2h, and a per-request proxy timeout (MCP_PROXY_REQUEST_TIMEOUT, default 120s) is the final bound.

fixed: #51

The proxy used to serialize everything through a single
ProxyServer mutex held across backend I/O, so one slow backend or dead
client could freeze every other session. Under opencode it also leaked
child processes (13 chroma-mcp, 4 roam-tui, etc) because shutdown
timeouts would silently give up and Command wasn't reaped on drop.

Rewrote the hot path around Arc<McpClient> and a three-phase dispatch
that never holds the proxy lock across await: resolve routing under a
brief lock, connect (if needed) with the lock released, then call_tool
fully outside the lock. Transport trait is now &self + Sync so a single
client instance is shared across concurrent tasks; StdioTransport runs
a writer task + a reader task with a pending-id map, so multiple
in-flight requests on the same backend multiplex through one pipe.
Discovery moved out of the proxy lock entirely with its own
discovery_lock to dedupe concurrent batches without blocking request
handlers for already-discovered backends. Reaper runs in parallel via
JoinSet, respects a warm-up grace so fresh backends aren't killed
before first use, and force-kills via kill_on_drop when shutdown stalls.
Every tokio Command has kill_on_drop(true) so orphans are impossible
by construction. SSE legacy path bounds tx.send with a 5s timeout and
evicts wedged sessions instead of pinning handlers forever; ping task
uses try_send. HTTP listener binds with TCP keepalive 30s/10s so dead
clients are detected in ~60s instead of 2h, and a per-request proxy
timeout (MCP_PROXY_REQUEST_TIMEOUT, default 120s) is the final bound.

Signed-off-by: Avelino <[email protected]>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the MCP proxy/server execution path to avoid holding global locks across backend I/O, enabling concurrent clients to make progress independently while also tightening bounds on hung requests and improving backend process cleanup.

Changes:

  • Reworks the Transport API to be &self + Send + Sync and updates McpClient to be shareable (Arc<dyn Transport>, AtomicU64 ids).
  • Implements a multiplexed StdioTransport (writer/reader tasks + pending-id map) to allow multiple in-flight requests on one backend process.
  • Refactors mcp serve dispatch/discovery/reaping for non-blocking concurrency (separate discovery lock, parallel shutdown, TCP keepalive, per-request proxy timeout) and updates docs/config references accordingly.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/transport/stdio.rs Adds multiplexed stdio transport with concurrent in-flight request support.
src/transport/mod.rs Updates Transport trait to Send + Sync with &self methods.
src/transport/http.rs Moves mutable HTTP transport state behind mutexes for shared use.
src/transport/cli.rs Makes CLI transport discovery/request handling concurrency-safe and adds kill_on_drop for tool invocations.
src/serve.rs Major non-blocking proxy refactor: 3-phase dispatch, discovery lock, parallel reaper, keepalive listener, per-request timeout, updated health.
src/client.rs Makes McpClient shareable and thread-safe with Arc transport + atomic ids.
src/main.rs Adjusts CLI connection code for the new shared client API.
README.md Updates proxy-mode feature claims and guarantees.
docs/reference/environment-variables.md Documents MCP_PROXY_REQUEST_TIMEOUT.
docs/reference/architecture.md Updates architecture/concurrency model documentation for new transport + proxy model.
docs/mcp-servers-are-draining-your-hardware.md Adds April 2026 update describing multi-client orchestration.
docs/guides/proxy-mode.md Documents the new concurrency model, health metrics, and reaping behavior.
Cargo.toml Adds socket2 dependency for TCP keepalive listener binding.
Cargo.lock Records socket2 addition (now multiple versions present).
Comments suppressed due to low confidence (1)

src/transport/cli.rs:118

  • cli_discovery::discover_tools(...) ultimately runs Command::output() under a timeout, but src/cli_discovery.rs currently spawns those Commands without kill_on_drop(true). If the timeout fires (or the task is cancelled), those discovery children can still leak/orphan, which undermines the PR’s “no orphans/kill_on_drop everywhere” guarantee. Consider adding kill_on_drop(true) in the discovery helper(s) as well.
        let results = cli_discovery::discover_tools(
            &self.command,
            &self.base_args,
            &self.env,
            &self.help_flag,

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/transport/stdio.rs
Comment thread src/transport/stdio.rs Outdated
Comment thread src/transport/stdio.rs Outdated
Comment thread src/serve.rs Outdated
Comment thread src/serve.rs Outdated
Comment thread src/serve.rs
Comment thread README.md
Comment thread Cargo.toml Outdated
Comment thread src/serve.rs
@avelino avelino merged commit 4326802 into main Apr 9, 2026
4 checks passed
@avelino avelino deleted the avelino/issue-51 branch April 9, 2026 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP process leak: proxy does not reap child backends after clients disconnect

2 participants