Make mcp serve non-blocking under concurrent clients by avelino · Pull Request #52 · avelino/mcp

avelino · 2026-04-09T11:12:19Z

fixed: #51

The proxy used to serialize everything through a single ProxyServer mutex held across backend I/O, so one slow backend or dead client could freeze every other session. Under opencode it also leaked child processes (13 chroma-mcp, 4 roam-tui, etc) because shutdown timeouts would silently give up and Command wasn't reaped on drop.

Rewrote the hot path around Arc and a three-phase dispatch that never holds the proxy lock across await: resolve routing under a brief lock, connect (if needed) with the lock released, then call_tool fully outside the lock. Transport trait is now &self + Sync so a single client instance is shared across concurrent tasks; StdioTransport runs a writer task + a reader task with a pending-id map, so multiple in-flight requests on the same backend multiplex through one pipe. Discovery moved out of the proxy lock entirely with its own discovery_lock to dedupe concurrent batches without blocking request handlers for already-discovered backends. Reaper runs in parallel via JoinSet, respects a warm-up grace so fresh backends aren't killed before first use, and force-kills via kill_on_drop when shutdown stalls. Every tokio Command has kill_on_drop(true) so orphans are impossible by construction. SSE legacy path bounds tx.send with a 5s timeout and evicts wedged sessions instead of pinning handlers forever; ping task uses try_send. HTTP listener binds with TCP keepalive 30s/10s so dead clients are detected in ~60s instead of 2h, and a per-request proxy timeout (MCP_PROXY_REQUEST_TIMEOUT, default 120s) is the final bound.

fixed: #51 The proxy used to serialize everything through a single ProxyServer mutex held across backend I/O, so one slow backend or dead client could freeze every other session. Under opencode it also leaked child processes (13 chroma-mcp, 4 roam-tui, etc) because shutdown timeouts would silently give up and Command wasn't reaped on drop. Rewrote the hot path around Arc<McpClient> and a three-phase dispatch that never holds the proxy lock across await: resolve routing under a brief lock, connect (if needed) with the lock released, then call_tool fully outside the lock. Transport trait is now &self + Sync so a single client instance is shared across concurrent tasks; StdioTransport runs a writer task + a reader task with a pending-id map, so multiple in-flight requests on the same backend multiplex through one pipe. Discovery moved out of the proxy lock entirely with its own discovery_lock to dedupe concurrent batches without blocking request handlers for already-discovered backends. Reaper runs in parallel via JoinSet, respects a warm-up grace so fresh backends aren't killed before first use, and force-kills via kill_on_drop when shutdown stalls. Every tokio Command has kill_on_drop(true) so orphans are impossible by construction. SSE legacy path bounds tx.send with a 5s timeout and evicts wedged sessions instead of pinning handlers forever; ping task uses try_send. HTTP listener binds with TCP keepalive 30s/10s so dead clients are detected in ~60s instead of 2h, and a per-request proxy timeout (MCP_PROXY_REQUEST_TIMEOUT, default 120s) is the final bound. Signed-off-by: Avelino <[email protected]>

Copilot

Pull request overview

This PR refactors the MCP proxy/server execution path to avoid holding global locks across backend I/O, enabling concurrent clients to make progress independently while also tightening bounds on hung requests and improving backend process cleanup.

Changes:

Reworks the Transport API to be &self + Send + Sync and updates McpClient to be shareable (Arc<dyn Transport>, AtomicU64 ids).
Implements a multiplexed StdioTransport (writer/reader tasks + pending-id map) to allow multiple in-flight requests on one backend process.
Refactors mcp serve dispatch/discovery/reaping for non-blocking concurrency (separate discovery lock, parallel shutdown, TCP keepalive, per-request proxy timeout) and updates docs/config references accordingly.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
src/transport/stdio.rs	Adds multiplexed stdio transport with concurrent in-flight request support.
src/transport/mod.rs	Updates `Transport` trait to `Send + Sync` with `&self` methods.
src/transport/http.rs	Moves mutable HTTP transport state behind mutexes for shared use.
src/transport/cli.rs	Makes CLI transport discovery/request handling concurrency-safe and adds `kill_on_drop` for tool invocations.
src/serve.rs	Major non-blocking proxy refactor: 3-phase dispatch, discovery lock, parallel reaper, keepalive listener, per-request timeout, updated health.
src/client.rs	Makes `McpClient` shareable and thread-safe with `Arc` transport + atomic ids.
src/main.rs	Adjusts CLI connection code for the new shared client API.
README.md	Updates proxy-mode feature claims and guarantees.
docs/reference/environment-variables.md	Documents `MCP_PROXY_REQUEST_TIMEOUT`.
docs/reference/architecture.md	Updates architecture/concurrency model documentation for new transport + proxy model.
docs/mcp-servers-are-draining-your-hardware.md	Adds April 2026 update describing multi-client orchestration.
docs/guides/proxy-mode.md	Documents the new concurrency model, health metrics, and reaping behavior.
Cargo.toml	Adds `socket2` dependency for TCP keepalive listener binding.
Cargo.lock	Records `socket2` addition (now multiple versions present).

Comments suppressed due to low confidence (1)

src/transport/cli.rs:118

cli_discovery::discover_tools(...) ultimately runs Command::output() under a timeout, but src/cli_discovery.rs currently spawns those Commands without kill_on_drop(true). If the timeout fires (or the task is cancelled), those discovery children can still leak/orphan, which undermines the PR’s “no orphans/kill_on_drop everywhere” guarantee. Consider adding kill_on_drop(true) in the discovery helper(s) as well.

        let results = cli_discovery::discover_tools(
            &self.command,
            &self.base_args,
            &self.env,
            &self.help_flag,

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Signed-off-by: Avelino <[email protected]>

avelino requested a review from Copilot April 9, 2026 11:14

Copilot started reviewing on behalf of avelino April 9, 2026 11:15 View session

Copilot AI reviewed Apr 9, 2026

View reviewed changes

Address PR review feedback on non-blocking proxy

8380ab2

Signed-off-by: Avelino <[email protected]>

avelino merged commit 4326802 into main Apr 9, 2026
4 checks passed

avelino deleted the avelino/issue-51 branch April 9, 2026 12:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make mcp serve non-blocking under concurrent clients#52

Make mcp serve non-blocking under concurrent clients#52
avelino merged 2 commits intomainfrom
avelino/issue-51

avelino commented Apr 9, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

avelino commented Apr 9, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants