Per-server lazy discovery for tools/call, resources/read, prompts/get#81
Merged
On cold start, `tools/call` was blocking on discovery of ALL backends before responding: a fast CLI call like `gh issue view` took 5-30s waiting for kubectl, grafana, slack and every other server to finish.

Infer the target backend from the namespaced tool name (`server__tool`) and discover only that server. Listing operations (`tools/list`, `resources/list`, `prompts/list`) still do full batch discovery since the client wants the complete catalog. Falls back to all-backend discovery when the backend can't be inferred from the name.

Fixed: #77

Signed-off-by: Avelino <[email protected]>
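The name-based inference described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `infer_backend` and the rule that both halves around `__` must be non-empty are assumptions made for the example.

```rust
/// Hypothetical helper (name and exact rules are assumptions, not taken
/// from the PR): split a namespaced name of the form `server__tool` at the
/// first `__` and return the server part.
///
/// Returns `None` when there is no `__` separator or when either side is
/// empty, which is the case where a caller would fall back to discovering
/// every backend.
fn infer_backend(name: &str) -> Option<&str> {
    match name.split_once("__") {
        Some((server, tool)) if !server.is_empty() && !tool.is_empty() => Some(server),
        _ => None,
    }
}
```

Under these assumptions, `infer_backend("gh__issue_view")` yields `Some("gh")`, while a name without a separator such as `"issue_view"` yields `None` and would trigger the all-backend fallback.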
Pull request overview
This PR implements per-server lazy discovery for single-item requests so cold-start tools/call / resources/read / prompts/get no longer blocks on discovering every configured backend.
Changes:
- Add backend inference (`server__tool` → `server`) and a per-backend "needs discovery" check in `ProxyServer`.
- Update request dispatch to discover only the inferred backend for `tools/call`, `resources/read`, and `prompts/get` (with fallback to full discovery when inference fails).
- Add `discover_single_backend` and update docs to describe the new discovery behavior.
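The dispatch decision summarized in the list above might look something like the sketch below. It is illustrative only: the `Scope` enum, the `discovery_scope` function, and the use of two `HashSet`s to track configured and already-discovered backends are all hypothetical and are not taken from `src/serve/dispatch.rs`.

```rust
use std::collections::HashSet;

/// Hypothetical result type: how much discovery a request needs.
#[derive(Debug, PartialEq)]
enum Scope {
    /// The inferred backend has already been discovered.
    AlreadyDiscovered,
    /// Discover only this backend.
    Single(String),
    /// Discover every configured backend.
    All,
}

/// Hypothetical decision function (an assumption for illustration).
/// `name` is the namespaced item name from the request, if any.
fn discovery_scope(
    method: &str,
    name: Option<&str>,
    configured: &HashSet<String>,
    discovered: &HashSet<String>,
) -> Scope {
    match method {
        // Single-item requests: try to narrow discovery to one backend.
        "tools/call" | "resources/read" | "prompts/get" => {
            let server = name
                .and_then(|n| n.split_once("__"))
                .map(|(server, _)| server)
                // Only trust the prefix if it names a configured backend.
                .filter(|s| configured.contains(*s));
            match server {
                Some(s) if discovered.contains(s) => Scope::AlreadyDiscovered,
                Some(s) => Scope::Single(s.to_string()),
                // Inference failed: fall back to full discovery.
                None => Scope::All,
            }
        }
        // Listing operations (and anything else) want the full catalog.
        _ => Scope::All,
    }
}
```

In this sketch a cold `tools/call` for `gh__issue_view` resolves to `Scope::Single("gh")`, so only that backend is contacted, while `tools/list` always resolves to `Scope::All`.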
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/serve/proxy.rs` | Adds backend-name inference helper, per-backend undiscovered check, and unit tests for both. |
| `src/serve/dispatch.rs` | Switches single-item request paths to per-backend discovery (or full discovery fallback). |
| `src/serve/discovery.rs` | Introduces `discover_single_backend` and cache persistence for single-backend discovery. |
| `docs/reference/architecture.md` | Documents per-server lazy discovery behavior and its concurrency implications. |
| `docs/guides/proxy-mode.md` | Updates proxy-mode guide to reflect the per-server discovery behavior. |