This is a Gemini CLI extension that integrates BrowserMCP, providing your AI agent with the ability to view, navigate, and interact with web pages directly in your browser.
By giving the agent "eyes" and "hands", this extension enables the Gemini CLI to:
- Navigate to specific URLs
- Inspect and read the DOM
- Click buttons and links
- Type text into form fields
- Read browser console logs
This makes the extension useful for automating UI testing, extracting data from complex web apps, and having the agent perform repetitive web-based workflows on your behalf.
This extension is a lightweight wrapper that connects the Gemini CLI to the BrowserMCP server.
For this to work, you must have the BrowserMCP Chrome extension installed in your active browser session:
- Install the BrowserMCP Chrome Extension. (It takes about 30 seconds.)
- Navigate to the web page you want the agent to interact with.
- Click the BrowserMCP extension icon and click "Connect".
Note: The agent cannot spawn new browser sessions or operate in "headless" mode. It relies entirely on the active, connected tab.
To install this extension into your Gemini CLI, run:
gemini extensions install https://github.com/derailed-dash/browsermcp-ext
Once installed, start the Gemini CLI and run /mcp to verify that the browsermcp tools are loaded.
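If you want to script the install step, the following is a minimal sketch rather than part of the extension itself. It checks that the gemini command is on your PATH before running the install command shown above. The function names have_cmd and install_browsermcp_ext are hypothetical helpers introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the extension):
# returns 0 if the named command is found on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around the install command from this README.
# It refuses to run if the gemini CLI is not available.
install_browsermcp_ext() {
  if ! have_cmd gemini; then
    echo "gemini CLI not found on PATH; install it first" >&2
    return 1
  fi
  gemini extensions install https://github.com/derailed-dash/browsermcp-ext
}
```

After sourcing this snippet, calling install_browsermcp_ext runs the install only when the CLI is present. The /mcp check is still done interactively inside the Gemini CLI, so it is not scripted here.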
I am the author of this specific Gemini CLI extension wrapper. However, I am not the author of the BrowserMCP server or the underlying BrowserMCP project it relies upon. Full credit goes to the creators of BrowserMCP.