Browser Agent MCP Server
A Model Context Protocol (MCP) server that exposes browser automation capabilities to Claude Desktop and other MCP clients.
Features
- 🌐 Browser Automation: Control web browsers programmatically
- 🎯 Multiple Engines: Support for Playwright and Selenium
- 📸 Screenshot Capture: Take screenshots of web pages
- 🔍 Content Extraction: Extract text and HTML content
- 🖱️ Interactive Control: Click, type, scroll, and navigate
- ⚡ Fast Setup: One-click startup with uv
- 🔧 Flexible Configuration: Environment variables and command-line options
Quick Start
1. Install Dependencies
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and setup the project
git clone <your-repo-url>
cd browser-agent
# Install dependencies and browsers
uv sync
uv run playwright install
2. One-Click Startup
# Start the MCP server
uv run python start.py
Claude Desktop Configuration
1. Install uv
Make sure uv is installed on your system:
curl -LsSf https://astral.sh/uv/install.sh | sh
2. Add Browser Agent Server
Option A: From GitHub Repository (Recommended)
Use this configuration to run the server directly from the GitHub repository:
{
"mcpServers": {
"browser-agent": {
"command": "uv",
"args": [
"run",
"--from",
"git+https://github.com/galilio/browser-mcp.git",
"python",
"start.py"
],
"env": {
"BROWSER_ENGINE": "playwright",
"BROWSER_HEADLESS": "false",
"BROWSER_TIMEOUT": "30000"
}
}
}
}
Option B: Generate GitHub Configuration
Run the GitHub configuration generator:
uv run python generate_github_config.py
Option C: Generate Local Configuration
Run the configuration generator to get the correct local paths:
uv run python generate_config.py
Option D: Manual Local Configuration
Add the following configuration to your MCP servers (replace the path with your actual project directory):
{
"mcpServers": {
"browser-agent": {
"command": "uv",
"args": ["run", "python", "/path/to/your/browser-agent/start.py"],
"env": {
"BROWSER_ENGINE": "playwright",
"BROWSER_HEADLESS": "false",
"BROWSER_TIMEOUT": "30000",
"BROWSER_SCREENSHOT_DIR": "/path/to/your/browser-agent/screenshots"
}
}
}
}
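A generator script could produce the fragment above from the location of a local checkout. The following is a minimal sketch of that idea, not the actual contents of generate_config.py; `build_local_config` is a hypothetical name, and the environment values are simply copied from the manual example.

```python
import json
from pathlib import Path


def build_local_config(project_dir: str) -> dict:
    """Build an mcpServers entry for a local checkout (hypothetical helper)."""
    root = Path(project_dir)
    return {
        "mcpServers": {
            "browser-agent": {
                "command": "uv",
                # Absolute path to start.py inside the project directory
                "args": ["run", "python", str(root / "start.py")],
                "env": {
                    "BROWSER_ENGINE": "playwright",
                    "BROWSER_HEADLESS": "false",
                    "BROWSER_TIMEOUT": "30000",
                    "BROWSER_SCREENSHOT_DIR": str(root / "screenshots"),
                },
            }
        }
    }


if __name__ == "__main__":
    # Print a fragment for the current directory, ready to paste into the client config
    print(json.dumps(build_local_config(str(Path.cwd().resolve())), indent=2))
```

Using the resolved absolute path avoids the most common failure with manual configuration, a relative path that the MCP client cannot find.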
3. Alternative Configurations
GitHub Repository - Headless Mode:
{
"mcpServers": {
"browser-agent": {
"command": "uv",
"args": [
"run",
"--from",
"git+https://github.com/galilio/browser-mcp.git",
"python",
"start.py",
"--headless"
],
"env": {
"BROWSER_ENGINE": "playwright"
}
}
}
}
GitHub Repository - Selenium Engine:
{
"mcpServers": {
"browser-agent": {
"command": "uv",
"args": [
"run",
"--from",
"git+https://github.com/galilio/browser-mcp.git",
"python",
"start.py",
"--engine",
"selenium"
],
"env": {
"BROWSER_HEADLESS": "false"
}
}
}
}
4. Restart Claude Desktop
After adding the configuration, restart Claude Desktop for the changes to take effect.
Usage Examples
Once configured, you can use the browser agent in Claude Desktop:
- "Navigate to https://example.com"
- "Take a screenshot of the page"
- "Click the login button"
- "Type 'hello world' in the search box"
- "Extract all links from the page"
- "Scroll down to the bottom"
- "Fill out a contact form with my information"
Running the MCP Server
Direct Execution
# Basic startup
uv run python start.py
# Headless mode
uv run python start.py --headless
# Using Selenium engine
uv run python start.py --engine selenium
# Custom timeout
uv run python start.py --timeout 60000
# Custom screenshot directory
uv run python start.py --screenshot-dir ./my_screenshots
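For readers curious how these flags might be handled, here is a minimal argparse sketch that accepts the same options. It is an assumption about how start.py could parse its arguments, not its actual code; the defaults are taken from the environment variable examples in this README, and treating the timeout as milliseconds is also an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (hypothetical, not the real start.py)."""
    parser = argparse.ArgumentParser(description="Start the browser agent MCP server")
    parser.add_argument("--headless", action="store_true",
                        help="Run the browser without a visible window")
    parser.add_argument("--engine", choices=["playwright", "selenium"],
                        default="playwright", help="Browser automation engine")
    parser.add_argument("--timeout", type=int, default=30000,
                        help="Timeout value (assumed to be milliseconds)")
    parser.add_argument("--screenshot-dir", default="./screenshots",
                        help="Directory where screenshots are written")
    return parser


# Equivalent of: uv run python start.py --engine selenium --timeout 60000
args = build_parser().parse_args(["--engine", "selenium", "--timeout", "60000"])
print(args.engine, args.timeout, args.headless, args.screenshot_dir)
```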
Environment Variables
Create a .env file based on env.example:
# Browser Configuration
BROWSER_ENGINE=playwright
BROWSER_HEADLESS=false
BROWSER_TIMEOUT=30000
BROWSER_SCREENSHOT_DIR=./screenshots
BROWSER_USER_AGENT=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36
BROWSER_ARGS=--no-sandbox,--disable-dev-shm-usage
# MCP Configuration
MCP_SERVER_NAME=browser-agent
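The variables above could be read into a typed configuration object along these lines. This is a sketch under stated assumptions, not the contents of config.py: `BrowserConfig` and `load_config` are hypothetical names, the defaults mirror the example values, and the comma-separated parsing of BROWSER_ARGS is inferred from the format shown above.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass
class BrowserConfig:
    """Hypothetical container for the BROWSER_* settings."""
    engine: str = "playwright"
    headless: bool = False
    timeout: int = 30000
    screenshot_dir: str = "./screenshots"
    user_agent: Optional[str] = None
    args: List[str] = field(default_factory=list)


def load_config(env: Optional[Mapping[str, str]] = None) -> BrowserConfig:
    """Read settings from a mapping (os.environ by default), falling back to defaults."""
    env = os.environ if env is None else env
    raw_args = env.get("BROWSER_ARGS", "")
    return BrowserConfig(
        engine=env.get("BROWSER_ENGINE", "playwright"),
        # Accept a few common spellings of "true"
        headless=env.get("BROWSER_HEADLESS", "false").strip().lower() in ("1", "true", "yes"),
        timeout=int(env.get("BROWSER_TIMEOUT", "30000")),
        screenshot_dir=env.get("BROWSER_SCREENSHOT_DIR", "./screenshots"),
        user_agent=env.get("BROWSER_USER_AGENT"),
        # Split the comma-separated list and drop empty entries
        args=[a.strip() for a in raw_args.split(",") if a.strip()],
    )


config = load_config({"BROWSER_HEADLESS": "true",
                      "BROWSER_ARGS": "--no-sandbox,--disable-dev-shm-usage"})
print(config)
```

Passing a plain dictionary instead of os.environ keeps the function easy to test without touching the real environment.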
Available Tools
The MCP server provides the following tools:
- navigate_to: Navigate to a URL
- click_element: Click on an element
- type_text: Type text into an element
- get_page_content: Get page content (text or HTML)
- take_screenshot: Take a screenshot
- wait_for_element: Wait for an element to appear
- execute_javascript: Execute JavaScript code
- scroll_page: Scroll the page
- get_elements: Get elements by selector
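MCP clients invoke tools through JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such a request for one of the tools listed above, to show roughly what a client sends. The helper name `tool_call_request` is hypothetical, and the `url` argument name is an assumption, since this README does not document each tool's parameters.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# "url" is an assumed parameter name for navigate_to
print(tool_call_request("navigate_to", {"url": "https://example.com"}))
```

In practice Claude Desktop or an MCP client library builds these messages for you; the example only illustrates the shape of the call.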
Troubleshooting
If the server doesn't start:
- Make sure Python 3.10+ is installed
- Verify that uv sync and uv run playwright install completed successfully
- Check that the start.py file is in the correct directory
- Try running uv run python start.py --check-only to verify the environment
- Ensure uv is installed and available in your PATH
- Verify the absolute path in your MCP configuration is correct
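Several of the checks above can be automated. The following sketch is a hypothetical standalone helper, not what the --check-only flag actually does; it tests the Python version, whether uv is on PATH, and whether start.py exists in a given directory.

```python
import shutil
import sys
from pathlib import Path


def check_environment(project_dir: str = ".") -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    if shutil.which("uv") is None:
        problems.append("uv was not found on PATH")
    start_script = Path(project_dir) / "start.py"
    if not start_script.is_file():
        problems.append("start.py was not found at " + str(start_script.resolve()))
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Environment looks OK")
```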
For other MCP clients
GitHub Repository Configuration:
{
"mcpServers": {
"browser-agent": {
"command": "uv",
"args": [
"run",
"--from",
"git+https://github.com/galilio/browser-mcp.git",
"python",
"start.py"
],
"env": {
"BROWSER_ENGINE": "playwright",
"BROWSER_HEADLESS": "false"
}
}
}
}
Development
Project Structure
browser-agent/
├── browser_agent_mcp/ # Main package
│ ├── __init__.py
│ ├── config.py # Configuration management
│ ├── browser_controller.py # Browser automation logic
│ ├── server.py # MCP server implementation
│ └── main.py # Entry point
├── start.py # One-click startup script
├── generate_config.py # Configuration generator
├── pyproject.toml # Project configuration
├── env.example # Environment variables template
└── README.md # This file
Manual Installation
If you prefer manual installation:
# Install Python dependencies
pip install mcp playwright selenium python-dotenv pydantic beautifulsoup4 requests
# Install Playwright browsers
playwright install
# Install Selenium WebDriver (if using Selenium)
# Download ChromeDriver from: https://chromedriver.chromium.org/
License
This project is licensed under the MIT License.






