MCP Camoufox Scraper Server

stanislav-milchev/mcp-test
0 starsCommunity

Install to Claude Code

This server doesn't publish a one-line install command. Follow the setup in the source repository.

Summary

Enables web scraping with JavaScript disabled and network request monitoring, using Camoufox browser for privacy-focused data extraction.

README.md

MCP Camoufox Scraper Server

A proof-of-concept MCP (Model Context Protocol) server that uses Camoufox for web scraping with JavaScript disabled and network request monitoring.

๐Ÿš€ Quick Start

Prerequisites

  • Python 3.10 or higher
  • Poetry (Python dependency manager)
  • macOS, Linux, or Windows

1. Clone/Download the Project

git clone <your-repo> mcp-camoufox-scraper
cd mcp-camoufox-scraper

2. Install Poetry (if not already installed)

curl -sSL https://install.python-poetry.org | python3 -

3. Install Dependencies

poetry install

4. Verify Setup

poetry run python setup_verify.py

You should see all checks pass: `` ๐ŸŽ‰ Setup verification successful! ``

5. Run Full Test

poetry run python test_mcp_server.py

You should see: `` ๐ŸŽ‰ MCP Server POC is ready! The server can now be used with MCP clients to scrape websites with JS disabled. ``

๐Ÿ”Œ Connecting to MCP Clients

Claude Desktop Integration

  1. Find your Claude Desktop config file:
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json
  1. Add the server configuration:
{
  "mcpServers": {
    "camoufox-scraper": {
      "command": "poetry",
      "args": ["run", "python", "/full/path/to/your/mcp-camoufox-scraper/run_server.py"],
      "cwd": "/full/path/to/your/mcp-camoufox-scraper"
    }
  }
}
  1. Replace the path with your actual project directory:
# Get your full path
pwd
# Copy the output and use it in the config above
  1. Restart Claude Desktop completely (quit and reopen)
  1. Verify connection by asking Claude:

"What MCP servers do you have access to?"

You should see "camoufox-scraper" listed with the available tools.

Other MCP Clients

For other MCP clients, use this server configuration:

  • Command: poetry
  • Arguments: ["run", "python", "/path/to/run_server.py"]
  • Working Directory: /path/to/mcp-camoufox-scraper
  • Communication: stdio

๐Ÿ› ๏ธ Features

  • Dual JavaScript mode: JavaScript enabled for network monitoring, disabled for clean HTML extraction
  • Network request monitoring: Capture all XHR/API calls and HTTP requests made during page load
  • Clean HTML extraction: Get HTML content with JavaScript disabled to avoid dynamic modifications
  • MCP protocol integration: Works with any MCP client (Claude Desktop, etc.)

๐Ÿ“‹ Available Tools

1. navigate_to_url

Navigate to a URL with JavaScript enabled to capture network requests and dynamic content.

Parameters:

  • url (required): The URL to navigate to
  • wait_time (optional): Time to wait after page load in seconds (default: 3)

Claude Example: > "Please navigate to https://example.com and wait 5 seconds"

2. get_page_html

Extract clean HTML content by re-loading the page with JavaScript disabled.

Parameters: None

Claude Example: > "Get the HTML content from the current page"

3. get_network_requests

Get all captured network requests from the last page navigation.

Parameters:

  • filter_type (optional): Filter by request type ("xhr", "fetch", "all") - default: "all"

Claude Example: > "Show me all the network requests that were captured"

4. close_browser

Close the browser and cleanup resources.

Parameters: None

Claude Example: > "Close the browser to free up resources"

๐Ÿ’ก Usage Examples

Basic Web Scraping

You: Navigate to https://news.ycombinator.com
Claude: [Uses navigate_to_url tool]
You: Get the HTML content
Claude: [Uses get_page_html tool and analyzes the content]
You: What network requests were made?
Claude: [Uses get_network_requests tool and shows API calls]

API Discovery

You: Go to https://httpbin.org/headers and show me what requests it makes
Claude: [Navigates and shows network monitoring results]

Content Analysis

You: Navigate to https://example.com and get both the network requests and clean HTML
Claude: [Navigates with JS enabled to capture requests, then extracts HTML with JS disabled]
You: What's the difference between the two modes?
Claude: [Explains that navigation captures dynamic requests while HTML extraction gives clean content]

๐Ÿงช Testing & Verification

Run Full Test Suite

poetry run python test_mcp_server.py

Expected output: `` === Testing MCP Server Tools === โœ“ Navigation successful: Example Domain โœ“ HTML extraction successful โœ“ Network requests retrieval successful โœ“ Complex site navigation successful โœ“ Browser closed successfully ๐ŸŽ‰ MCP Server POC is ready! ``

Test Individual Components

# Test just the Camoufox API
poetry run python test_camoufox_api.py

# Start server manually (for debugging)
poetry run python run_server.py

๐Ÿ”ง Troubleshooting

Quick Diagnosis

# Run the setup verification script first
poetry run python setup_verify.py

This will check your Python version, dependencies, project structure, and generate the correct Claude Desktop configuration.

Server Won't Start

# Check if dependencies are installed
poetry show mcp camoufox

# Run the test to identify issues
poetry run python test_mcp_server.py

Claude Desktop Connection Issues

  1. Check config file location - Make sure you're editing the right file
  2. Use absolute paths - Relative paths won't work
  3. Restart Claude Desktop completely after config changes
  4. Check Claude's developer tools for error messages
  5. Verify Python path - Make sure python command works in terminal

Browser Issues

  • Camoufox download: First run may take time downloading browser binaries
  • Permission errors: Make sure the script has execute permissions (chmod +x run_server.py)
  • Port conflicts: Close other browser automation tools if running

Common Error Messages

  • "No such file or directory" โ†’ Check the path in your config
  • "Permission denied" โ†’ Run chmod +x run_server.py
  • "Module not found" โ†’ Run poetry install

๐Ÿ“ Project Structure

mcp-camoufox-scraper/
โ”œโ”€โ”€ mcp_camoufox_scraper/
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ””โ”€โ”€ server.py              # Main MCP server implementation
โ”œโ”€โ”€ run_server.py              # CLI runner script  
โ”œโ”€โ”€ setup_verify.py            # Setup verification & config generator
โ”œโ”€โ”€ test_mcp_server.py         # Comprehensive test suite
โ”œโ”€โ”€ test_camoufox_api.py       # Camoufox API validation
โ”œโ”€โ”€ pyproject.toml            # Project configuration & dependencies
โ”œโ”€โ”€ poetry.lock               # Locked dependency versions
โ””โ”€โ”€ README.md                 # This file

โš ๏ธ Limitations

  • Dual-context approach: HTML extraction requires re-navigation with JS disabled
  • Single page: Only one page can be active at a time
  • No authentication: Currently no support for login/auth workflows
  • Limited to HTTP/HTTPS: No support for other protocols

๐Ÿ”’ Security & Privacy

  • No data collection: Server runs locally, no data sent to external services
  • Controlled JavaScript execution: JS enabled only when needed for network monitoring
  • Privacy-focused browser: Camoufox is designed for privacy
  • Local execution: All scraping happens on your machine

๐Ÿ› ๏ธ Development & Extension

Modifying the Server

  1. Edit mcp_camoufox_scraper/server.py
  2. Test changes: poetry run python test_mcp_server.py
  3. Verify MCP compliance with your client

Adding New Tools

# In server.py, add to _register_tools():
@self.server.call_tool()
async def handle_call_tool(name: str, arguments: Dict[str, Any]):
    if name == "your_new_tool":
        return await self._your_new_tool(arguments)

Technical Details

  • Browser: Camoufox (privacy-focused Firefox-based)
  • Protocol: MCP (Model Context Protocol)
  • Language: Python 3.10+ with asyncio
  • Dependencies: MCP SDK, Camoufox browser automation

๐Ÿ“ License

MIT License - Feel free to use, modify, and distribute.

๐Ÿค Contributing

This is a proof-of-concept project. Feel free to:

  • Fork and extend for your use cases
  • Submit issues and improvements
  • Share your modifications with the community

---

Happy scraping! ๐Ÿ•ท๏ธ

Related MCP servers

Browse all โ†’