WebScrape MCP Server
English · Español
---
English
An MCP server that lets AI agents search the web and extract clean Markdown content: no ads, no clutter, just the text your LLM needs.
Tools
| Tool | Description |
|------|-------------|
| `webscrape_search` | Search the web (DuckDuckGo) and scrape results into Markdown |
| `webscrape_fetch_url` | Fetch a single URL and return clean Markdown. Supports `use_readability` and auto-detects PDFs |
| `webscrape_batch_fetch` | Fetch up to 5 URLs in parallel. Supports PDF auto-detection |
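Since `webscrape_batch_fetch` takes at most 5 URLs per call, a client holding a longer list has to split it before calling the tool. The following is a minimal client-side sketch of that step. `chunk_urls` and `MAX_BATCH` are hypothetical names, not part of this server, and the exact argument shape the tool expects is not shown here.

```python
from typing import Iterator, List, Sequence

# Limit taken from the tools table; the constant name is illustrative.
MAX_BATCH = 5


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(12)]
    for batch in chunk_urls(urls):
        # Each batch would become the URL list of one webscrape_batch_fetch call.
        print(len(batch), batch[0])
```

Each yielded list can then be passed to one batch call, keeping every request within the documented limit.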
Features
- PDF support: URLs ending in `.pdf` or served with an `application/pdf` content-type are auto-detected, and text is extracted page by page
- Readability mode: Pass `use_readability=True` to `webscrape_fetch_url` for cleaner article extraction using Mozilla Readability (removes nav, sidebars, ads, comments)
- DuckDuckGo search: No API key required, just a search query
- Built-in cache: 200-entry cache with automatic eviction for repeated URLs
- Batch fetching: Up to 5 URLs in parallel
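Two of the features above, PDF auto-detection and the 200-entry cache, can be illustrated with a short sketch. This is not the server's actual code. `looks_like_pdf` and `BoundedCache` are hypothetical names, the check against the URL path (ignoring any query string) is an assumption, and least-recently-used eviction is assumed because the eviction policy is not stated.

```python
from collections import OrderedDict
from typing import Optional
from urllib.parse import urlparse


def looks_like_pdf(url: str, content_type: Optional[str] = None) -> bool:
    """Return True if the URL path ends in .pdf or the content type is application/pdf."""
    if urlparse(url).path.lower().endswith(".pdf"):
        return True
    if content_type is None:
        return False
    # Drop parameters such as "; charset=binary" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


class BoundedCache:
    """Keep at most `max_entries` results, evicting the least recently used one.

    LRU is an assumed policy; the README only says eviction is automatic.
    """

    def __init__(self, max_entries: int = 200) -> None:
        self.max_entries = max_entries
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, url: str) -> Optional[str]:
        if url not in self._items:
            return None
        self._items.move_to_end(url)  # mark as most recently used
        return self._items[url]

    def put(self, url: str, markdown: str) -> None:
        self._items[url] = markdown
        self._items.move_to_end(url)
        if len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # drop the least recently used entry

    def __len__(self) -> int:
        return len(self._items)
```

A fetch routine would call `get` before requesting a URL and `put` after converting the response, so repeated URLs skip the network round trip.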
How to use
Option 1 — MCPize (recommended)
- Go to https://mcpize.com/marketplace
- Search for "Web Scrape" and click "Start Free"
- You'll get an API key
- Add the server to your AI client's configuration:
{
  "mcpServers": {
    "webscrape": {
      "url": "https://webscrape.mcpize.run",
      "headers": {
        "Authorization": "Bearer your-api-key"
      }
    }
  }
}
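To avoid pasting the API key into a file by hand, the same configuration can be generated from an environment variable. This is a minimal sketch: `build_client_config` and the `WEBSCRAPE_API_KEY` variable name are hypothetical, and where the resulting JSON has to be saved depends on your AI client.

```python
import json
import os


def build_client_config(api_key: str) -> dict:
    """Return the MCPize client configuration shown above with the key filled in."""
    return {
        "mcpServers": {
            "webscrape": {
                "url": "https://webscrape.mcpize.run",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # WEBSCRAPE_API_KEY is an illustrative name; fall back to the placeholder.
    key = os.environ.get("WEBSCRAPE_API_KEY", "your-api-key")
    print(json.dumps(build_client_config(key), indent=2))
```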
Option 2 — Render (dev)
{
  "mcpServers": {
    "webscrape": {
      "url": "https://webscrape-mcp.onrender.com"
    }
  }
}
Option 3 — Local
git clone https://github.com/carrasquelalex1/webscrape-mcp.git
cd webscrape-mcp
pip install -r requirements.txt
python webscrape_mcp.py
Official Registry
io.github.carrasquelalex1/webscrape-mcp
Dependencies
mcp, httpx, beautifulsoup4, markdownify, pydantic, ddgs, readability-lxml, PyMuPDF
License
MIT
---
Español
Servidor MCP que permite a agentes de IA buscar en la web y extraer contenido limpio en Markdown — sin anuncios, sin navegación, solo el texto que tu LLM necesita.
Tools
| Tool | Descripción |
|------|-------------|
| `webscrape_search` | Busca en la web (DuckDuckGo) y extrae los resultados a Markdown |
| `webscrape_fetch_url` | Obtiene una URL y la convierte a Markdown limpio. Soporta `use_readability` y detecta PDFs automáticamente |
| `webscrape_batch_fetch` | Obtiene hasta 5 URLs en paralelo. Soporta detección automática de PDFs |
Características
- Soporte PDF: URLs que terminan en `.pdf` o con content-type `application/pdf` se detectan automáticamente y se extrae el texto página por página
- Modo Readability: Usa `use_readability=True` en `webscrape_fetch_url` para extraer artículos de forma más limpia (elimina navegación, barras laterales, anuncios, comentarios)
- Búsqueda DuckDuckGo: Sin necesidad de API key
- Caché integrada: 200 entradas con evicción automática para URLs repetidas
- Batch fetching: Hasta 5 URLs en paralelo
Cómo usarlo
Opción 1 — MCPize (recomendada)
- Ve a https://mcpize.com/marketplace
- Busca "Web Scrape" y haz clic en "Start Free"
- Obtendrás una API key
- Configura en tu cliente de IA:
{
  "mcpServers": {
    "webscrape": {
      "url": "https://webscrape.mcpize.run",
      "headers": {
        "Authorization": "Bearer tu-api-key"
      }
    }
  }
}
Opción 2 — Render (desarrollo)
{
  "mcpServers": {
    "webscrape": {
      "url": "https://webscrape-mcp.onrender.com"
    }
  }
}
Opción 3 — Local
git clone https://github.com/carrasquelalex1/webscrape-mcp.git
cd webscrape-mcp
pip install -r requirements.txt
python webscrape_mcp.py
Registro Oficial
io.github.carrasquelalex1/webscrape-mcp
Dependencias
mcp, httpx, beautifulsoup4, markdownify, pydantic, ddgs, readability-lxml, PyMuPDF, playwright
Licencia
MIT






