opencode-docs
A powerful MCP (Model Context Protocol) server for scraping, storing, and searching documentation locally. Built for use with OpenCode, Claude Desktop, and other MCP-compatible AI coding assistants.
Features
- Smart Scraping: Extracts main content from documentation pages with intelligent noise filtering
- Playwright Support: Optional browser-based scraping for JavaScript-rendered sites (React, Vue, Next.js, etc.)
- Recursive Crawling: Automatically discovers and follows internal links to build complete documentation sets
- Full-Text Search: Fast search across all stored documentation using FlexSearch
- OpenAPI/Swagger Import: Import API documentation from OpenAPI specs or Swagger UI pages
- Metadata Extraction: Captures descriptions, keywords, authors, and last-modified dates
- Update Detection: Re-scrape existing docs and see what changed
Table of Contents
- Installation
- Quick Start
- Configuration
- OpenCode Setup
- Claude Desktop Setup
- Available Tools
- Usage Examples
- Migrating to Another Device
- Docker
- Troubleshooting
- Development
- Changelog
Installation
Prerequisites
- Node.js 18+ (check with node --version)
- npm or pnpm
- Git (for cloning)
Step 1: Clone the Repository
git clone https://github.com/salmenkhelifi1/opencode-docs.git
cd opencode-docs
Step 2: Install Dependencies
npm install
Step 3: Build the Project
npm run build
Step 4 (Optional): Install Playwright for JS-rendered Sites
If you need to scrape JavaScript-heavy sites (React, Vue, Next.js docs, etc.):
# Install Playwright
npm install playwright
# Install Chromium browser
npx playwright install chromium
Verify Installation
# Test that the server starts
node dist/index.js
# You should see:
# [opencode-docs] Docs directory: /home/username/.config/opencode/docs
# [opencode-docs] MCP server started (v1.1.0)
# Press Ctrl+C to stop
Quick Start
After installation, add some documentation:
# Start your AI assistant (OpenCode, Claude Desktop, etc.)
# Then use these commands:
# Add Next.js documentation (recursive crawl)
docs_add_url url="https://nextjs.org/docs" recursive=true maxPages=30
# Add Express.js documentation
docs_add_url url="https://expressjs.com/en/starter/installing.html" recursive=true maxPages=30
# Search your docs
docs_search query="middleware"
# List all sources
docs_list
Configuration
OpenCode Setup
Step 1: Find Your Config File
The OpenCode config file is located at:
- Linux/macOS: ~/.config/opencode/opencode.json
- Windows: %APPDATA%\opencode\opencode.json
Step 2: Add the MCP Server
Add the docs MCP server to your config:
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "docs": {
      "type": "local",
      "command": ["node", "/full/path/to/opencode-docs/dist/index.js"],
      "enabled": true
    }
  }
}
Important: Replace /full/path/to/opencode-docs with the actual path where you cloned the repository.
Step 3: Restart OpenCode
Restart OpenCode to load the new MCP server. You should see the docs tools available.
Full OpenCode Config Example
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "docs": {
      "type": "local",
      "command": ["node", "/home/username/opencode-docs/dist/index.js"],
      "enabled": true
    }
  }
}
Claude Desktop Setup
Step 1: Find Your Config File
The Claude Desktop config file is located at:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Step 2: Add the MCP Server
{
  "mcpServers": {
    "docs": {
      "command": "node",
      "args": ["/full/path/to/opencode-docs/dist/index.js"]
    }
  }
}
Step 3: Restart Claude Desktop
Quit and restart Claude Desktop. The docs tools should now be available.
VS Code with Continue Extension
Add to your Continue config (.continue/config.json):
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "node",
          "args": ["/full/path/to/opencode-docs/dist/index.js"]
        }
      }
    ]
  }
}
Available Tools
docs_list
List all available documentation sources stored locally.
docs_list
Output: Shows all sources with their IDs, page counts, and descriptions.
---
docs_search
Search across all local documentation.
docs_search query="authentication"
docs_search query="routing" sourceId="expressjs"
docs_search query="hooks" limit=10
Parameters:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | Yes | Search query |
| sourceId | string | No | Limit search to specific source |
| limit | number | No | Max results (default: 10) |
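One way to read the sourceId and limit parameters is as a filter and a cap applied to ranked matches. The sketch below is a naive in-memory stand-in for illustration only; it is not the server's FlexSearch index, and IndexedPage, searchPages, and the scoring rule are hypothetical.

```typescript
// Hypothetical shape of an indexed page; the real index is built with FlexSearch.
interface IndexedPage {
  sourceId: string;
  pagePath: string;
  text: string;
}

// Naive search: count query-term occurrences, filter by source, cap the results.
function searchPages(
  pages: IndexedPage[],
  query: string,
  sourceId?: string,
  limit: number = 10, // mirrors the documented default
): IndexedPage[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return pages
    .filter((p) => sourceId === undefined || p.sourceId === sourceId)
    .map((p) => {
      const haystack = p.text.toLowerCase();
      // Score = total number of occurrences of all query terms.
      const score = terms.reduce((sum, t) => sum + (haystack.split(t).length - 1), 0);
      return { page: p, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((hit) => hit.page);
}

const samplePages: IndexedPage[] = [
  { sourceId: "expressjs", pagePath: "routing.md", text: "Routing and middleware in Express" },
  { sourceId: "nextjs", pagePath: "middleware.md", text: "Middleware runs before middleware-aware routes" },
  { sourceId: "nextjs", pagePath: "fonts.md", text: "Optimizing fonts" },
];

// Equivalent of: docs_search query="middleware"
console.log(searchPages(samplePages, "middleware").map((p) => p.pagePath));
// Equivalent of: docs_search query="middleware" sourceId="expressjs"
console.log(searchPages(samplePages, "middleware", "expressjs").map((p) => p.pagePath));
```

In this toy data the unscoped query returns the Next.js page first because it mentions the term twice; the scoped query returns only the Express page.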
---
docs_read
Read a specific documentation page or list all pages in a source.
# List all pages in a source
docs_read sourceId="nextjs"
# Read a specific page
docs_read sourceId="nextjs" pagePath="docs-app-getting-started.md"
Parameters:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| sourceId | string | Yes | Source ID |
| pagePath | string | No | Page path (omit to list all pages) |
---
docs_add_url
Add documentation from a URL with optional recursive crawling.
# Single page
docs_add_url url="https://nextjs.org/docs"
# Recursive crawl (follows links)
docs_add_url url="https://nextjs.org/docs" recursive=true maxPages=50 maxDepth=3
# For JavaScript-rendered sites
docs_add_url url="https://react.dev/learn" usePlaywright=true recursive=true
# With URL filter pattern
docs_add_url url="https://docs.example.com" recursive=true urlPattern="/api/"
Parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| url | string | required | The URL to scrape |
| sourceId | string | auto | Custom source ID |
| name | string | auto | Display name for the source |
| description | string | auto | Description for the source |
| usePlaywright | boolean | false | Use Playwright for JS-rendered pages |
| recursive | boolean | false | Recursively crawl linked pages |
| maxPages | number | 20 | Max pages to crawl (recursive mode) |
| maxDepth | number | 2 | Max link depth (recursive mode) |
| urlPattern | string | - | Regex pattern to filter URLs |
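To make the interaction of maxPages, maxDepth, and urlPattern concrete, here is a minimal breadth-first sketch. It is not the project's crawler: planCrawl, CrawlOptions, and the linksOf callback are hypothetical, and the sketch assumes that the start page counts as depth 0, that only same-host links are followed, and that urlPattern is tested against the resolved URL.

```typescript
// Hypothetical options mirroring the documented docs_add_url parameters.
interface CrawlOptions {
  maxPages: number;
  maxDepth: number;
  urlPattern?: string;
}

// Breadth-first page selection. `linksOf` stands in for fetching a page and
// extracting its links, so the sketch runs without network access.
function planCrawl(
  startUrl: string,
  linksOf: (url: string) => string[],
  opts: CrawlOptions,
): string[] {
  const start = new URL(startUrl);
  const pattern = opts.urlPattern ? new RegExp(opts.urlPattern) : undefined;
  const visited = new Set<string>([start.href]);
  const selected: string[] = [];
  let frontier: string[] = [start.href];

  for (let depth = 0; depth <= opts.maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const url of frontier) {
      if (selected.length >= opts.maxPages) return selected;
      selected.push(url);
      for (const link of linksOf(url)) {
        const resolved = new URL(link, url); // resolve relative links
        resolved.hash = "";                  // ignore #fragments
        const internal = resolved.host === start.host;
        const matches = pattern === undefined || pattern.test(resolved.href);
        if (internal && matches && !visited.has(resolved.href)) {
          visited.add(resolved.href);
          next.push(resolved.href);
        }
      }
    }
    frontier = next;
  }
  return selected;
}

// A tiny in-memory site used instead of real HTTP requests.
const siteLinks = new Map<string, string[]>([
  ["https://docs.example.com/", ["/api/users", "/blog/news", "https://other.example.org/api/x"]],
  ["https://docs.example.com/api/users", ["/api/users/create", "/api/users#top"]],
  ["https://docs.example.com/api/users/create", ["/api/deep"]],
]);
const fakeLinks = (url: string): string[] => siteLinks.get(url) ?? [];

const crawlPlan = planCrawl("https://docs.example.com", fakeLinks, {
  maxPages: 20,
  maxDepth: 2,
  urlPattern: "/api/",
});
console.log(crawlPlan);
```

With these inputs the plan contains the start page, /api/users, and /api/users/create; /blog/news fails the pattern, the other.example.org link is external, and /api/deep sits beyond maxDepth.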
---
docs_add_sitemap
Crawl an entire documentation site from its sitemap.xml.
docs_add_sitemap sitemapUrl="https://docs.example.com/sitemap.xml"
docs_add_sitemap sitemapUrl="https://docs.example.com/sitemap.xml" maxPages=100 urlPattern="/docs/"
Parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| sitemapUrl | string | required | Sitemap URL |
| sourceId | string | auto | Custom source ID |
| name | string | auto | Display name |
| maxPages | number | 50 | Max pages to crawl |
| urlPattern | string | - | Regex to filter URLs |
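As a rough illustration of what sitemap crawling involves, the sketch below pulls the loc entries out of a sitemap document, keeps those matching urlPattern, and stops at maxPages. The function name selectSitemapUrls is hypothetical, and a regular expression is used only to keep the example dependency-free; the actual tool may well use an XML parser.

```typescript
// Extract <loc> entries from sitemap XML, keep those matching urlPattern,
// and stop once maxPages URLs have been collected.
function selectSitemapUrls(
  xml: string,
  maxPages: number = 50, // mirrors the documented default
  urlPattern?: string,
): string[] {
  const pattern = urlPattern ? new RegExp(urlPattern) : undefined;
  const urls: string[] = [];
  const locTag = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while (urls.length < maxPages && (match = locTag.exec(xml)) !== null) {
    // Sitemaps escape "&" as "&amp;"; undo that before using the URL.
    const url = match[1].replace(/&amp;/g, "&");
    if (pattern === undefined || pattern.test(url)) urls.push(url);
  }
  return urls;
}

const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://docs.example.com/docs/intro</loc></url>
  <url><loc>https://docs.example.com/blog/launch</loc></url>
  <url><loc>https://docs.example.com/docs/api?lang=ts&amp;v=2</loc></url>
</urlset>`;

// Equivalent of: docs_add_sitemap sitemapUrl="..." maxPages=100 urlPattern="/docs/"
console.log(selectSitemapUrls(sampleSitemap, 100, "/docs/"));
```

Here the blog entry is dropped by the pattern and the two /docs/ URLs are kept, with the escaped ampersand restored.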
---
docs_add_openapi
Import an OpenAPI/Swagger specification from a direct JSON URL.
docs_add_openapi url="https://api.example.com/openapi.json" sourceId="my-api"
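
For a sense of what importing a spec as documentation can mean, the following sketch turns the paths object of an OpenAPI document into a short Markdown listing. This is an assumption about the general approach, not the tool's actual output format; OpenApiSpec, OpenApiOperation, and openApiToMarkdown are hypothetical names, and only the fields shown are read.

```typescript
// Minimal slice of an OpenAPI document: only the fields this sketch reads.
interface OpenApiOperation {
  summary?: string;
}
interface OpenApiSpec {
  info: { title: string; version: string };
  paths: Record<string, Record<string, OpenApiOperation | undefined>>;
}

const HTTP_METHODS = ["get", "put", "post", "delete", "patch", "options", "head"];

// Render one Markdown bullet per operation, grouped in path order.
function openApiToMarkdown(spec: OpenApiSpec): string {
  const lines: string[] = [`# ${spec.info.title} (${spec.info.version})`, ""];
  for (const [route, item] of Object.entries(spec.paths)) {
    for (const method of HTTP_METHODS) {
      const op = item[method];
      if (op === undefined) continue; // this path does not define the method
      lines.push(`- ${method.toUpperCase()} ${route}: ${op.summary ?? "(no summary)"}`);
    }
  }
  return lines.join("\n");
}

const sampleSpec: OpenApiSpec = {
  info: { title: "Pet Store", version: "1.0.0" },
  paths: {
    "/pets": { get: { summary: "List pets" }, post: { summary: "Create a pet" } },
    "/pets/{id}": { delete: {} },
  },
};

console.log(openApiToMarkdown(sampleSpec));
```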
---
docs_add_swagger
Import documentation from a Swagger UI page (auto-detects spec URL).
docs_add_swagger url="https://api.example.com/swagger"
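
Spec auto-detection could work roughly as follows. Swagger UI pages commonly pass the spec location to their initializer as a url option, so one plausible heuristic is to search the page HTML for that option and resolve it against the page URL. This is a guess at the technique, not the tool's confirmed logic, and detectSpecUrl is a hypothetical name.

```typescript
// Look for a `url: "..."` option in the page source and resolve it against
// the page URL. Returns undefined when nothing that looks like one is found.
function detectSpecUrl(pageUrl: string, html: string): string | undefined {
  const found = /\burl\s*:\s*["']([^"']+)["']/.exec(html);
  return found ? new URL(found[1], pageUrl).href : undefined;
}

const swaggerPage = `
<div id="swagger-ui"></div>
<script>
  window.ui = SwaggerUIBundle({ url: "./openapi.json", dom_id: "#swagger-ui" });
</script>`;

console.log(detectSpecUrl("https://api.example.com/swagger/index.html", swaggerPage));
```

Pages that load their configuration from a separate script or use the urls (plural) option would need additional handling, which this sketch leaves out.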
---
docs_update
Update/refresh existing documentation by re-scraping.
# Update entire source
docs_update sourceId="nextjs"
# Update single page
docs_update sourceId="nextjs" pagePath="docs.md"
# With Playwright
docs_update sourceId="react" usePlaywright=true
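
Conceptually, seeing what changed after a re-scrape amounts to comparing the stored pages with the freshly scraped ones. The sketch below shows that comparison on in-memory snapshots; diffSnapshots, PageSnapshot, and UpdateReport are hypothetical names, and the real tool may compare hashes or metadata rather than full content.

```typescript
// pagePath -> Markdown content
type PageSnapshot = Map<string, string>;

interface UpdateReport {
  added: string[];
  changed: string[];
  removed: string[];
  unchanged: string[];
}

// Classify every page by comparing the old snapshot with the new one.
function diffSnapshots(before: PageSnapshot, after: PageSnapshot): UpdateReport {
  const report: UpdateReport = { added: [], changed: [], removed: [], unchanged: [] };
  for (const [pagePath, content] of after) {
    const previous = before.get(pagePath);
    if (previous === undefined) report.added.push(pagePath);
    else if (previous !== content) report.changed.push(pagePath);
    else report.unchanged.push(pagePath);
  }
  for (const pagePath of before.keys()) {
    if (!after.has(pagePath)) report.removed.push(pagePath);
  }
  return report;
}

const storedPages: PageSnapshot = new Map([
  ["docs.md", "# Docs\nWelcome."],
  ["docs-routing.md", "# Routing\nOld text."],
  ["docs-legacy.md", "# Legacy"],
]);
const rescrapedPages: PageSnapshot = new Map([
  ["docs.md", "# Docs\nWelcome."],
  ["docs-routing.md", "# Routing\nNew text."],
  ["docs-caching.md", "# Caching"],
]);

console.log(diffSnapshots(storedPages, rescrapedPages));
```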
---
docs_preview
Preview scraped content without saving.
docs_preview url="https://example.com/docs"
docs_preview url="https://example.com/docs" showLinks=true usePlaywright=true
---
docs_auth
Manage authentication credentials for API documentation.
# Add bearer token
docs_auth action="add" host="api.example.com" type="bearer" token="your-token"
# Add basic auth
docs_auth action="add" host="api.example.com" type="basic" username="user" password="pass"
# List all credentials
docs_auth action="list"
# Remove credentials
docs_auth action="remove" host="api.example.com"
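
The two credential types correspond to the standard HTTP Authorization schemes. The sketch below shows how a saved credential could be turned into a request header for a matching host; the StoredCredential shape and both function names are hypothetical and do not describe the actual layout of credentials.json.

```typescript
// Hypothetical credential shapes matching the documented docs_auth types.
type StoredCredential =
  | { type: "bearer"; token: string }
  | { type: "basic"; username: string; password: string };

// Build the Authorization header value for one credential.
function authorizationHeader(cred: StoredCredential): string {
  if (cred.type === "bearer") return `Bearer ${cred.token}`;
  // Basic auth is base64("username:password"). btoa only accepts Latin-1
  // input, which is enough for this sketch.
  return `Basic ${btoa(`${cred.username}:${cred.password}`)}`;
}

// Pick the credential saved for the URL's host, if any.
function headersFor(url: string, store: Map<string, StoredCredential>): Record<string, string> {
  const cred = store.get(new URL(url).host);
  return cred ? { Authorization: authorizationHeader(cred) } : {};
}

const credentialStore = new Map<string, StoredCredential>([
  ["api.example.com", { type: "bearer", token: "your-token" }],
  ["legacy.example.com", { type: "basic", username: "user", password: "pass" }],
]);

console.log(headersFor("https://api.example.com/openapi.json", credentialStore));
console.log(headersFor("https://legacy.example.com/openapi.json", credentialStore));
console.log(headersFor("https://public.example.org/openapi.json", credentialStore));
```

Keying credentials by host means a token saved for api.example.com is never sent to another domain, which is the main reason to match on the host rather than on a URL prefix.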
---
docs_remove
Remove a documentation source and all its pages.
docs_remove sourceId="old-docs" confirm=true
Usage Examples
Add Popular Documentation Sites
# Next.js (React framework)
docs_add_url url="https://nextjs.org/docs" recursive=true maxPages=50 sourceId="nextjs" name="Next.js"
# Express.js (Node.js web framework)
docs_add_url url="https://expressjs.com/en/starter/installing.html" recursive=true maxPages=30 sourceId="expressjs" name="Express.js"
# Node.js API Documentation
docs_add_url url="https://nodejs.org/docs/latest/api/" recursive=true maxPages=40 sourceId="nodejs" name="Node.js"
# n8n (Workflow Automation)
docs_add_url url="https://docs.n8n.io/" recursive=true maxPages=30 sourceId="n8n" name="n8n"
# React (needs Playwright for JS rendering)
docs_add_url url="https://react.dev/learn" usePlaywright=true recursive=true maxPages=30 sourceId="react" name="React"
# Vue.js
docs_add_url url="https://vuejs.org/guide/introduction.html" recursive=true maxPages=30 sourceId="vuejs" name="Vue.js"
# Tailwind CSS
docs_add_url url="https://tailwindcss.com/docs/installation" recursive=true maxPages=50 sourceId="tailwind" name="Tailwind CSS"
Search Examples
# Search all documentation
docs_search query="authentication"
# Search specific source
docs_search query="middleware" sourceId="expressjs"
# Search with limit
docs_search query="hooks" limit=5
# Search for error handling
docs_search query="error handling"
Import API Documentation
# From OpenAPI JSON
docs_add_openapi url="https://petstore.swagger.io/v2/swagger.json" sourceId="petstore"
# From Swagger UI page
docs_add_swagger url="https://api.example.com/swagger-ui"
# With authentication
docs_auth action="add" host="api.mycompany.com" type="bearer" token="my-api-key"
docs_add_openapi url="https://api.mycompany.com/openapi.json" sourceId="internal-api"
Migrating to Another Device
Option 1: Copy Documentation (Recommended)
Copy the entire docs directory to your new device:
# On old device - compress docs
cd ~/.config/opencode
tar -czvf docs-backup.tar.gz docs/
# Transfer docs-backup.tar.gz to new device
# On new device - extract docs
mkdir -p ~/.config/opencode
cd ~/.config/opencode
tar -xzvf docs-backup.tar.gz
Option 2: Re-scrape Documentation
On the new device, after installation:
# Re-add all your documentation sources
docs_add_url url="https://nextjs.org/docs" recursive=true maxPages=50
docs_add_url url="https://expressjs.com/en/starter/installing.html" recursive=true maxPages=30
# ... etc
Full Migration Checklist
- Clone the repository on the new device:
git clone https://github.com/salmenkhelifi1/opencode-docs.git
cd opencode-docs
npm install
npm run build
- Copy configuration (optional, for credentials):
# Copy credentials file if you have API auth saved
scp old-device:~/.config/opencode/docs/credentials.json ~/.config/opencode/docs/
- Copy documentation or re-scrape:
# Copy existing docs
scp -r old-device:~/.config/opencode/docs ~/.config/opencode/
# OR re-scrape (see examples above)
- Configure your AI assistant (OpenCode, Claude Desktop, etc.)
- Test:
docs_list
docs_search query="test"
Storage Location
Documentation is stored in ~/.config/opencode/docs/:
~/.config/opencode/docs/
├── manifest.json          # Index of all sources and pages
├── credentials.json       # Saved API credentials (if any)
├── nextjs/                # Source directory
│   ├── docs.md
│   ├── docs-app-getting-started.md
│   └── ...
├── expressjs/
│   └── ...
└── nodejs/
    └── ...
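
The page file names in the layout above look like URL paths with the slashes replaced by dashes. The sketch below reproduces that pattern; it is inferred only from the example names docs.md and docs-app-getting-started.md, so the real naming rules may differ, and pageFileName and the index.md fallback are hypothetical.

```typescript
// Derive a Markdown file name from a page URL by joining its path segments
// with dashes. Inferred from the example file names, not from the source code.
function pageFileName(pageUrl: string): string {
  const segments = new URL(pageUrl).pathname.split("/").filter((s) => s.length > 0);
  const slug = segments
    .join("-")
    .replace(/\.html?$/i, "")             // drop a trailing .html or .htm
    .replace(/[^A-Za-z0-9._-]/g, "-");    // keep the name file-system friendly
  return `${slug || "index"}.md`;         // assumed fallback for the site root
}

console.log(pageFileName("https://nextjs.org/docs"));                     // docs.md
console.log(pageFileName("https://nextjs.org/docs/app/getting-started")); // docs-app-getting-started.md
console.log(pageFileName("https://expressjs.com/en/starter/installing.html"));
```

When in doubt about a page path, listing the source with docs_read sourceId="..." shows the names that were actually stored.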
Supported Documentation Sites
The scraper includes optimized selectors for:
| Framework | Notes |
|-----------|-------|
| Docusaurus | React docs, many OSS projects |
| Nextra | Next.js docs |
| GitBook | Many startups use this |
| ReadTheDocs | Python projects |
| VuePress/VitePress | Vue.js ecosystem |
| MkDocs | Material for MkDocs |
| Generic HTML | Works with most sites |
For JavaScript-heavy sites, enable Playwright with usePlaywright=true.
Docker
Build the Image
docker build -t opencode-docs .
Run with Volume Mount
# -i keeps stdin open, which the server's stdio transport needs
docker run -i -v ~/.config/opencode/docs:/root/.config/opencode/docs opencode-docs
Docker Compose
version: '3.8'
services:
  opencode-docs:
    build: .
    volumes:
      - ~/.config/opencode/docs:/root/.config/opencode/docs
    stdin_open: true
    tty: true
Troubleshooting
Common Issues
"Playwright is not installed"
npm install playwright
npx playwright install chromium
"Failed to fetch URL: 403 Forbidden"
Some sites block scrapers. Try:
- Using Playwright: usePlaywright=true
- Adding a delay between requests (automatic in recursive mode)
"No content extracted"
The site might use JavaScript rendering. Try:
docs_add_url url="..." usePlaywright=true
"Command not found: docs_list"
The MCP server isn't configured or failed to load. Check that:
- The path in your config is correct
- The project is built (npm run build)
- Your AI assistant has been restarted
Docs directory not found
Create it manually:
mkdir -p ~/.config/opencode/docs
Debug Mode
Run the server directly to see logs:
node /path/to/opencode-docs/dist/index.js
Development
Run in Development Mode
npm run dev
Watch Mode
npm run watch
Clean Build
npm run clean && npm run build
Project Structure
opencode-docs/
├── src/
│   ├── index.ts              # MCP server entry point
│   ├── types.ts              # TypeScript types
│   ├── services/
│   │   ├── scraper.ts        # HTML to Markdown conversion
│   │   ├── crawler.ts        # Sitemap and recursive crawling
│   │   ├── storage.ts        # File system management
│   │   ├── search.ts         # FlexSearch integration
│   │   └── credentials.ts    # Auth credential management
│   └── tools/
│       ├── docs-add-url.ts
│       ├── docs-add-sitemap.ts
│       ├── docs-search.ts
│       └── ... (other tools)
├── dist/                     # Compiled JavaScript
├── package.json
├── tsconfig.json
└── README.md
License
MIT
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Changelog
v1.1.0
- Added Playwright support for JS-rendered pages
- Added recursive crawling with link discovery
- Added docs_update tool for refreshing documentation
- Added docs_preview tool for testing scrapes
- Enhanced content selectors for Docusaurus, Nextra, GitBook, etc.
- Smart content detection with text density scoring
- Improved noise filtering (removes nav, breadcrumbs, edit links)
- Metadata extraction (description, keywords, author, lastModified)
- Title deduplication
v1.0.0
- Initial release
- Basic scraping with cheerio
- Sitemap crawling
- OpenAPI/Swagger import
- Full-text search






