Telegram Community MCP

MCP server for hybrid search over Telegram community message history. Connect it to Claude Desktop and search your chats by meaning, not just keywords.

What it does

Hybrid search — combines full-text search (FTS5) with semantic vector search (sentence embeddings), merged via Reciprocal Rank Fusion
MCP integration — Claude Desktop calls search tools directly, reasons over results, and pulls conversation threads for context
Incremental sync — checkpoint-based ingestion, only fetches new messages after initial import

How it works

Claude Desktop  ←→  MCP Server (stdio)  ←→  SQLite (FTS5 + sqlite-vec)
                                         ←→  SentenceTransformer (embeddings)
                                         ←→  Telegram API (sync)

Search modes:

| Mode | How it works | Best for |
|------|-------------|----------|
| fts | SQLite FTS5 with unicode tokenization | Exact word/phrase lookup |
| semantic | KNN over 384-dim embeddings (paraphrase-multilingual-MiniLM-L12-v2) | Finding messages by meaning, cross-language |
| hybrid | Both FTS + semantic, merged with RRF (default) | General search, best of both worlds |
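To make the hybrid mode concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of message IDs. The function name `rrf_merge`, the sample IDs, and the constant `k=60` (the value commonly used in the RRF literature) are assumptions for illustration; the project's own implementation in src/search.py may differ.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, limit=10):
    """Fuse several ranked lists of message IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to a message's score, so items that
    rank well in several lists rise to the top. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, msg_id in enumerate(ranking, start=1):
            scores[msg_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [msg_id for msg_id, _ in fused[:limit]]


# Hypothetical result lists, best match first.
fts_hits = [101, 205, 309]
semantic_hits = [205, 412, 101]

merged = rrf_merge([fts_hits, semantic_hits])
print(merged)  # [205, 101, 412, 309]
```

Messages 205 and 101 appear in both lists, so they outrank 412 and 309, which each appear in only one. RRF uses ranks rather than raw scores, which is why it can combine FTS and vector results without normalizing their score scales.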

The embedding model is multilingual (50+ languages, ~120 MB) and runs on CPU. A query in Russian will find answers written in English and vice versa.

Performance

Tested on a mini PC (Intel N100, 16 GB RAM):

| Messages | DB size | FTS speed | Semantic speed | RAM usage |
|----------|---------|-----------|----------------|-----------|
| 100K | ~200 MB | < 50 ms | < 500 ms | ~800 MB |
| 500K | ~1 GB | < 50 ms | ~1 sec | ~1.2 GB |
| 1M | ~2 GB | < 50 ms | 2–5 sec | ~2 GB |

Semantic search runs in two phases: a coarse binary (Hamming) KNN over a bit[384] index, about 32x smaller than the fp32 vectors, followed by an exact fp32 rerank of the top candidates. The small binary index stays cache-resident, which keeps cold first-query latency low: on 1.5M vectors, a cold semantic query takes ~2 s versus ~12 s for a full fp32 scan, and a warm hybrid query takes ~0.9 s. FTS5 scales to millions of messages without issues. The binary index is built from existing vectors, with no re-embedding, via python scripts/ingest.py --build-binary.
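The 32x figure follows from the storage sizes: 384 fp32 values take 1,536 bytes, while 384 bits take 48 bytes. The two-phase idea itself can be sketched in pure Python as below. This is an illustration only, assuming sign-based binarization and cosine similarity for the rerank; the helper names, the `oversample` factor, and the random test data are hypothetical, and the real queries run inside SQLite via sqlite-vec rather than in Python loops.

```python
import math
import random


def binarize(vec):
    """Pack the sign bits of a float vector into one Python int (1 bit per dim)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def two_phase_search(query, vectors, codes, k=5, oversample=8):
    """Coarse Hamming KNN over binary codes, then exact rerank of the shortlist."""
    q_code = binarize(query)
    # Phase 1: cheap scan over the small binary index, keep k * oversample candidates.
    shortlist = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    shortlist = shortlist[: k * oversample]
    # Phase 2: exact similarity on full-precision vectors, shortlist only.
    shortlist.sort(key=lambda i: cosine(query, vectors[i]), reverse=True)
    return shortlist[:k]


# Synthetic data standing in for message embeddings.
random.seed(0)
DIM = 384
vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(500)]
codes = [binarize(v) for v in vectors]

# A query that is a slightly noisy copy of vector 42.
query = [x + random.gauss(0, 0.1) for x in vectors[42]]
top = two_phase_search(query, vectors, codes)
print(top[0])  # index of the best match
```

The `oversample` factor trades speed for recall: a larger shortlist makes it less likely that the coarse phase drops a true neighbor before the exact rerank sees it.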

Initial ingestion of 120K messages takes ~90 minutes on CPU (embedding generation). Incremental syncs are near-instant.
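For planning a larger import, the figure above works out to roughly 22 messages per second. The sketch below extrapolates linearly from that single data point, which is an assumption: real throughput depends on message length, hardware, and Telegram rate limits. The function name is hypothetical.

```python
def estimate_ingest_minutes(message_count, ref_messages=120_000, ref_minutes=90):
    """Linearly extrapolate ingestion time from the reference run (an assumption)."""
    messages_per_minute = ref_messages / ref_minutes
    return message_count / messages_per_minute


rate_per_second = 120_000 / (90 * 60)
print(f"{rate_per_second:.1f} messages/sec")                 # 22.2 messages/sec
print(f"{estimate_ingest_minutes(500_000):.0f} min for 500K")  # 375 min for 500K
```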

Quick start

Prerequisites

Python 3.11+
uv package manager

1. Install

git clone https://github.com/nullnumber1/Telegram-Community-MCP.git
cd Telegram-Community-MCP
uv sync

2. Get Telegram API credentials

Go to my.telegram.org → API development tools → Create application.

Troubleshooting: my.telegram.org often returns a generic ERROR when creating an app in a regular browser. This is a known issue. Try using a VPN (different regions), an antidetect browser, or a mobile browser. It may take several attempts.

Save your api_id and api_hash.

3. Configure

cp config.env.example config.env

Edit config.env:

```env
TELEGRAM_API_ID=your_api_id
TELEGRAM_API_HASH=your_api_hash
CHAT_IDS=-1001234567890,-1009876543210
```

To find chat IDs, run make auth (step 4) first, then:

```bash
make chats
```
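CHAT_IDS is a comma-separated list of signed integers. As a rough sketch of how such a file could be read, the snippet below parses KEY=VALUE lines and converts the IDs. The helpers `parse_env` and `parse_chat_ids` are hypothetical, and the project may load its configuration differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def parse_chat_ids(raw):
    """Turn '-100123, -100456' into a list of ints, ignoring empty entries."""
    return [int(part) for part in raw.split(",") if part.strip()]


# Sample content mirroring config.env.example (placeholder values).
sample = """\
# Telegram credentials
TELEGRAM_API_ID=12345
TELEGRAM_API_HASH=0123456789abcdef
CHAT_IDS=-1001234567890, -1009876543210
"""

config = parse_env(sample)
chat_ids = parse_chat_ids(config["CHAT_IDS"])
print(chat_ids)  # [-1001234567890, -1009876543210]
```

Converting the IDs to `int` early surfaces typos, such as a stray character in an ID, as a `ValueError` before any network call is made.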

4. Authorize

make auth

Scan the QR code with Telegram (Settings → Devices → Link Desktop Device). Session is saved locally — you only need to do this once.

5. Ingest messages

make ingest

This fetches the full history of configured chats and builds the search index. Progress is printed to stdout. Safe to interrupt — resumes from the last checkpoint.
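The resume behavior can be pictured as a per-chat checkpoint that advances in the same transaction as the stored messages. The sketch below is illustrative only: the table names, column names, and the `fetch_after` stand-in for the Telegram client are assumptions, not the project's actual schema.

```python
import sqlite3


def get_checkpoint(conn, chat_id):
    """Return the newest stored message ID for a chat, or 0 if never synced."""
    row = conn.execute(
        "SELECT last_message_id FROM sync_checkpoint WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return row[0] if row else 0


def ingest_batch(conn, chat_id, messages):
    """Store one batch and advance the checkpoint atomically."""
    if not messages:
        return
    newest = max(get_checkpoint(conn, chat_id), max(m_id for m_id, _ in messages))
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR IGNORE INTO messages (chat_id, message_id, text) VALUES (?, ?, ?)",
            [(chat_id, m_id, text) for m_id, text in messages],
        )
        conn.execute(
            "INSERT OR REPLACE INTO sync_checkpoint (chat_id, last_message_id) VALUES (?, ?)",
            (chat_id, newest),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (chat_id INTEGER, message_id INTEGER, text TEXT, "
    "PRIMARY KEY (chat_id, message_id))"
)
conn.execute(
    "CREATE TABLE sync_checkpoint (chat_id INTEGER PRIMARY KEY, last_message_id INTEGER NOT NULL)"
)

CHAT_ID = -1001234567890
history = [(1, "first"), (2, "second"), (3, "third")]  # stand-in for remote history


def fetch_after(min_id):
    """Hypothetical stand-in for a Telegram fetch of messages newer than min_id."""
    return [m for m in history if m[0] > min_id]


# First run is interrupted after two messages.
ingest_batch(conn, CHAT_ID, fetch_after(get_checkpoint(conn, CHAT_ID))[:2])
checkpoint_after_interrupt = get_checkpoint(conn, CHAT_ID)

# Second run picks up only what is missing.
resumed = fetch_after(get_checkpoint(conn, CHAT_ID))
ingest_batch(conn, CHAT_ID, resumed)
total = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

Because the checkpoint and the messages commit together, an interruption mid-batch leaves the checkpoint at the last fully stored batch, and the next run re-fetches only from there.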

6. Connect to Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "tg-community-search": {
      "command": "uv",
      "args": ["run", "--project", "/absolute/path/to/Telegram-Community-MCP", "python", "server.py"]
    }
  }
}

Restart Claude Desktop. The search tools should appear in the tools menu.
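If you would rather script the config change than edit the JSON by hand, a sketch like the following merges the server entry while leaving other configured servers untouched. The helper `add_mcp_server` is hypothetical, and the demo writes to a temporary file rather than your real config.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, project_dir):
    """Insert or update one entry under mcpServers, preserving everything else."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {
        "command": "uv",
        "args": ["run", "--project", str(project_dir), "python", "server.py"],
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "node"}}}))
    updated = add_mcp_server(
        demo_path, "tg-community-search", "/absolute/path/to/Telegram-Community-MCP"
    )
    written = json.loads(demo_path.read_text())
```

Back up the real config file before pointing a script like this at it.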

MCP tools

| Tool | Description | Key parameters |
|------|-------------|----------------|
| search | Search messages across all indexed chats | query, mode (fts/semantic/hybrid), limit, chat_id, date_from, date_to |
| get_context | Get surrounding thread: messages before/after + replies | message_id, window |
| sync | Fetch new messages from Telegram | chat_id (optional; all chats if omitted) |
| list_chats | Show indexed chats with message counts | none |
| get_stats | Index statistics: totals, DB size, per-chat breakdown | none |
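Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for the search tool. The parameter names come from the table above, but the value types and the example query are assumptions; check the tool schema the server advertises before relying on them.

```python
import json


def build_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments; types are assumed, not taken from the server schema.
request = build_tool_call(
    1,
    "search",
    {"query": "how to configure webhooks", "mode": "hybrid", "limit": 5},
)
print(json.dumps(request, indent=2))
```

In normal use Claude Desktop builds these requests for you; writing one by hand is mainly useful when testing with the MCP inspector (make dev).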

Project structure

├── server.py              # MCP server entry point
├── src/
│   ├── db.py              # SQLite: schema, CRUD, FTS5, sqlite-vec queries
│   ├── embedder.py        # SentenceTransformer wrapper (lazy-loading)
│   ├── search.py          # Hybrid search: FTS + KNN + RRF fusion
│   └── telegram.py        # Telethon client wrapper
├── scripts/
│   ├── auth.py            # One-time Telegram authorization (QR code)
│   ├── ingest.py          # Full import / incremental import
│   ├── list_chats.py      # List all account dialogs
│   └── monitor.py         # Monitor ingestion progress
├── tests/
│   ├── test_db.py         # Database operation tests
│   ├── test_embedder.py   # Embedder tests
│   └── test_search.py     # Search and RRF fusion tests
├── config.env.example     # Configuration template
├── pyproject.toml         # Dependencies and tool config
├── Makefile               # Dev and deployment shortcuts
└── tg-community-search.service  # systemd unit (for server deployment)

Deployment (optional)

For running on a remote server (e.g., a mini PC):

1. Edit tg-community-search.service: replace YOUR_USER with your username.
2. Deploy:

   make deploy REMOTE_HOST=192.168.1.42 REMOTE_USER=myuser REMOTE_PASS=mypass

3. Set up hourly auto-sync via cron on the remote:

   crontab -e
   # Add:
   0 * * * * cd /home/myuser/tg-community-search && ~/.local/bin/uv run python scripts/ingest.py >> logs/cron-sync.log 2>&1

Development

make test     # Run tests
make lint     # Lint and format
make dev      # MCP inspector (browser UI for testing tools)

License

MIT
