LLM Memory MCP Server

ranjanjyoti152/LLM-MCP

Summary

Enables AI assistants across platforms to share memory, recall facts, preferences, and conversation history, creating a unified cognitive system.

README.md

<div align="center">

🧠 LLM Memory MCP Server

Your AI assistants finally have a shared brain.

One memory. Every platform. Zero context lost.

Save a fact in Cursor → recall it in Claude → search it in VS Code → update it in Gemini → it's everywhere.

[Get Started](#-quick-start) · [Dashboard](#-web-dashboard) · [GitHub Stars](https://github.com/ranjanjyoti152/LLM-MCP/stargazers)

<br>

Python 3.12 · PostgreSQL 16 · Docker · MCP · 39 Tools · 9 Prompts · MIT License

</div>

---

<div align="center">

🔥 Why 2,000+ developers are switching to shared AI memory

</div>

| Without LLM Memory | With LLM Memory |
|:---:|:---:|
| 😤 "I already told Claude my tech stack..." | 🧠 Every AI knows your stack on first message |
| 😤 "Cursor doesn't know what I did in Copilot..." | 🧠 Full cross-platform context, always |
| 😤 "I keep repeating my preferences..." | 🧠 Preferences auto-detected and saved silently |
| 😤 "My AI forgot our entire debugging session..." | 🧠 Conversations preserved with searchable history |
| 😤 "I lost that useful code snippet..." | 🧠 Procedural memory stores every pattern |

---

⚡ What Makes This Different

<table> <tr> <td width="50%">

🏗️ 4-Tier Memory Architecture

Not just a key-value store. A cognitive memory system inspired by human memory:

  • Short-term: Working context (auto-expires)
  • Semantic: Facts, preferences, decisions (permanent)
  • Episodic: Conversation history (searchable)
  • Procedural: Code patterns & how-tos

</td> <td width="50%">

🔍 Hybrid AI Search

Every recall query searches all 4 tiers at once, ranked by:

Score = semantic_similarity × 0.30
      + text_relevance      × 0.20
      + recency             × 0.25
      + importance          × 0.25

Powered by pgvector HNSW + GIN full-text indexes.

</td> </tr> <tr> <td width="50%">

🤖 Auto-Injected Intelligence

When any AI connects, it automatically:

  1. Loads your working context on start
  2. Recalls relevant memories for your topic
  3. Silently detects & saves preferences
  4. Saves the conversation on end
  5. Extracts knowledge & consolidates memory

Zero manual prompting required.

</td> <td width="50%">

⚔️ Cross-Platform Conflict Resolution

When Cursor says "user prefers tabs" and Claude says "user prefers spaces":

  • 🔍 Auto-detection via vector similarity
  • 📋 Conflict queue with side-by-side comparison
  • 🎯 4 resolution strategies: keep existing, use new, merge, keep both
  • 📊 Version history for every knowledge change

</td> </tr> </table>
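The hybrid ranking formula above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual code: the `Candidate` class and the assumption that every signal is already normalized to the range 0 to 1 are hypothetical; only the four weights come from this README.

```python
from dataclasses import dataclass

# Weights copied from the ranking formula in this README.
WEIGHTS = {
    "semantic_similarity": 0.30,
    "text_relevance": 0.20,
    "recency": 0.25,
    "importance": 0.25,
}


@dataclass
class Candidate:
    """Hypothetical search hit; each signal is assumed to lie in [0, 1]."""
    label: str
    semantic_similarity: float
    text_relevance: float
    recency: float
    importance: float


def composite_score(candidate: Candidate) -> float:
    """Weighted sum of the four signals."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest composite score."""
    return sorted(candidates, key=composite_score, reverse=True)


if __name__ == "__main__":
    hits = [
        Candidate("old but important", 0.9, 0.5, 0.1, 0.9),
        Candidate("fresh and relevant", 0.8, 0.7, 0.9, 0.6),
    ]
    for hit in rank(hits):
        print(f"{composite_score(hit):.3f}  {hit.label}")
```

Because the weights sum to 1.0, a candidate that scores 1.0 on every signal gets a composite score of 1.0, which keeps results comparable across tiers.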

---

🚀 Quick Start

From zero to shared AI memory in a few commands (allow a couple of minutes for the first boot).

Prerequisites

  • Docker & Docker Compose
  • Any MCP-compatible AI platform

Option A: One-Command Setup (Recommended)

git clone https://github.com/ranjanjyoti152/LLM-MCP.git
cd LLM-MCP
./setup.sh

The setup script auto-detects Cursor, VS Code, Gemini CLI, Claude Desktop, and Windsurf, then generates their config files.

Option B: Manual

git clone https://github.com/ranjanjyoti152/LLM-MCP.git
cd LLM-MCP
docker compose up -d --build

Verify

docker compose ps
# llm-mcp-postgres    Up (healthy)   0.0.0.0:4569->5432
# llm-mcp-ollama      Up (healthy)   0.0.0.0:9050->9050
# llm-mcp-server      Up             0.0.0.0:4040->4040
# llm-mcp-dashboard   Up             0.0.0.0:4041->4041

First boot takes a couple of minutes. Ollama pulls the nomic-embed-text embedding model (~274MB) before it reports healthy, and the server + dashboard wait on that healthcheck. Watch it with docker compose logs -f ollama. (If Ollama is ever unreachable at request time, the server falls back to a local hash embedder so writes still succeed.)
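To make the "local hash embedder" fallback concrete, here is a minimal sketch of one common way to build such an embedder (feature hashing). It is an assumption about the general technique, not a copy of what embeddings.py does: the function name `hash_embed`, the whitespace tokenizer, and the use of SHA-256 are all hypothetical. The 768 dimension matches the OLLAMA_DIM default.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 768) -> list[float]:
    """Map text to a fixed-size unit vector using the hashing trick.

    Each token is hashed to a bucket index and a sign, so no model download
    or network call is needed. Quality is far below a real embedding model.
    """
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[index] += sign
    norm = math.sqrt(sum(value * value for value in vector))
    # Leave the all-zero vector untouched (empty input); otherwise L2-normalize.
    return [value / norm for value in vector] if norm else vector


if __name__ == "__main__":
    embedding = hash_embed("I prefer Python for backend")
    print(len(embedding), sum(1 for value in embedding if value != 0.0))
```

A fallback like this is deterministic, so the same text always maps to the same vector, but vectors from a hash embedder and from a model such as nomic-embed-text are not comparable with each other.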

Try It!

Ask your AI:

"Save a knowledge entry: I prefer Python for backend and TypeScript for frontend."

Switch to any other AI and ask:

"What are my programming language preferences?"

✨ It remembers. Across every platform. Forever.

---

📊 Web Dashboard

Live at http://localhost:4041, a full-featured memory management UI.

<table> <tr> <td align="center"><b>📈 Overview</b><br><sub>Bento grid metrics, health stats, platform charts</sub></td> <td align="center"><b>🧠 Knowledge</b><br><sub>Search, filter, version history per entry</sub></td> </tr> <tr> <td align="center"><b>📝 Conversations</b><br><sub>Full episodic memory with message threads</sub></td> <td align="center"><b>⚔️ Conflicts</b><br><sub>Side-by-side comparison, 1-click resolve</sub></td> </tr> <tr> <td align="center"><b>🕐 Timeline</b><br><sub>Unified activity feed across all memory types</sub></td> <td align="center"><b>🔧 Maintenance</b><br><sub>Cleanup, consolidate, decay, compress</sub></td> </tr> </table>

8 tabs · Dark theme · Auto-refresh · Chart.js visualizations · Conflict resolution UI · Version history modals

---

🏗️ Architecture

┌───────────────────────────────────────────────────────────────────────┐
│                             AI PLATFORMS                              │
│                                                                       │
│  ┌──────────┐ ┌────────┐ ┌─────────┐ ┌────────┐ ┌───────┐ ┌────────┐  │
│  │ Windsurf │ │ Cursor │ │ VS Code │ │ Claude │ │Gemini │ │ Codex  │  │
│  └─────┬────┘ └───┬────┘ └────┬────┘ └───┬────┘ └──┬────┘ └───┬────┘  │
│        └──────────┴───────────┼──────────┴─────────┴──────────┘       │
│                               │                                       │
└───────────────────────────────┬───────────────────────────────────────┘
                                │ MCP (Streamable HTTP)
                                ▼
        ┌────────────────────────────────────────────────┐
        │         🧠 LLM Memory MCP Server :4040         │
        │                                                │
        │  39 Tools · 9 Prompts · 3 Resources            │
        │  Auto-injected instructions for every LLM      │
        │  Background scheduler (cleanup/decay/compress) │
        │  Version tracking · Conflict resolution        │
        │                                                │
        │  📊 Dashboard UI :4041                         │
        │  19 REST endpoints · 8-tab interface           │
        └────────────────────┬───────────────────────────┘
                             │
                             ▼
        ┌────────────────────────────────────────────────┐
        │         PostgreSQL 16 + pgvector :4569         │
        │                                                │
        │  ┌──────────┐ ┌──────────┐ ┌───────────┐       │
        │  │Episodic  │ │ Semantic │ │Short-term │       │
        │  │convos +  │ │knowledge │ │TTL-expire │       │
        │  │messages  │ │+ vectors │ │+ consolid │       │
        │  └──────────┘ └──────────┘ └───────────┘       │
        │  ┌──────────┐ ┌──────────┐ ┌───────────┐       │
        │  │Procedural│ │Versions  │ │Conflicts  │       │
        │  │code snips│ │changelog │ │cross-plat │       │
        │  └──────────┘ └──────────┘ └───────────┘       │
        │                                                │
        │  HNSW vector index + GIN full-text index       │
        │  Hybrid search: semantic + keyword ranking     │
        └────────────────────────────────────────────────┘

---

🎯 Supported Platforms

| Platform | Transport | Status |
|:---------|:----------|:------:|
| Windsurf | Streamable HTTP | ✅ Ready |
| Cursor | Streamable HTTP | ✅ Ready |
| VS Code + GitHub Copilot | Streamable HTTP | ✅ Ready |
| Claude Desktop | Streamable HTTP / stdio | ✅ Ready |
| Gemini CLI | Streamable HTTP | ✅ Ready |
| Antigravity (Google) | Streamable HTTP | ✅ Ready |
| ChatGPT (MCP-compatible) | Streamable HTTP | ✅ Ready |
| Codex (OpenAI) | Streamable HTTP | ✅ Ready |
| Any MCP-compatible client | Streamable HTTP | ✅ Ready |

---

🔧 Platform Configuration

<img src="https://img.shields.io/badge/-Windsurf-7c5cfc?style=flat-square" alt="Windsurf"> Windsurf

Option A (via UI): Settings → MCP → Add Server → paste the URL.

Option B (config file, .windsurf/mcp_config.json):

{
  "mcpServers": {
    "llm-memory": {
      "serverUrl": "http://localhost:4040/mcp"
    }
  }
}

---

<img src="https://img.shields.io/badge/-Antigravity-4285F4?style=flat-square&logo=google&logoColor=white" alt="Antigravity"> Antigravity (Google)

Option A (via UI): Go to Settings → MCP Servers → Add and paste the URL.

Option B (config file, mcp_config.json):

{
  "mcpServers": {
    "llm-memory": {
      "serverUrl": "http://localhost:4040/mcp"
    }
  }
}

---

<img src="https://img.shields.io/badge/-Cursor-000000?style=flat-square&logo=cursor&logoColor=white" alt="Cursor"> Cursor

Option A (via UI): Settings → MCP Servers → Add New MCP Server

Option B (project-level config, .cursor/mcp.json):

{
  "mcpServers": {
    "llm-memory": {
      "url": "http://localhost:4040/mcp"
    }
  }
}

Option C (global config, ~/.cursor/mcp.json): applies to all projects.

---

<img src="https://img.shields.io/badge/-VS_Code-007ACC?style=flat-square&logo=visualstudiocode&logoColor=white" alt="VS Code"> VS Code + GitHub Copilot

Option A (via Command Palette): Ctrl+Shift+P → MCP: Add Server → HTTP → enter http://localhost:4040/mcp

Option B (workspace config, .vscode/mcp.json):

{
  "servers": {
    "llm-memory": {
      "type": "http",
      "url": "http://localhost:4040/mcp"
    }
  }
}

Option C (user settings, global): Add the same config to your VS Code user settings.json under "mcp".

---

<img src="https://img.shields.io/badge/-Gemini_CLI-8E75B2?style=flat-square&logo=googlegemini&logoColor=white" alt="Gemini"> Gemini CLI

Edit ~/.gemini/settings.json:

{
  "mcpServers": {
    "llm-memory": {
      "httpUrl": "http://localhost:4040/mcp"
    }
  }
}

---

<img src="https://img.shields.io/badge/-Claude_Code-D4A574?style=flat-square" alt="Claude Code"> Claude Code (CLI)

Option A (one command, HTTP):

claude mcp add --transport http llm-memory http://localhost:4040/mcp

Option B (local via Docker, stdio):

claude mcp add llm-memory -- docker exec -i llm-mcp-server python server.py stdio

Add --scope user to either command to make the server available across all your projects (default scope is the current project). Verify with claude mcp list.

A project-memory skill also ships in .claude/skills/. With the server connected, the recall/save/compact behavior triggers automatically, just as an installed skill would.

---

<img src="https://img.shields.io/badge/-Claude-D4A574?style=flat-square" alt="Claude"> Claude Desktop

Option A (local, best performance): Connect directly via Docker, with no extra tools needed.

Go to Settings β†’ Developer β†’ Edit Config (claude_desktop_config.json):

{
  "mcpServers": {
    "llm-memory": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "llm-mcp-server",
        "python",
        "server.py",
        "stdio"
      ]
    }
  }
}

---

<img src="https://img.shields.io/badge/-ChatGPT-74AA9C?style=flat-square&logo=openai&logoColor=white" alt="ChatGPT"> ChatGPT / Codex / Other MCP Clients

For any platform that supports MCP via HTTP, use:

Endpoint:   http://localhost:4040/mcp
Transport:  Streamable HTTP (JSON-RPC over POST with optional SSE streaming)
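For clients without built-in MCP support, the first message of a Streamable HTTP session is a JSON-RPC `initialize` request sent by POST. The sketch below only builds that request with the standard library; it does not send it. The `clientInfo` values are placeholders, and the protocol version string is an assumption about which MCP revision this server negotiates.

```python
import json
import urllib.request

MCP_URL = "http://localhost:4040/mcp"  # endpoint from this README


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request over Streamable HTTP."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed revision; the server may negotiate a different one.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Placeholder client identity for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The client must accept both a plain JSON reply and an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_initialize_request()
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would perform the handshake against a running server; a real client would then keep any session ID returned in the response headers and reuse it on later requests.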

---

🛠️ 39 MCP Tools

<details open> <summary><b>💬 Conversations (Episodic Memory)</b></summary>

| Tool | What it does |
|:-----|:------------|
| save_conversation | Save full conversation with messages, metadata, importance, outcome |
| search_memory | Full-text + semantic search across all conversations |
| get_recent_conversations | Latest conversations by platform |
| get_conversation_by_id | Retrieve specific conversation with all messages |
| add_message_to_conversation | Append messages to existing conversation |
| tag_conversation | Add/remove tags |
| delete_memory | Delete conversation or knowledge by ID |

</details>

<details open> <summary><b>🧠 Knowledge (Semantic Memory)</b></summary>

| Tool | What it does |
|:-----|:------------|
| save_knowledge | Store fact/preference/instruction/decision |
| save_knowledge_smart | Conflict-aware save: detects duplicates & cross-platform conflicts |
| search_knowledge | Search by query, category, tags |
| list_all_knowledge | Paginated listing with category filter |
| get_knowledge_by_category | All entries in a category |
| get_related_knowledge | Similar entries by vector proximity |
| update_knowledge | Update with automatic version snapshot |
| auto_extract_preferences | Batch-extract preferences from conversation text |
| get_context_summary | Combined knowledge + conversation context |

</details>

<details> <summary><b>⏱️ Working Memory (Short-term)</b></summary>

| Tool | What it does |
|:-----|:------------|
| save_short_term_memory | Save transient context with TTL auto-expiry |
| get_working_context | Load all active session context |
| consolidate_memories | Promote important STM → long-term knowledge |

</details>
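The working-memory lifecycle (TTL expiry, then promotion of important items) can be sketched as follows. This is an illustrative model only: the `ShortTermItem` class, the `consolidate` helper, and the 0.7 importance threshold are hypothetical and are not taken from the server's code.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ShortTermItem:
    """Hypothetical short-term memory entry with a time-to-live."""
    content: str
    importance: float          # assumed to lie in [0, 1]
    ttl_seconds: float
    created_at: float = field(default_factory=time.time)

    def expired(self, now: float) -> bool:
        return now >= self.created_at + self.ttl_seconds


def consolidate(items, now, threshold=0.7):
    """Drop expired items, then split the rest into (promoted, remaining).

    Items at or above the threshold would be written to long-term knowledge;
    the others stay in short-term memory until their TTL runs out.
    """
    active = [item for item in items if not item.expired(now)]
    promoted = [item for item in active if item.importance >= threshold]
    remaining = [item for item in active if item.importance < threshold]
    return promoted, remaining


if __name__ == "__main__":
    now = time.time()
    items = [
        ShortTermItem("user is debugging auth", 0.9, ttl_seconds=3600, created_at=now),
        ShortTermItem("opened README", 0.2, ttl_seconds=3600, created_at=now),
        ShortTermItem("stale note", 0.9, ttl_seconds=60, created_at=now - 120),
    ]
    promoted, remaining = consolidate(items, now)
    print([item.content for item in promoted], [item.content for item in remaining])
```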

<details> <summary><b>💻 Code & Projects (Procedural Memory)</b></summary>

| Tool | What it does |
|:-----|:------------|
| save_code_snippet | Save reusable code with language, tags, description |
| search_code_snippets | Search by keyword, language, tags |
| save_project_context | Save project-level tech stack & architecture |
| get_project_context | Retrieve project context by name |

</details>

<details> <summary><b>🔍 Search & Retrieval</b></summary>

| Tool | What it does |
|:-----|:------------|
| recall | Primary entry point: searches all 4 memory tiers at once, ranked by composite score; pass project to boost the active repo |
| search_by_tags | Cross-type tag search |
| compact_context | Token saver: offloads a bulky context block into memory, returns a dense summary + recall handle |

</details>

<details> <summary><b>⚔️ Versioning & Conflicts</b></summary>

| Tool | What it does |
|:-----|:------------|
| knowledge_history | Full version timeline for any knowledge entry |
| rollback_knowledge | Restore to any previous version |
| list_conflicts | View pending/resolved cross-platform conflicts |
| resolve_conflict | Resolve with strategy: keep_existing, use_new, merge, keep_both |

</details>
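The four strategy names accepted by resolve_conflict can be illustrated with a small dispatcher. This is a sketch of the idea, not the server's implementation: the function below is a local stand-in, and the merge rule (joining both statements with a semicolon) is a hypothetical placeholder for whatever merging the server actually performs.

```python
STRATEGIES = ("keep_existing", "use_new", "merge", "keep_both")


def resolve(existing: str, new: str, strategy: str) -> list[str]:
    """Return the knowledge entries that survive a conflict."""
    if strategy == "keep_existing":
        return [existing]
    if strategy == "use_new":
        return [new]
    if strategy == "merge":
        # Placeholder merge rule for illustration only.
        return [f"{existing}; {new}"]
    if strategy == "keep_both":
        return [existing, new]
    raise ValueError(f"unknown strategy {strategy!r}; expected one of {STRATEGIES}")


if __name__ == "__main__":
    for name in STRATEGIES:
        print(name, "->", resolve("user prefers tabs", "user prefers spaces", name))
```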

<details> <summary><b>🔧 Maintenance & Utility</b></summary>

| Tool | What it does |
|:-----|:------------|
| count_memories | Count all memory types |
| summarize_platform_activity | Per-platform stats |
| cleanup_expired_memories | Remove expired STM & knowledge |
| decay_memories | Reduce importance of old unaccessed memories |
| export_memories | Full JSON backup |
| import_memories | Restore from backup (with dedup) |
| clear_platform_data | Delete all data for a platform ⚠️ |

</details>

📡 3 MCP Resources

| URI | Description |
|:----|:-----------|
| memory://stats | Database statistics & counts |
| memory://platforms | All platforms with stored data |
| memory://health | System health across all memory tiers |

🎯 9 Smart Prompts

Auto-discoverable prompt templates for key workflows:

| Prompt | What it does |
|:-------|:-----------|
| start_conversation | Initialize with full memory context |
| end_conversation | Save everything + extract knowledge |
| compact_now | Offload long context into memory to cut token usage |
| save_user_preference | Structured preference storage |
| recall_everything | Deep search across all memory |
| resolve_all_conflicts | Guided conflict resolution |
| memory_maintenance | Run all maintenance tasks |
| onboard_new_user | First-time setup & preference capture |
| debug_session | Context-aware debugging workflow |

💬 Invoking prompts as commands

MCP prompts are exposed as slash commands, but the exact syntax depends on the platform. The server is registered as llm-memory in all the configs above. Prompt arguments are passed space-separated after the command.

<details open> <summary><b><img src="https://img.shields.io/badge/-Claude_Code-D4A574?style=flat-square" alt="Claude Code"> Claude Code (CLI)</b></summary>

Prompts appear as /mcp__<server>__<prompt>:

/mcp__llm-memory__start_conversation claude-code "auth refactor"
/mcp__llm-memory__recall_everything "database decisions"
/mcp__llm-memory__compact_now my-repo claude-code
/mcp__llm-memory__end_conversation claude-code "Auth refactor" success

Run /mcp to list connected servers and browse their prompts. You usually don't need these commands: with the server connected, recall/save/compact happen automatically. They are there for explicit control.

</details>

<details> <summary><b><img src="https://img.shields.io/badge/-VS_Code-007ACC?style=flat-square&logo=visualstudiocode&logoColor=white" alt="VS Code"> VS Code + GitHub Copilot</b></summary>

Prompts appear in Copilot Chat as /mcp.<server>.<prompt>:

/mcp.llm-memory.start_conversation
/mcp.llm-memory.recall_everything

Type / in the chat box to see the list; the chat will prompt you for each argument.

</details>

<details> <summary><b><img src="https://img.shields.io/badge/-Claude-D4A574?style=flat-square" alt="Claude"> Claude Desktop</b></summary>

Click the + (attachments) button in the message box, choose llm-memory, then pick a prompt from the list. Fill in the arguments when prompted. Prompts surface as reusable templates rather than typed slash commands.

</details>

<details> <summary><b><img src="https://img.shields.io/badge/-Gemini_CLI-8E75B2?style=flat-square&logo=googlegemini&logoColor=white" alt="Gemini"> Gemini CLI</b></summary>

MCP prompts register as slash commands directly:

/start_conversation
/recall_everything

Run /mcp to view connected servers and their available prompts.

</details>

<details> <summary><b><img src="https://img.shields.io/badge/-Cursor-000000?style=flat-square&logo=cursor&logoColor=white" alt="Cursor"> Cursor / <img src="https://img.shields.io/badge/-Windsurf-7c5cfc?style=flat-square" alt="Windsurf"> Windsurf / ChatGPT</b></summary>

These clients focus on auto-invoked tools rather than slash-command prompts. Just describe what you want in natural language and the model calls the underlying tools:

"Recall everything you know about this project's database decisions."
"Save this preference: I always use async/await."
"Compact this conversation into memory to save tokens."

The same recall / save_knowledge_smart / compact_context tools run underneath.

</details>
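The per-platform naming rules above can be summarized in one helper. This is only a convenience sketch: the platform keys ("claude-code", "vscode", "gemini-cli") are labels invented for this example, and the quoting rule simply mirrors the double-quoted arguments shown in the Claude Code examples.

```python
def slash_command(platform: str, prompt: str, args=(), server: str = "llm-memory") -> str:
    """Format an MCP prompt as the slash command each client expects."""
    if platform == "claude-code":
        name = f"/mcp__{server}__{prompt}"
    elif platform == "vscode":
        name = f"/mcp.{server}.{prompt}"
    elif platform == "gemini-cli":
        name = f"/{prompt}"
    else:
        raise ValueError(f"no slash-command syntax known for {platform!r}")
    # Arguments are space-separated; wrap any that contain spaces in double quotes.
    quoted = [f'"{arg}"' if " " in arg else arg for arg in args]
    return " ".join([name, *quoted])


if __name__ == "__main__":
    print(slash_command("claude-code", "start_conversation", ["claude-code", "auth refactor"]))
    print(slash_command("vscode", "recall_everything"))
    print(slash_command("gemini-cli", "start_conversation"))
```

Clients that rely on auto-invoked tools (Cursor, Windsurf, ChatGPT) have no entry here by design, so the helper raises an error for them.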

---

🧬 Auto-Injected Behaviors

When any AI connects to this MCP server, it automatically receives behavioral instructions, with no user action needed:

┌──────────────────────────────────────────────────────────────┐
│  CONVERSATION START (automatic)                              │
│  1. get_working_context(): load session context              │
│  2. recall("<topic>"): search all memory for relevance       │
│  3. Personalize response using recalled memories             │
│  4. save_short_term_memory(): track current task             │
├──────────────────────────────────────────────────────────────┤
│  DURING CONVERSATION (automatic, silent)                     │
│  • Detect preferences → save_knowledge_smart()               │
│  • Detect facts → save_knowledge_smart()                     │
│  • Detect decisions → save_knowledge_smart()                 │
│  • Detect code patterns → save_code_snippet()                │
│  • All saves are conflict-aware (dedup + cross-platform)     │
├──────────────────────────────────────────────────────────────┤
│  CONVERSATION END (automatic)                                │
│  1. save_conversation(): with importance + outcome           │
│  2. auto_extract_preferences(): batch knowledge extraction   │
│  3. consolidate_memories(): promote STM → long-term          │
└──────────────────────────────────────────────────────────────┘

Result: Every AI assistant becomes memory-aware from the moment it connects. No setup. No prompting. It just works.

---

📁 Project Structure

LLM-MCP/
├── server.py               # MCP server: 39 tools, 9 prompts, 3 resources
├── db.py                   # Async DB layer (asyncpg + pgvector + FTS)
├── embeddings.py           # Embedding engine (local/ollama/openai)
├── dashboard.py            # REST API for web dashboard (Starlette)
├── static/
│   └── index.html          # Dashboard UI (Tailwind + Chart.js)
├── prompts/
│   ├── system_prompt.md    # Standalone system prompt for any LLM
│   └── quick_prompts.md    # 12 copy-paste prompt templates
├── docker-compose.yml      # PostgreSQL + MCP Server + Dashboard
├── Dockerfile              # Python 3.12 slim container
├── setup.sh                # One-command auto-setup script
├── .env                    # Environment configuration
├── requirements.txt        # Python dependencies
├── test_client.py          # End-to-end test suite
├── test_versioning.py      # Versioning & conflict resolution tests
└── test_prompts.py         # MCP prompt discovery tests

---

⚙️ Configuration

All settings via .env:

| Variable | Default | Description |
|:---------|:--------|:------------|
| POSTGRES_PORT | 4569 | PostgreSQL host port |
| MCP_PORT | 4040 | MCP server port |
| DASHBOARD_PORT | 4041 | Dashboard UI port |
| POSTGRES_USER | mcp_user | Database user |
| POSTGRES_PASSWORD | mcp_secure_pass_2026 | Database password |
| POSTGRES_DB | mcp_memory | Database name |
| EMBEDDING_PROVIDER | ollama | local / ollama / openai |
| OLLAMA_PORT | 9050 | Host port for the bundled Ollama API |
| OLLAMA_MODEL | nomic-embed-text | Embedding model Ollama pulls on first boot (~274MB) |
| OLLAMA_DIM | 768 | Vector dimension; change only if you swap to a non-768-dim model |
| MAINTENANCE_INTERVAL_MINUTES | 30 | Background scheduler interval |
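As a rough illustration of how these variables combine, the sketch below reads them from the environment with the documented defaults and derives a PostgreSQL connection URL for a client running on the host. The helper names `load_settings` and `database_url` are hypothetical, and how the server itself assembles its connection string is not shown in this README.

```python
import os

# Defaults copied from the configuration table in this README.
DEFAULTS = {
    "POSTGRES_PORT": "4569",
    "MCP_PORT": "4040",
    "DASHBOARD_PORT": "4041",
    "POSTGRES_USER": "mcp_user",
    "POSTGRES_PASSWORD": "mcp_secure_pass_2026",
    "POSTGRES_DB": "mcp_memory",
    "EMBEDDING_PROVIDER": "ollama",
    "OLLAMA_PORT": "9050",
    "OLLAMA_MODEL": "nomic-embed-text",
    "OLLAMA_DIM": "768",
    "MAINTENANCE_INTERVAL_MINUTES": "30",
}


def load_settings(env=None) -> dict:
    """Return every setting, falling back to the documented default."""
    env = os.environ if env is None else env
    return {key: env.get(key, default) for key, default in DEFAULTS.items()}


def database_url(settings: dict, host: str = "localhost") -> str:
    """Build a PostgreSQL URL for connecting from the host machine."""
    return (
        f"postgresql://{settings['POSTGRES_USER']}:{settings['POSTGRES_PASSWORD']}"
        f"@{host}:{settings['POSTGRES_PORT']}/{settings['POSTGRES_DB']}"
    )


if __name__ == "__main__":
    settings = load_settings()
    print(database_url(settings))
    print(f"MCP endpoint: http://localhost:{settings['MCP_PORT']}/mcp")
```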

LAN Access

Replace localhost with your machine's IP for remote AI platforms:

http://192.168.x.x:4040/mcp       # MCP Server
http://192.168.x.x:4041            # Dashboard

---

🗄️ Database Schema

8 tables with hybrid search indexes:

┌──────────────────┐     ┌──────────────────┐
│  conversations   │────▶│    messages      │  Episodic memory
│  (importance,    │     │  (role, content, │
│   outcome,       │     │   embedding)     │
│   embedding)     │     └──────────────────┘
└──────────────────┘

┌──────────────────┐     ┌──────────────────┐
│   knowledge      │────▶│knowledge_versions│  Semantic memory
│  (category,      │     │  (version, diff, │  + version history
│   version,       │     │   changed_by)    │
│   embedding)     │     └──────────────────┘
└──────────────────┘

┌──────────────────┐     ┌──────────────────┐
│short_term_memory │     │memory_conflicts  │  Working memory
│  (TTL, context,  │     │  (existing vs    │  + conflict tracking
│   consolidated)  │     │   conflicting)   │
└──────────────────┘     └──────────────────┘

┌──────────────────┐     ┌──────────────────┐
│  code_snippets   │     │    projects      │  Procedural memory
│  (language,      │     │  (tech_stack,    │  + project context
│   embedding)     │     │   architecture)  │
└──────────────────┘     └──────────────────┘

Indexes: HNSW (vector similarity) + GIN (full-text search) + B-tree (importance, expiry) for sub-millisecond hybrid queries.

---

🧪 Testing

# Full test suite
python test_client.py

# Versioning & conflict resolution
python test_versioning.py

# MCP prompt discovery
python test_prompts.py

<details> <summary>Manual verification commands</summary>

# Check services
docker compose ps

# PostgreSQL direct query
docker exec llm-mcp-postgres psql -U mcp_user -d mcp_memory \
  -c "SELECT COUNT(*) as knowledge FROM knowledge;"

# MCP server logs
docker logs -f llm-mcp-server

# Dashboard logs
docker logs -f llm-mcp-dashboard

# Restart everything
docker compose restart

</details>

---

📋 Docker Commands

| Command | Description |
|:--------|:------------|
| docker compose up -d --build | Start all services |
| docker compose down | Stop all services |
| docker compose logs -f mcp-server | Stream server logs |
| docker compose logs -f dashboard | Stream dashboard logs |
| docker compose down -v | Stop & delete all data ⚠️ |

---

🔒 Security

  • Bind to 127.0.0.1 for local-only: MCP_HOST=127.0.0.1
  • Change POSTGRES_PASSWORD in production
  • Add reverse proxy (nginx/Caddy) with TLS for remote access
  • No auth by default: designed for local/trusted network use

---

🗺️ Roadmap

  • [x] Semantic search with pgvector embeddings
  • [x] Automatic conversation summarization (compression)
  • [x] Memory expiration & archival policies
  • [x] Background maintenance scheduler
  • [x] Multi-tier memory (short-term, semantic, episodic, procedural)
  • [x] Importance scoring & time-based decay
  • [x] One-command auto-setup script
  • [x] Memory versioning & change tracking
  • [x] Cross-platform conflict resolution
  • [x] Web dashboard with real-time visualization
  • [x] Auto-injected behavioral instructions
  • [x] MCP prompt workflows
  • [ ] Authentication / API keys for multi-user
  • [ ] Webhook notifications on new memories
  • [ ] Memory sharing between users
  • [ ] Cloud-hosted option (no Docker needed)
  • [ ] Mobile companion app

---

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

All contributions welcome: features, bug fixes, docs, translations.

---

📄 License

MIT License. See LICENSE for details.

---

<div align="center">

⭐ If this project saves you from repeating yourself to your AIs, give it a star!

Star this repo · Report Bug · Request Feature

<br>

Built with ❤️ by ranjanjyoti152

Stop repeating yourself. Let your AIs share a brain.

<br>

<sub>If you found this useful, consider sharing it with other developers who use multiple AI tools.</sub>

</div>
