YourMemory


<!-- mcp-name: io.github.sachitrafa/yourmemory --> <div align="center"> <img src="logo.svg.png" alt="YourMemory" width="110" /><br> <h1>YourMemory</h1>

Persistent memory for AI agents, built on the science of how humans remember.

[PyPI](https://pypi.org/project/yourmemory/) · [License: CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) · [GitHub Stars](https://github.com/sachitrafa/YourMemory) · [GitHub Issues](https://github.com/sachitrafa/YourMemory/issues) · [Last Commit](https://github.com/sachitrafa/YourMemory/commits/main) · [Docker Build](https://github.com/sachitrafa/YourMemory/actions/workflows/docker-publish.yml)

[LoCoMo Recall@5](BENCHMARKS.md) · [LongMemEval Recall@5](BENCHMARKS.md) · [HotpotQA BOTH@5](BENCHMARKS.md) · [oosmetrics](https://oosmetrics.com/repo/sachitrafa/YourMemory)

</div>

---

What Is YourMemory?

Every session, your AI assistant starts from zero. It asks the same questions, forgets your preferences, re-learns your stack. There is no memory between conversations.

YourMemory fixes that with a one-command install that plugs into Claude, Cursor, Cline, Windsurf, or any MCP client. It gives your AI a persistent memory layer modelled on human cognition:

  • Things that matter stick: an importance score controls how quickly a memory decays
  • Outdated facts get replaced: subject-aware deduplication merges or supersedes memories automatically
  • Related context surfaces together: an entity graph links memories that share people, places, or concepts
  • Old memories fade naturally: an Ebbinghaus forgetting curve prunes stale context every 24 hours

Zero infrastructure required. DuckDB by default, Postgres for teams.

---


Benchmarks

Three external datasets, all scripts open source and reproducible. Full methodology in BENCHMARKS.md.

LongMemEval-S: 500 questions, ~53 distractor sessions each

The hardest standard benchmark for long-term memory systems. Each question is backed by ~53 conversation sessions; the model must retrieve the right one(s) from the haystack.

| Metric | Score |
|--------|:-----:|
| Recall@5 (any gold session in top-5) | 89.4% |
| Recall-all@5 (all gold sessions in top-5) | 84.8% |
| nDCG@5 (ranking quality) | 87.4% |
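
As a rough illustration of what these three metrics measure, the sketch below computes them for a single question. The function names and the binary-relevance form of nDCG are assumptions made for this example; the project's own evaluation scripts are described in BENCHMARKS.md and may differ in detail.

```python
import math

def recall_any_at_k(ranked_ids, gold_ids, k=5):
    """1.0 if at least one gold session appears in the top-k results."""
    gold = set(gold_ids)
    return float(any(r in gold for r in ranked_ids[:k]))

def recall_all_at_k(ranked_ids, gold_ids, k=5):
    """1.0 only if every gold session appears in the top-k results."""
    return float(set(gold_ids) <= set(ranked_ids[:k]))

def ndcg_at_k(ranked_ids, gold_ids, k=5):
    """Binary-relevance nDCG: rewards placing gold sessions near the top."""
    gold = set(gold_ids)
    dcg = sum(1.0 / math.log2(i + 2)
              for i, r in enumerate(ranked_ids[:k]) if r in gold)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(gold), k)))
    return dcg / ideal if ideal else 0.0

ranked = ["s7", "s2", "s9", "s4", "s1"]  # hypothetical retrieval output
gold = {"s2", "s8"}                      # hypothetical gold sessions

print(recall_any_at_k(ranked, gold))     # 1.0: s2 is in the top 5
print(recall_all_at_k(ranked, gold))     # 0.0: s8 was not retrieved
print(round(ndcg_at_k(ranked, gold), 3)) # 0.387
```

Benchmark-level scores are the mean of these per-question values.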

By question type (Recall@5):

| Question Type | Recall@5 | n |
|---------------|:--------:|:-:|
| single-session-assistant | 98.2% | 56 |
| knowledge-update | 96.2% | 78 |
| multi-session | 95.5% | 133 |
| single-session-preference | 90.0% | 30 |
| temporal-reasoning | 84.2% | 133 |
| single-session-user | 72.9% | 70 |
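
A quick consistency check, using only the numbers in the table above: the per-type scores weighted by their question counts should reproduce the overall Recall@5.

```python
# (Recall@5, n) per question type, copied from the table above
by_type = {
    "single-session-assistant": (0.982, 56),
    "knowledge-update": (0.962, 78),
    "multi-session": (0.955, 133),
    "single-session-preference": (0.900, 30),
    "temporal-reasoning": (0.842, 133),
    "single-session-user": (0.729, 70),
}

total_n = sum(n for _, n in by_type.values())
overall = sum(recall * n for recall, n in by_type.values()) / total_n

print(total_n)            # 500
print(round(overall, 3))  # 0.894
```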

LoCoMo-10: 1,534 QA pairs across 10 multi-session conversations

Conversations spanning weeks to months. Every system ingests the same session summaries in the same order.

| System | Recall@5 | 95% CI |
|--------|:--------:|:------:|
| YourMemory (BM25 + vector + graph + decay) | 59% | 56–61% |
| Zep Cloud | 28% | 26–30% |
| Supermemory\* | 31% | 28–33% |
| Mem0\* | 18% | 16–20% |

Roughly 2× the Recall@5 of Zep Cloud across all 10 samples.

\* Supermemory and Mem0 exhausted free-tier quotas mid-benchmark; their scores are computed over the full 1,534 pairs, counting unfinished samples as 0.

HotpotQA: 200 multi-hop questions requiring two facts from different articles

| System | BOTH_FOUND@5 |
|--------|:------------:|
| YourMemory (vector + BM25 + entity graph) | 71.5% |
| YourMemory (no entity edges) | 59.5% |

Entity graph edges add +12 pp: they traverse from Fact 1 to Fact 2 even when Fact 2 has low embedding similarity to the query.

Writeup: I built memory decay for AI agents using the Ebbinghaus forgetting curve

---

Quick Start

Supports Python 3.11–3.14. No Docker, no database setup. All memory stored locally in ~/.yourmemory/.

Before you install: what this does

| Behavior | Detail |
|---|---|
| Activation | Requires a one-time token. Visit yourmemoryai.xyz, enter your email, verify with a 6-digit code, and copy your token. |
| Global rule injection | yourmemory-setup writes memory instructions into ~/.cursor/rules/memory.mdc and other detected AI client config files (Claude, VS Code, etc.) so the assistant can call memory tools automatically. You can remove these files at any time. |
| MCP tool behavior | The recall_memory tool can be called by your AI assistant when persistent context would help. The assistant decides when to call it based on the request. |
| Telemetry | A UUID (no personal data) is sent on first setup only. Opt out: YOURMEMORY_TELEMETRY=off |

Activation steps:

  1. Visit yourmemoryai.xyz and enter your email
  2. Check your inbox for a 6-digit verification code
  3. Enter the code on the website; your token is shown instantly
  4. Run the three commands below:
pip install yourmemory
yourmemory-register <your-token>
yourmemory-setup

Requirement (local model): YourMemory extracts memories with a local model via Ollama. Install Ollama and start it; yourmemory-setup then pulls the default model (qwen2.5:7b, ~4.7 GB) automatically. To use a lighter model you already have, set YOURMEMORY_OLLAMA_MODEL (e.g. llama3.2:3b) before setup.

Backend: yourmemory-setup asks whether to use DuckDB (zero setup, default) or Postgres (shared/production; you provide a DATABASE_URL, and the database needs the pgvector extension).

---

Memory Dashboard

Two built-in browser UIs start automatically with the MCP server; no extra setup is needed.

Memory Browser: http://localhost:3033/ui

A full read/write view of everything stored in memory.

| What you see | Details |
|---|---|
| Stats bar | Total · Strong ≥50% · Fading 5–50% · Near prune <10% |
| Agent tabs | All / User / per-agent views |
| Memory cards | Content · strength bar · category · recall count · last accessed |
| Filters | Category (fact / strategy / assumption / failure) · Sort by strength, recency, recall |

Pass ?user=<id> to pre-load a specific user: http://localhost:3033/ui?user=sachit

Graph Visualiser: http://localhost:3033/graph

An interactive force-directed map of how memories connect.

http://localhost:3033/graph?memoryId=42&userId=sachit&depth=2
  • Root memory as a larger cyan node; neighbours color-coded by category
  • Edge thickness = connection strength
  • Click any node for full content; drag, zoom, reposition freely

---

Ask Without Calling the API

The only memory system that can answer questions without making any LLM API call.

yourmemory ask "what database does this project use"
# → YourMemory uses DuckDB locally and Postgres in production.

yourmemory ask "what port does the dashboard run on"
# → 3033

yourmemory ask "how do I fix a kubernetes deployment"
# → Not enough memory context to answer without Claude.

When memory is strong enough, it answers locally: zero tokens, zero cloud cost, and no network round trip. When it isn't, it declines cleanly rather than hallucinating.
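
The answer-or-decline behaviour can be pictured as a simple confidence gate. The sketch below is only an illustration: the `Hit` type, the `answer_from_memory` function, and the 0.6 threshold are hypothetical, and the real command may compose its answer differently.

```python
from typing import NamedTuple, Optional

class Hit(NamedTuple):
    content: str
    score: float  # hypothetical combined relevance/strength score in [0, 1]

DECLINE = "Not enough memory context to answer without Claude."

def answer_from_memory(hits: list[Hit], min_score: float = 0.6) -> str:
    """Answer from the best hit if it clears the bar; otherwise decline."""
    best: Optional[Hit] = max(hits, key=lambda h: h.score, default=None)
    if best is None or best.score < min_score:
        return DECLINE
    return best.content

print(answer_from_memory([Hit("The dashboard runs on port 3033", 0.82)]))
print(answer_from_memory([Hit("Unrelated note about fonts", 0.21)]))
```

The first call returns the stored fact; the second falls below the assumed threshold and returns the decline message instead of guessing.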

| Query | Mem0 / Zep / LangMem | YourMemory |
|---|---|---|
| "What port does the server run on?" | Full LLM API call | Instant, $0 |
| "What database does this project use?" | Full LLM API call | Instant, $0 |
| "How do I fix a k8s deployment?" | Full LLM API call | Declines → Claude |
| Privacy | Query sent to cloud | Never leaves your machine |

---

API Proxy: Guaranteed Memory

MCP tools are called at the AI's discretion. The API proxy removes that uncertainty: it intercepts every LLM call, injects relevant memories automatically, and handles store_memory / update_memory without any model configuration.

Start the YourMemory server (yourmemory), then point your LLM client at localhost:3033:

OpenAI

from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:3033/proxy/openai"
)

# Memory is injected automatically; no other changes needed
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What database do I use?"}]
)

Anthropic

from anthropic import Anthropic

client = Anthropic(
    api_key="sk-ant-...",
    base_url="http://localhost:3033/proxy/anthropic"
)

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What database do I use?"}]
)

Per-user memory

Pass X-YourMemory-User to isolate memory per person:

client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:3033/proxy/openai",
    default_headers={"X-YourMemory-User": "sachit"}
)

How it works

On every request the proxy:

  1. Recalls relevant memories and injects them into the system prompt: guaranteed, no tool call needed
  2. Adds store_memory and update_memory as tools; the model calls them when it learns something new
  3. Executes those tool calls locally and returns the final response transparently

Streaming note: recall injection works for all requests. Tool call interception (store/update) works for non-streaming requests only; streaming responses pass through, and tools execute on the next turn.
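
Step 1 above can be sketched for an OpenAI-style message list as follows. This is a minimal illustration, not the proxy's actual code: `inject_memories` and the "Relevant memories:" wording are hypothetical.

```python
def inject_memories(messages, memories):
    """Return a copy of the messages with recalled memories in the system prompt."""
    if not memories:
        return list(messages)
    block = "Relevant memories:\n" + "\n".join(f"- {m}" for m in memories)
    out = [dict(m) for m in messages]  # copy so the caller's list is untouched
    if out and out[0].get("role") == "system":
        # Extend an existing system prompt rather than replacing it
        out[0]["content"] = out[0]["content"] + "\n\n" + block
    else:
        out.insert(0, {"role": "system", "content": block})
    return out

request = [{"role": "user", "content": "What database do I use?"}]
for message in inject_memories(request, ["Sachit uses DuckDB locally"]):
    print(message["role"], "->", message["content"])
```

Because the injection happens before the request reaches the provider, the model sees the memories even if it never decides to call a tool.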

---

MCP Tools

Three tools, called by your AI automatically.

| Tool | When your AI calls it | What it does |
|------|-----------------------|--------------|
| recall_memory(query, current_path?) | Start of every task | Surfaces memories ranked by similarity × decay strength; spatial boost for path-matched memories |
| store_memory(content, importance, category?, context_paths?) | After learning something new | Embeds, deduplicates, stores with decay; tags optional file/dir paths |
| update_memory(id, new_content, importance) | When a stored fact is outdated | Re-embeds and replaces; logs old content to audit trail |

# Store with spatial context
store_memory(
    "Sachit prefers tabs over spaces in Python",
    importance=0.9,
    category="fact",
    context_paths=["/projects/backend"]
)

# Next session: spatial boost fires when working in that directory
recall_memory("Python formatting", current_path="/projects/backend")
# → {"content": "Sachit prefers tabs over spaces in Python", "strength": 0.87}

Memory categories control decay rate

| Category | Half-life | Best for |
|----------|-----------|----------|
| strategy | ~38 days | Patterns that worked, architectural decisions |
| fact | ~24 days | Preferences, identity, stable knowledge |
| assumption | ~19 days | Inferred context, uncertain beliefs |
| failure | ~11 days | Errors, wrong approaches, environment-specific issues |

---

How It Works

Ebbinghaus Forgetting Curve

Memory strength decays exponentially. Importance and recall frequency slow that decay:

effective_λ  = base_λ × (1 − importance × 0.8)
strength     = clamp(importance × e^(−effective_λ × active_days) × (1 + recall_count × 0.2), 0, 1)
hybrid_score = 0.4 × bm25_norm + 0.6 × cosine_similarity

active_days counts only days the user was active, so vacations don't cause memory loss. Memories below strength 0.05 are pruned automatically every 24 hours.
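
The formulas above translate directly into Python. In this sketch the function names and the `BASE_LAMBDA` value of 0.1 are placeholders chosen for the example; the per-category base rates used by YourMemory are not stated here.

```python
import math

PRUNE_THRESHOLD = 0.05  # from the text above
BASE_LAMBDA = 0.1       # hypothetical base decay rate, for illustration only

def effective_lambda(base_lambda, importance):
    """Higher importance shrinks the decay rate (by up to 80%)."""
    return base_lambda * (1 - importance * 0.8)

def strength(importance, base_lambda, active_days, recall_count=0):
    """Exponential decay over active days, boosted by recalls, clamped to [0, 1]."""
    decay = math.exp(-effective_lambda(base_lambda, importance) * active_days)
    raw = importance * decay * (1 + recall_count * 0.2)
    return max(0.0, min(1.0, raw))

def hybrid_score(bm25_norm, cosine_similarity):
    """Weighted blend of keyword and vector relevance."""
    return 0.4 * bm25_norm + 0.6 * cosine_similarity

print(strength(0.5, BASE_LAMBDA, active_days=0))                     # 0.5
print(round(strength(0.5, BASE_LAMBDA, active_days=10), 3))          # 0.274
print(round(strength(0.5, BASE_LAMBDA, 10, recall_count=2), 3))      # 0.384
print(strength(0.5, BASE_LAMBDA, active_days=10) < PRUNE_THRESHOLD)  # False
```

Note that a new memory starts at a strength equal to its importance, and each recall adds 20% of the decayed value before clamping.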

Session wrap-up: recalled memory IDs are tracked per session. When a session goes idle (30 min default), those memories get a recall_count boost. Set YOURMEMORY_SESSION_IDLE to change the window.

Recall throttling: identical (user, query) pairs are cached within a configurable window. Set YOURMEMORY_RECALL_COOLDOWN (seconds, default 0 = off).
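
One way to picture recall throttling is a small time-based cache keyed on (user, query). The `RecallThrottle` class below is a hypothetical sketch, not the project's implementation; it only mirrors the behaviour described above, including "0 = off".

```python
import time

class RecallThrottle:
    """Cache results for identical (user, query) pairs within a cooldown window."""

    def __init__(self, cooldown_seconds=0.0, clock=time.monotonic):
        self.cooldown = cooldown_seconds
        self.clock = clock
        self._cache = {}  # (user, query) -> (timestamp, results)

    def recall(self, user, query, search):
        if self.cooldown <= 0:  # 0 disables throttling
            return search(user, query)
        key = (user, query)
        now = self.clock()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self.cooldown:
            return cached[1]  # still inside the window: skip the search
        results = search(user, query)
        self._cache[key] = (now, results)
        return results

def slow_search(user, query):
    # Stand-in for the real retrieval pipeline
    print(f"searching for {query!r} as {user}")
    return ["memory-1"]

throttle = RecallThrottle(cooldown_seconds=30)
throttle.recall("sachit", "database", slow_search)  # runs the search
throttle.recall("sachit", "database", slow_search)  # served from cache
```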

Hybrid Retrieval: Vector + BM25 + Entity Graph

Retrieval runs in two rounds:

Round 1 (hybrid search): cosine similarity + BM25 keyword scoring, returns top-k candidates above threshold.

Round 2 (graph expansion): BFS traversal from Round 1 seeds surfaces memories that share context but not vocabulary, connected via semantic or entity edges.

recall("Python backend")
  Round 1 → [1] Python/MongoDB    (sim=0.61)
            [2] DuckDB/spaCy      (sim=0.19)
  Round 2 → [5] Docker/Kubernetes (sim=0.29, below cut-off, surfaced via shared entity "backend")
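
The two rounds can be sketched with plain dictionaries. The scores, threshold, adjacency map, and the `expand_via_graph` helper below are all hypothetical; they only illustrate how a memory that misses the Round 1 cut-off can still be surfaced through an edge.

```python
from collections import deque

def expand_via_graph(seeds, edges, depth=1):
    """Round 2: breadth-first walk from the Round 1 seeds, up to `depth` hops."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    surfaced = []
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the hop limit
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                surfaced.append(neighbour)
                queue.append((neighbour, hops + 1))
    return surfaced

# Hypothetical hybrid scores per memory id, and an entity edge between 1 and 5
hybrid = {1: 0.61, 2: 0.45, 5: 0.29}
edges = {1: [5], 5: [1]}
threshold = 0.4

round1 = [m for m, s in sorted(hybrid.items(), key=lambda kv: -kv[1]) if s >= threshold]
round2 = expand_via_graph(round1, edges)

print(round1)  # [1, 2]
print(round2)  # [5]
```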

Chain-aware pruning: a decayed memory is kept alive if any graph neighbour is above the prune threshold. Related memories age together.
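
A minimal sketch of that rule, assuming strengths and neighbours are available as dictionaries (the `prunable` function and its data shapes are hypothetical):

```python
def prunable(strengths, edges, threshold=0.05):
    """Ids below the threshold that have no graph neighbour above it."""
    doomed = []
    for memory_id, value in strengths.items():
        if value >= threshold:
            continue  # still strong enough on its own
        neighbours = edges.get(memory_id, ())
        if any(strengths.get(n, 0.0) > threshold for n in neighbours):
            continue  # kept alive by a stronger neighbour
        doomed.append(memory_id)
    return doomed

strengths = {1: 0.80, 2: 0.03, 3: 0.02}
edges = {1: [2], 2: [1], 3: []}

print(prunable(strengths, edges))  # [3]
```

Memory 2 has decayed below the threshold but survives because it is linked to memory 1; memory 3 has no such neighbour and is pruned.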

Subject-Aware Deduplication

Before storing, YourMemory checks whether the new memory is about the same entity as the nearest existing one:

"Sachit uses DuckDB"      vs  "YourMemory uses DuckDB"
 subject: Sachit               subject: YourMemory
 → different entities → stored separately ✓

"YourMemory uses DuckDB"  vs  "YourMemory stores data in DuckDB"
 subject: YourMemory           subject: YourMemory
 → same entity → merged ✓

Subject comparison embeds the first two tokens of each sentence, so it needs no hardcoded word lists and generalises across languages.
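
The comparison can be illustrated with a toy embedder. Everything below is an assumption made for the example: `toy_embed` uses character-trigram counts as a stand-in for the real sentence-transformers model, and the 0.5 threshold is arbitrary.

```python
import math
from collections import Counter

def subject_of(sentence, n_tokens=2):
    """First two whitespace tokens, lowercased."""
    return " ".join(sentence.split()[:n_tokens]).lower()

def toy_embed(text):
    """Stand-in embedder: character-trigram counts (NOT the real model)."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def same_subject(new, existing, threshold=0.5, embed=toy_embed):
    """True when the leading tokens of both sentences embed close together."""
    return cosine(embed(subject_of(new)), embed(subject_of(existing))) >= threshold

print(same_subject("Sachit uses DuckDB", "YourMemory uses DuckDB"))
# False -> store separately
print(same_subject("YourMemory uses DuckDB", "YourMemory stores data in DuckDB"))
# True -> merge
```

Swapping `toy_embed` for a real embedding function keeps the rest of the logic unchanged.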

---

Multi-Agent Memory

Multiple agents can share one YourMemory instance, each with isolated private memories and controlled access to shared context.

from src.services.api_keys import register_agent

result = register_agent(
    agent_id="coding-agent",
    user_id="sachit",
    can_read=["shared", "private"],
    can_write=["shared", "private"],
)
# → result["api_key"] is ym_xxxx (shown once only)

# Agent stores a private failure memory
store_memory(
    "Staging uses self-signed cert - skip SSL verify",
    importance=0.7, category="failure",
    api_key="ym_xxxx", visibility="private"
)

# Recalls shared + its own private memories; other agents see shared only
recall_memory("staging SSL", api_key="ym_xxxx")
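
The visibility rule in the last comment can be sketched as a filter. The `visible_to` function, the dictionary shape of a memory, and the way `can_read` is interpreted are assumptions made for this illustration.

```python
def visible_to(memories, agent_id, can_read=("shared", "private")):
    """Shared memories for everyone; private ones only for the agent that wrote them."""
    visible = []
    for memory in memories:
        scope = memory["visibility"]
        if scope not in can_read:
            continue
        if scope == "shared" or memory["agent_id"] == agent_id:
            visible.append(memory)
    return visible

memories = [
    {"id": 1, "agent_id": "coding-agent", "visibility": "private",
     "content": "Staging uses self-signed cert"},
    {"id": 2, "agent_id": "coding-agent", "visibility": "shared",
     "content": "Project uses DuckDB"},
    {"id": 3, "agent_id": "review-agent", "visibility": "private",
     "content": "Reviewer prefers small PRs"},
]

print([m["id"] for m in visible_to(memories, "coding-agent")])  # [1, 2]
print([m["id"] for m in visible_to(memories, "review-agent")])  # [2, 3]
```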

---

Stack

| Component | Role |
|-----------|------|
| DuckDB | Default vector DB: zero setup, native cosine similarity |
| NetworkX | Default graph backend; persists at ~/.yourmemory/graph.pkl |
| sentence-transformers | Local embeddings (multi-qa-mpnet-base-dot-v1, 768 dims) |
| spaCy | Local NLP for deduplication and entity extraction |
| APScheduler | Automatic 24h decay and pruning job |
| PostgreSQL + pgvector | Optional, for teams or large datasets |
| Neo4j | Optional graph backend |

---

Architecture

Claude / Cline / Cursor / Any MCP client
    │
    ├── recall_memory(query, current_path?, api_key?)
    │       └── throttle check → embed → hybrid search (Round 1)
    │               → graph BFS expansion (Round 2)
    │               → score = sim × strength
    │               → spatial boost (+0.08) if current_path matches context_paths
    │               → temporal boost (+0.25) if query has time window expression
    │               → session tracking → recall_count bump on session end
    │
    ├── store_memory(content, importance, category?, context_paths?, api_key?)
    │       └── question? → reject
    │               subject-aware dedup → same entity? merge/reinforce : new
    │               embed() → INSERT → index_memory() → graph node + edges
    │               record_activity(user_id) → active days log
    │
    └── update_memory(id, new_content, importance)
            └── log old content → memory_history (audit trail)
                    embed(new_content) → UPDATE → refresh graph node

  Vector DB (Round 1)              Graph DB (Round 2)
  DuckDB (default)                 NetworkX (default)
    memories.duckdb                  graph.pkl
    ├── embedding FLOAT[768]         ├── nodes: memory_id, strength
    ├── importance FLOAT             └── edges: sim × verb_weight ≥ 0.4
    ├── recall_count INTEGER
    ├── context_paths JSON         Neo4j (opt-in)
    ├── created_at TIMESTAMP         └── bolt://localhost:7687
    ├── visibility VARCHAR
    ├── agent_id VARCHAR
    user_activity  (active days log)
    memory_history (supersession audit)
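
The scoring steps in the recall_memory branch can be sketched as below. The boost values come from the diagram; treating them as additive on top of sim × strength, and matching paths by "same directory or a parent of it", are assumptions made for this example, as are the function names.

```python
from pathlib import PurePosixPath

SPATIAL_BOOST = 0.08   # from the diagram above
TEMPORAL_BOOST = 0.25  # from the diagram above

def path_matches(current_path, context_paths):
    """True if current_path equals or sits under any tagged path (assumed rule)."""
    if not current_path:
        return False
    current = PurePosixPath(current_path)
    return any(
        current == PurePosixPath(p) or PurePosixPath(p) in current.parents
        for p in context_paths
    )

def final_score(sim, strength, path_match=False, has_time_window=False):
    """Base score is sim × strength; boosts are added when they apply."""
    score = sim * strength
    if path_match:
        score += SPATIAL_BOOST
    if has_time_window:
        score += TEMPORAL_BOOST
    return score

match = path_matches("/projects/backend/api", ["/projects/backend"])
print(match)                                            # True
print(round(final_score(0.6, 0.87, path_match=match), 3))  # 0.602
```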

---

Troubleshooting

Writes hang / time out in Claude Desktop

Symptom: store_memory or update_memory never returns; the MCP server appears frozen.

Cause: DuckDB enforces a single-writer-per-process constraint. If you also have the YourMemory HTTP server running (e.g. for Claude Code hooks), both processes compete for the same write lock and one hangs indefinitely.

Fix: kill the lock holder and restart.

```bash
# Kill any lingering YourMemory process holding the DuckDB write lock
pkill -f yourmemory 2>/dev/null || true

# Remove stale DuckDB WAL/lock files if the process exited uncleanly
rm -f ~/.yourmemory/memories.duckdb.wal \
      ~/.yourmemory/memories.duckdb.lock 2>/dev/null || true

# Restart Claude Desktop
```

From v1.4.57 onward, DuckDB connections time out after 8 seconds and surface an error
message that includes the fix above, instead of hanging forever.

**If you run both Claude Desktop (MCP) and Claude Code (hooks) at the same time:**
Use the environment variable `DATABASE_URL=sqlite:///~/.yourmemory/memories.db` in
your MCP server config. SQLite's WAL mode handles concurrent readers/writers cleanly
and has no single-writer process limit.

---

## Contributing

PRs are welcome. See [CONTRIBUTORS.md](CONTRIBUTORS.md) for contributors who have already improved YourMemory.

---

## Dataset References

- [LoCoMo](https://github.com/snap-research/locomo): Maharana et al. (2024). *Evaluating Very Long-Term Conversational Memory of LLM Agents.* Snap Research.
- [LongMemEval](https://github.com/xiaowu0162/LongMemEval): Wu et al. (2024). *LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory.*
- [HotpotQA](https://hotpotqa.github.io/): Yang et al. (2018). *HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.*

---

## License

Copyright 2026 **Sachit Misra**. Licensed under [CC-BY-NC-4.0](LICENSE).

**Free for:** personal use, education, academic research, open-source projects.
**Not permitted:** commercial use without a separate written agreement.

Commercial licensing: mishrasachit1@gmail.com
