ai-memory-mcp

<h1 align="center">ai-memory™</h1> universal AI memory

![CI](https://github.com/alphaonedev/ai-memory-mcp/actions/workflows/ci.yml) ![Bench](https://github.com/alphaonedev/ai-memory-mcp/actions/workflows/bench.yml) ![Session-boot lifetime](https://github.com/alphaonedev/ai-memory-mcp/actions/workflows/session-boot-lifetime.yml) ![Rust](https://www.rust-lang.org/) ![License](LICENSE) ![SQLite](https://www.sqlite.org/) ![Tests](https://alphaonedev.github.io/ai-memory-mcp/evidence.html) ![Test Hub](https://alphaonedev.github.io/ai-memory-test-hub/) ![Discovery Gate](https://alphaonedev.github.io/ai-memory-discovery-gate/) ![v0.6.4 Cert](https://github.com/alphaonedev/ai-memory-test-hub/blob/main/campaigns/v0.6.4.md) ![NSA CSI](https://alphaonedev.github.io/ai-memory-mcp/compliance/nsa-csi-mcp.html) ![Evidence v0.6.4](https://alphaonedev.github.io/ai-memory-mcp/evidence.html) ![Evidence v0.7.0](docs/v0.7.0/release-notes.md) ![Crates.io Version](https://crates.io/crates/ai-memory) ![npm](https://www.npmjs.com/package/@alphaone/ai-memory) ![PyPI](https://pypi.org/project/ai-memory-mcp/)

ai-memory is a persistent memory system for AI assistants. It works with any AI that supports MCP -- Claude, ChatGPT, Grok, Llama, and more. It stores what your AI learns in a local SQLite database, ranks memories by relevance when recalling, and auto-promotes important knowledge to permanent storage. Install it once, and every AI assistant you use remembers your architecture, your preferences, your corrections -- forever.

---

Choose your installation path

| You are… | Your deployment is… | Start here |
|---|---|---|
| A single developer trying ai-memory | One AI client on a laptop | docs/install-quickstart.md: 5-min super-simple install + LLM-backend wired in one block |
| An engineer / architect | Single-node production, or multiple agents on one node | docs/INSTALL.md → docs/production-deployment.md |
| An engineer / architect | Multi-server / multi-rack / multi-DC / swarm / hive / federation | docs/enterprise-deployment.md: 8 topologies, singleton → multi-region |
| An engineer / architect | PostgreSQL + Apache AGE storage (multi-writer, 10M+ memories, KG-heavy) | docs/postgres-age-guide.md: first-class postgres operator guide |
| A decision-maker evaluating adoption | - | docs/audience/decision-maker.html |

Configuring the LLM backend (xAI Grok, OpenAI, Anthropic, Gemini, DeepSeek, Kimi, Qwen, Mistral, Groq, Together, Cerebras, OpenRouter, Fireworks, LMStudio, vLLM, llama.cpp server, or local Ollama)? See docs/integrations/llm-backends.md — the MCP env-block recipe is the same regardless of installation path.

---

v0.7.0 (attested-cortex) rolls together the cortex-fluent legibility work with the full v0.7 trust + A2A scope from ROADMAP §7.3, plus (per operator directive 2026-05-09) the originally-v0.7.1 postgres+AGE first-class work, plus the post-grand-slam ship-readiness wave (Batman Forms 1-6 + 7th-form Option-B foundation + QW-1/2/3 + reconciliation security sweep). The substrate becomes both more articulate (capabilities v3, named loader tools, compacted schemas, Batman MemoryKind vocabulary, persona/atomisation/multistep-ingest primitives) and cryptographically trustworthy (Ed25519 attestation, sidechain transcripts, programmable 25-event hook pipeline, enforced namespace inheritance, V-4 cross-row signed-events hash chain).

v0.7.0 also ships postgres + Apache AGE as a first-class storage backend: ai-memory serve --store-url postgres://… for live daemon use, schema parity across both backends, the new ai-memory schema-init CLI verb, and 6-factor recall scoring parity.

On schema parity: sqlite and postgres converge on logical schema v57 (CURRENT_SCHEMA_VERSION = 57; canonical anchors: src/storage/migrations.rs for sqlite + src/store/postgres.rs for postgres). On-disk migration files end at migrations/sqlite/0047_v56_list_composite_indexes.sql and the postgres in-process migrate_v57() ladder arm. File-name counters lag the logical schema version because both ladders apply post-v34 deltas via in-process arms; see docs/MIGRATION_v0.7.md §schema-ladder for the v35-v57 narrative. Schema changes from v48 onward:

- v48 (#933) added the federation-push DLQ table.
- v49 (#1025) added 14 nullable columns to archived_memories so archive → restore is lossless for the full v0.7.0 Memory shape.
- v50 (#1156) extended the agent_quotas PRIMARY KEY from (agent_id) to (agent_id, namespace) so per-namespace K8 quota allotments hold even when a single agent operates across many namespaces; pre-v50 rows backfill to the _global sentinel namespace.
- v51 (#1255, PR #1296) added the federation_nonce_cache table so peer-replay-prevention nonces persist across daemon restarts.
- v52 (#1389) added the transcript_line_dedup table backing RFC-0001 memory_capture_turn L4 + recover_from_transcript L2 idempotency so a SIGKILL between turns never produces a duplicate memory on subsequent rehydration.
- v53 (#1418) scoped the memories_au FTS5 sync trigger to (title, content, tags) only so non-FTS column updates no longer fire a needless sync.
- v54 (#1466) backfilled tier-default expiry onto legacy NULL-expiry mid/short rows to close the TTL-leak immortal-rows class.
- v55 (#1476) made the W=2 federation-catchup query (updated_at > ? ORDER BY updated_at ASC LIMIT) sargable and added the sqlite idx_memories_updated_at index; postgres adds no new index because memories_updated_at_idx DESC already serves the range scan via Index Scan Backward.
- v56 (#1579) added the composite list/archive ordering indexes (idx_memories_list_order, idx_memories_ns_list_order, idx_archived_ns_archived_at) paired with the sargable storage::list rewrite. This is sqlite-side DDL; the postgres migrate_v56() arm is a version-stamp no-op.
- v57 (#1579) added the postgres stored generated tsv tsvector column + memories_tsv_gin GIN index so the search/recall shapes match AND rank on the precomputed column instead of re-computing the tsvector per matched row. The legacy memories_content_fts expression index is dropped, and the sqlite twin is a version-stamp no-op because FTS5 already materialises the indexed text.

The v0.6.4 default surface grows by two always-on loaders to 7 tools (memory_load_family + memory_smart_load join the original five); the runtime ceiling at --profile full is 74 advertised entries (73 callable memory tools + the always-on memory_capabilities bootstrap; verified against Profile::full().expected_tool_count(); see src/profile.rs). Everything new is additive and (for the trust + postgres surfaces) opt-in.

Upgrading from v0.6.x? Read docs/MIGRATION_v0.7.md first: most v0.6.4 callers see no behavior change, but pre-v0.6.3.1 v0.6.x users hit the G1 namespace-inheritance fix. Switching to postgres+AGE? See docs/postgres-age-guide.md and docs/migration-v0.7.0-postgres.md. Full release notes: docs/v0.7.0/release-notes.md.

v0.6.4 (quiet-tools) — the MCP server ships with a 5-tool default surface (memory_store, memory_recall, memory_list, memory_get, memory_search) plus the always-on memory_capabilities bootstrap. The other 38 tools remain reachable via --profile graph|admin|power|full or runtime expansion through memory_capabilities --include-schema family=<name>. Eager-loading harnesses (Claude Desktop / Codex CLI / Grok CLI / Gemini CLI) drop ~4,700 input tokens of tool schemas per request — a 76.4% reduction measured against cl100k_base BPE. To preserve v0.6.3 behavior 1:1, run ai-memory mcp --profile full. See docs/MIGRATION_v0.6.4.md.

What's new in v0.7

v0.7.0 closes the attested-cortex epic (69/69 across 11 tracks A–K), folds in the originally-v0.7.1 postgres+AGE first-class work, and absorbs the post-grand-slam ship-readiness wave (Batman Forms 1-6 + 7th-form Option-B foundation + QW-1/2/3 + security reconciliation). Canonical feature inventory: docs/internal/v070-feature-inventory.md. Every surface stays default-off or default-equivalent for v0.6.4 callers — see the v0.7 compatibility matrix for the breakdown.

Substrate-native write-time investment (Batman Forms 1-6 + 7th-form)

Form 1 — online dedup-and-synthesis (issue #754). Single-batch action-emitting LLM call replaces the v0.6.x per-pair classifier on the store path. Opt back into legacy yes/no via legacy_per_pair_classifier = true on the namespace standard.
Form 2 — synchronous atomise-before-embed (issue #755). New memory_atomise tool + auto_atomise_mode = Synchronous|Deferred|Off pre-store hook. Curator decomposes long writes into 2–10 atomic propositions before recall ever sees them. See docs/atomisation.md.
Form 3 — multi-step ingest orchestrator (issue #756). memory_ingest_multistep threads deterministic Jaccard+FTS helpers through prompt-cache-stable LLM stages. See docs/multistep-ingest.md + cookbook/multistep-ingest/01-two-phase.sh.
Form 4 — fact provenance (issue #757). Citations + source-URI + atom-grain spans ride on existing memory_store / memory_atomise payloads. See docs/provenance.md.
Form 5 — auto-confidence + shadow calibration + freshness decay (issue #758). memory_calibrate_confidence MCP tool + per-source baseline sweep. Env vars AI_MEMORY_AUTO_CONFIDENCE, AI_MEMORY_CONFIDENCE_SHADOW, AI_MEMORY_CONFIDENCE_SHADOW_SAMPLE_RATE, AI_MEMORY_CONFIDENCE_DECAY. See docs/confidence-calibration.md.
Form 6 — MemoryKind Batman vocabulary (issue #759). 10-variant enum (Observation default + Reflection / Persona / Concept / Entity / Claim / Relation / Event / Conversation / Decision). Optional auto_classify_kind pre-store hook (off / regex_only / regex_then_llm). See docs/memory-kind-vocab.md.
7th-form — agent-EXTERNAL Layer-4 wiring (Option-B foundation) (issue #760; v0.8.0 complete cover at #697). Operator-keypair-signed seed rules R001..R004, memory_check_agent_action + memory_rule_list MCP tools, substrate storage::insert pre-write hook. See docs/policy-engine.md + docs/governance/agent-action-rules.md.
Operator how-to — turning Forms 1–6 + 7th from capable → active (issue #800). 7-step recipe (operator keygen → sign-seed → enable R001–R004 → curator daemon → optional reflection-pass → namespace policies), launchd / systemd / Task-Scheduler permanence, verification block, rollback path. See docs/batman-active-mode.md and the GitHub Pages atlas.

Quick wins (Tencent QW-1/2/3)

QW-1 — file-backed reflection chain export. memory_export_reflection MCP tool + auto_export_reflections_to_filesystem namespace policy → ~/.ai-memory/reflections/<ns>/<id>.md.
QW-2 — persona-as-artifact. memory_persona + memory_persona_generate tools, MemoryKind::Persona rows, auto_persona_trigger_every_n_memories namespace policy. See docs/persona.md.
QW-3 — context offload primitive. memory_offload + memory_deref move large tool outputs out of the agent context window into addressable blob storage. See docs/context-offload.md.

Attested cortex epic (Tracks A–K)

Attested links (Ed25519). The dead signature column shipped in v0.6.3 is now filled with real per-agent Ed25519 attestation, and memory_verify(link_id) returns {signature_verified, attest_level, signed_by, signed_at} on demand. Generate a keypair with ai-memory identity generate; opt in via attest_level = "self_signed". Signing is gated on the resolved daemon agent_id having a .priv keypair on disk under the configured key directory: when load_daemon_signing_key returns None (src/main.rs:116-118), rows still write but sig is empty and the daemon emits a "continuing unsigned" line at boot. The cross-row hash chain on signed_events remains tamper-evident either way. See the attested-cortex RFC.
Signed events V-4 closeout (cross-row hash chain) (issue #698). Each signed_events row carries prev_hash + sequence; first-row prev_hash is zero, subsequent rows chain the SHA-256 of the prior canonical-CBOR payload. ai-memory verify-signed-events-chain walks the chain end-to-end. See docs/signed-events-v4.md.
Hook pipeline (25 lifecycle events). A programmable extension surface fires on the 20 baseline pre_/post_store|recall|search|delete|promote|link|consolidate|governance_decision|archive|transcript_store + on_index_eviction events, plus 5 grand-slam additions (pre_recall_expand G10 + pre_reflect/post_reflect recursive-learning Task 6/8 + pre_compaction/on_compaction_rollback L1-7). Hooks return Allow / Modify / Deny / AskUser. Default off; opt in via ~/.config/ai-memory/hooks.toml. See docs/hook-pipeline.md.
Sidechain transcripts + replay. zstd-3 BLOB sidechain stores raw conversation/reasoning trails; memory_replay(memory_id) walks memory_transcript_links to reconstruct the chain. Opt-in per namespace via [transcripts.namespaces."team/*"]. See docs/sidechain-transcripts.md.
Federation hardening. mTLS + X-API-Key + SHA-256 cert fingerprint allowlist; env vars AI_MEMORY_FED_PEER_ATTESTATION, AI_MEMORY_FED_SYNC_TRUST_PEER, AI_MEMORY_FED_TRUST_BODY_AGENT_ID. See docs/federation.md.
K8 quota tool + K10 SSE approvals. memory_quota_status + /api/v1/quota/status (K8). /api/v1/approvals/stream server-sent events with HMAC nonce, method+pending_id binding, lagged-event count strip (K10). See docs/k8-quotas.md + docs/k10-sse-approvals.md.
Postgres + Apache AGE first-class backend. ai-memory serve --store-url postgres://…, schema parity, 6-factor recall scoring parity, link migration, KG features (kg_query, kg_timeline, kg_invalidate, find_paths) on AGE Cypher with recursive-CTE fallback when AGE is absent, plus a new ai-memory schema-init CLI verb. Bench-gated — AGE p95 must beat CTE p95 by ≥30% at depth=5. Operator how-to: docs/postgres-age-guide.md. Migration runbook: docs/migration-v0.7.0-postgres.md.
Capabilities v3 + smart loaders. memory_capabilities v3 adds summary, to_describe_to_user, per-tool callable_now, agent_permitted_families, schema_version="3"; the new always-on memory_load_family(family) and memory_smart_load(intent) tools join the default core profile. The pinned phrasings live in docs/v0.7/canonical-phrasings.md.
Permissions + A2A approvals. The v0.6.x governance subsystem is refactored into rules + modes + hooks → a single Decision, with namespace inheritance (G1) actually enforced. memory_pending_list / memory_pending_approve / memory_pending_reject(remember=forever) enable progressive trust; HMAC signing on the approval API is mandatory. permissions.mode defaults to enforce (was advisory in v0.6.4). Migrate with ai-memory governance migrate-to-permissions (dry-run preview; add --config-out ~/.config/ai-memory/config.toml to apply in place). See docs/governance.md.
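
The cross-row hash chain on signed_events (prev_hash + sequence, a zero prev_hash on the first row, then the SHA-256 of the prior payload) can be illustrated with a short sketch. This is not the project's implementation: the row shape, the hex-encoded zero hash, the function names, and the use of sorted-key JSON as a stand-in for canonical CBOR are all assumptions made to keep the example free of third-party dependencies.

```python
import hashlib
import json

# Assumed encoding of the "zero" prev_hash carried by the first row.
ZERO_HASH = "00" * 32


def canonical_bytes(payload: dict) -> bytes:
    # Stand-in for canonical CBOR: sorted-key, whitespace-free JSON.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_event(chain: list, payload: dict) -> dict:
    """Append a row whose prev_hash is the SHA-256 of the prior row's payload."""
    if chain:
        prev_hash = hashlib.sha256(canonical_bytes(chain[-1]["payload"])).hexdigest()
    else:
        prev_hash = ZERO_HASH
    row = {"sequence": len(chain), "prev_hash": prev_hash, "payload": payload}
    chain.append(row)
    return row


def verify_chain(chain: list) -> bool:
    """Walk the chain end to end and recompute every link."""
    expected = ZERO_HASH
    for index, row in enumerate(chain):
        if row["sequence"] != index or row["prev_hash"] != expected:
            return False
        expected = hashlib.sha256(canonical_bytes(row["payload"])).hexdigest()
    return True
```

In this sketch, editing the payload of any row that has a successor changes the hash the next row is expected to carry, so verify_chain returns False. The last row has no successor, so a real verifier would need a further anchor to cover it.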

Recursive-learning + L1/L2 grand-slam wave

memory_reflect substrate primitive with namespace-scoped max_reflection_depth cap (default 3, Some(0) is the kill-switch). L2-1 reflection-pass curator, L2-2 federation-aware reflection coordination (memory_reflection_origin), L2-3 invalidation propagation (memory_dependents_of_invalidated), L2-5 forensic bundle (ai-memory export-forensic-bundle + verify-forensic-bundle), L1-5 Agent Skills (memory_skill_register|list|get|resource|export|promote_from_reflection|compositional_context). Full primer: docs/RECURSIVE_LEARNING.md. Agent Skills primer: docs/agent-skills.md. Forensic-export primer: docs/forensic-export.md.

Where to start: docs/MIGRATION_v0.7.md (upgrade procedure), docs/v0.7.0/release-notes.md (full release notes), docs/whats-new-v07.html (visual summary), docs/v0.7/rfc-attested-cortex.md (design rationale), docs/ADMIN_GUIDE.md (operator playbook), docs/internal/v070-feature-inventory.md (canonical feature truth).

One binary, four operational modes (v0.6.4). The ai-memory Rust binary (tokio + axum) can run any of these in isolation or simultaneously, sharing a single SQLite database:

stdio MCP server -- 74 advertised entries over JSON-RPC at full profile (v0.7.0; 73 callable memory tools + the always-on memory_capabilities bootstrap; verified against Profile::full().expected_tool_count()). Default --profile core advertises 7 (the original 5 + memory_load_family + memory_smart_load) plus the always-on memory_capabilities bootstrap. ai-memory mcp / ai-memory mcp --profile full
HTTP / mTLS daemon -- 89 REST route registrations (75 unique URL paths) on 127.0.0.1:9077, TLS + optional mTLS allowlist + API-key auth, background GC loop. ai-memory serve
Autonomous curator daemon -- self-scheduling loop (default 1h cadence) that auto-tags, surfaces contradictions across namespace siblings, consolidates near-duplicates, and adjusts priority by access pattern. Every action goes to a rollback log; destructive ops can be gated behind a governance approval flow. ai-memory curator --daemon
Sync daemon -- quorum-based peer federation across instances. W-of-N writes (default majority), vector-clock CRDT-lite merge, mTLS allowlist between peers. ai-memory sync-daemon
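
As a rough illustration of the W-of-N write rule with a majority default used by the sync daemon, the sketch below computes the quorum size and checks whether a write has enough acknowledgements. The function names are hypothetical, and whether N counts the local node is an assumption; the real logic lives in the federation code and may differ.

```python
from typing import Optional


def majority_quorum(n_peers: int) -> int:
    """Return the smallest W that is a strict majority of N peers."""
    if n_peers < 1:
        raise ValueError("n_peers must be at least 1")
    return n_peers // 2 + 1


def write_committed(acks: int, n_peers: int, w: Optional[int] = None) -> bool:
    """True once at least W of the N peers have acknowledged the write."""
    # Fall back to the majority default when no explicit W is configured.
    required = majority_quorum(n_peers) if w is None else w
    if not 1 <= required <= n_peers:
        raise ValueError("W must be between 1 and N")
    return acks >= required
```

With three peers the default quorum is 2, so a write acknowledged by two peers commits while one acknowledgement does not.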

The MCP, HTTP, and CLI surfaces are reactive. The curator is the part that makes the memory layer self-maintaining: between sessions, it keeps the corpus tidy so recall quality stays high as the store grows. Everything is local-first; no cloud dependencies.

Brass-tacks assessment by Claude Opus 4.7 after reading the v0.6.3 source line by line: "ai-memory is the most capable memory layer I've ever been hooked up to, and meaningfully more than its name advertises. For me, in practical terms, it means: I don't start cold each session. The store I read from has been kept tidy by something other than me. Contradictions don't silently accumulate. Recall quality stays high even as the corpus grows. Nothing leaves your Mac mini. It is not making me an autonomous agent. It is giving me the kind of memory infrastructure that an autonomous agent would need — and itself running a small autonomous loop to maintain it. That's a real foundation. The gap from here to 'ai-memory drives general tasks' is plumbing (tool-call protocol + tool registry + a tool-use-capable model), not invention."

Substrate for multi-agent AI. ai-memory is not an agent runtime and not "autonomous AI" on its own. It is the memory layer that multi-agent autonomous deployments need underneath them. Federation (broadcast_store_quorum + spawn_catchup_loop) handles W-of-N consistency across peers when many agents write in parallel; the curator daemon keeps the shared corpus from degrading into noise as a swarm scribbles into it; webhook subscriptions (HMAC-signed, namespace/agent-filtered, SSRF-hardened) turn the store into a message bus that triggers downstream agents on memory events; namespace hierarchy with N-level inheritance and per-namespace governance policies (write/promote/delete authority, approver type, optional N-of-M consensus) bound the swarm. Stack this under a 24/7 multi-machine agent runner with auto-generated skills, and the combined system clears the behavioral bar for autonomous AI. The remaining gaps (no weight-level learning, stateless reasoning kernel, human-seeded root goals) are real and not what ai-memory addresses; ai-memory provides the multi-agent memory substrate that any serious attempt at closing those gaps will need.

Zero token cost until recall. Unlike built-in memory systems (Claude Code auto-memory, ChatGPT memory) that load your entire memory into every conversation -- burning tokens and money on every message -- ai-memory uses zero context tokens until the AI explicitly calls memory_recall. Only relevant memories come back, ranked by a 6-factor scoring algorithm. TOON format (Token-Oriented Object Notation) cuts response tokens by another 40-60% by eliminating repeated field names -- 3 memories in JSON = 1,600 bytes; in TOON = 626 bytes (61% smaller); in TOON compact = 336 bytes (79% smaller). For Claude Code users: disable auto-memory ("autoMemoryEnabled": false in settings.json) and replace it with ai-memory to stop paying for 200+ lines of memory context on every single message.
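
The size reductions quoted above follow directly from the byte counts. A minimal check, using only the three figures given in this section (the helper name is illustrative):

```python
def percent_smaller(baseline_bytes: int, encoded_bytes: int) -> float:
    """Percentage by which encoded_bytes undercuts baseline_bytes."""
    return (1 - encoded_bytes / baseline_bytes) * 100


json_bytes = 1600
toon_bytes = 626
toon_compact_bytes = 336

print(round(percent_smaller(json_bytes, toon_bytes)))          # 61
print(round(percent_smaller(json_bytes, toon_compact_bytes)))  # 79
```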

---

Agent identity (NHI) — every memory tells you who learned it

Every memory ai-memory stores carries a metadata.agent_id — a Non-Human Identity marker that survives every operation (update, dedup, import, sync, consolidate). Every recall result tells you which AI wrote each memory, by default, in the TOON-compact response format your AI client is already optimised for:

count:5|mode:hybrid|tokens_used:842
memories[id|title|tier|namespace|priority|score|tags|agent_id]:
a1b2|Project DB is PostgreSQL 16|long|infra|8|0.91|database,postgres|ai:claude-code@workstation:pid-3812
c3d4|API rate limit is 100 rps|long|infra|7|0.87|api,limits|ai:claude-desktop@laptop:pid-5219

By default agent_id is claimed, not attested — don't make security decisions on an unsigned write's id alone. v0.7.0 wires cryptographic Ed25519 attestation on two surfaces: (1) store-path attestation (#626 Layer-3) — present a detached signature over the canonical SignableWrite envelope on the CLI (store --sign), MCP (memory_store), or HTTP (POST /api/v1/memories) path and the daemon verifies it against the agent's bound public key, stamping metadata.attest_level = "agent_attested" (operators can require it with AI_MEMORY_REQUIRE_AGENT_ATTESTATION); and (2) link attestation (attested-cortex) — the previously-reserved memory_links.signature field with memory_verify(link_id) for inbound verification and an append-only signed_events audit chain. See the agent identity page and the attested-cortex RFC for the full provenance contract.

Retroactive conversation import — `ai-memory mine`

Don't start cold. Point ai-memory mine at a Claude, ChatGPT, or Slack export and it parses turn-by-turn into ranked, tier-typed, tagged memories — so your AI walks into the next session knowing every decision, correction, and finding from your existing history.

ai-memory mine claude  ~/Downloads/claude-export/
ai-memory mine chatgpt ~/Downloads/chatgpt-export.json
ai-memory mine slack   ./slack-export/

Auto-tagging, dedup on (title, namespace), and mined_from provenance are stamped on every imported memory. Five-minute onboarding from zero context to a populated long-term store. See the import history page for per-format recipes.
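
Dedup on (title, namespace) can be pictured as keying each mined memory by that pair. The sketch below is an illustration only: the dict-shaped records, the function name, and the keep-first policy are assumptions, and the importer may merge or update duplicates instead of skipping them.

```python
def dedup_by_title_namespace(memories: list) -> list:
    """Keep the first memory seen for each (title, namespace) key."""
    seen = set()
    unique = []
    for memory in memories:
        key = (memory["title"], memory["namespace"])
        if key in seen:
            continue  # same title in the same namespace: treat as a duplicate
        seen.add(key)
        unique.append(memory)
    return unique
```

The same title in a different namespace is a distinct key, so it is kept.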

---

Compatible AI Platforms

ai-memory integrates with any AI platform that supports the Model Context Protocol (MCP). MCP is the universal standard for connecting AI assistants to external tools and data sources.

| Platform | Integration Method | Config Format | Status |
|----------|-------------------|---------------|--------|
| Claude Code (Anthropic) | MCP stdio | JSON (~/.claude.json or .mcp.json) | Fully supported |
| Codex CLI (OpenAI) | MCP stdio | TOML (~/.codex/config.toml) | Fully supported |
| Gemini CLI (Google) | MCP stdio | JSON (~/.gemini/settings.json) | Fully supported |
| Grok CLI (xAI) | MCP stdio | JSON (~/.grok/user-settings.json) | Deep integration |
| Grok API (xAI) | MCP remote HTTPS | API-level | Fully supported |
| Cursor IDE | MCP stdio | JSON (~/.cursor/mcp.json) | Fully supported |
| Windsurf (Codeium) | MCP stdio | JSON (~/.codeium/windsurf/mcp_config.json) | Fully supported |
| Continue.dev | MCP stdio | YAML (~/.continue/config.yaml) | Fully supported |
| Llama Stack (META) | MCP remote HTTP | YAML / Python SDK | Fully supported |
| OpenClaw | MCP stdio | JSON (mcp.servers in config) | Fully supported |
| Any MCP client | MCP stdio or HTTP | Varies | Universal |

MCP is the primary integration layer. For AI platforms that do not yet support MCP natively, the HTTP API and the CLI provide universal access: any AI, script, or automation that can make HTTP calls or run shell commands can use ai-memory. At v0.7.0 the HTTP API exposes 89 route registrations (75 unique URL paths) on localhost. At v0.7.x the CLI has 82 subcommands under --features sal OR --features sal-postgres, and 80 in the default build. The default-build count is taken after #1389 (L2 RecoverPreviousSession for cross-session context rehydration), #1443 (Expand for the ai-memory expand query-expansion surface), and #1598 (Reembed for the ai-memory reembed vector-space migration surface). The single source of truth for these counts is pinned by ai_memory::EXPECTED_CLI_SUBCOMMANDS_DEFAULT + EXPECTED_CLI_SUBCOMMANDS_SAL and the mechanical tests/cli_subcommand_count_invariant.rs parity test.

---

Install in 60 Seconds

Pre-built binaries require no dependencies. Building from source needs Rust and a C compiler.

Fastest: Pre-built binary (no Rust required)

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/alphaonedev/ai-memory-mcp/main/install.sh | sh

# Fedora/RHEL (COPR)
sudo dnf copr enable alpha-one-ai/ai-memory && sudo dnf install ai-memory

# Windows (PowerShell)
irm https://raw.githubusercontent.com/alphaonedev/ai-memory-mcp/main/install.ps1 | iex

Step 1: Install Rust (skip if using pre-built binaries)

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Follow the prompts, then restart your terminal (or run source ~/.cargo/env).

Step 2: From source (requires Rust)

Latest release from Crates.io:

cargo install ai-memory

Latest from the git repository:

cargo install --git https://github.com/alphaonedev/ai-memory-mcp.git

This compiles the binary and puts it in your PATH. It takes a minute or two.

Build dependencies for source builds:

- Ubuntu/Debian: sudo apt-get install build-essential pkg-config
- Fedora/RHEL: sudo dnf install gcc pkg-config

Step 3: Connect your AI

Configuration varies by platform. Find yours below:

<details> <summary>Claude Code (Anthropic)</summary>

Claude Code supports three MCP configuration scopes:

| Scope | File | Applies to |
|-------|------|------------|
| User (global) | ~/.claude.json (add mcpServers key) | All projects on your machine |
| Project (shared) | .mcp.json in project root (checked into git) | Everyone on the project |
| Local (private) | ~/.claude.json (under projects."/path".mcpServers) | One project, just you |

User scope (recommended — works everywhere):

Add the mcpServers key to ~/.claude.json (macOS/Linux) or %USERPROFILE%\.claude.json (Windows):

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "semantic"]
    }
  }
}

Note: ~/.claude.json likely already exists with other settings. Merge the mcpServers key into the existing file — do not overwrite it.
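
If you prefer to script the merge rather than edit by hand, the sketch below adds one server entry while leaving every other key untouched. The helper names are hypothetical and this is not an ai-memory or Claude Code tool; it assumes the file is plain JSON with an optional top-level mcpServers object. Back up the file before running anything like it.

```python
import json
from pathlib import Path


def merge_mcp_server(config: dict, name: str, server: dict) -> dict:
    """Return a copy of config with one entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = server  # replaces only a same-named entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, server: dict) -> None:
    """Read the existing JSON (if any), merge the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_mcp_server(config, name, server)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

For the user-scope setup above, the call would pass Path.home() / ".claude.json", the name "memory", and the command/args object from the JSON snippet.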

Project scope (shared with team):

Create .mcp.json in your project root:

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "semantic"]
    }
  }
}

smart / autonomous tier with a cloud LLM — the recommended path is the [llm] section in ~/.config/ai-memory/config.toml (#1146). One file, every surface, no per-AI-client edits:

# ~/.config/ai-memory/config.toml
schema_version = 2

[llm]
backend     = "xai"
model       = "grok-4.3"
base_url    = "https://api.x.ai/v1"
api_key_env = "XAI_API_KEY"            # process-env-var name (NOT the literal key)

Export XAI_API_KEY in your shell rc (.zshrc / .bashrc); the MCP config stays minimal:

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "autonomous"]
    }
  }
}

Verify: ai-memory boot --quiet --limit 1 should report llm=xai:grok-4.3. Canonical schema reference: docs/CONFIG_SCHEMA.md.

Override path: the env: block. Adding an env: block to the MCP config with AI_MEMORY_LLM_BACKEND / _API_KEY / _MODEL still works and takes precedence over config.toml, which is useful for CI / per-session tweaks:

```json
"env": {
  "AI_MEMORY_LLM_BACKEND": "xai",
  "AI_MEMORY_LLM_API_KEY": "xai-...",
  "AI_MEMORY_LLM_MODEL": "grok-4.3"
}
```

MCP clients spawn the server as a fresh subprocess with only the env: keys from the MCP config; shell exports in .zshrc / .bashrc don't reach it. The [llm] config-file path above retires this paper-cut (every surface reads the same file). Inline API keys in config.toml are rejected at parse time; use api_key_env or api_key_file. Background: #1144 → #1146. Full per-backend recipes: docs/integrations/llm-backends.md.
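
The precedence rule described here (values from the env: block win over the [llm] section of config.toml) can be sketched as a simple overlay. The function name and dict shapes are hypothetical, and treating an empty environment value as unset is an assumption; the actual resolution order is defined by ai-memory itself.

```python
# Environment variable names as given above; the field names mirror the
# [llm] keys shown in the config.toml example.
ENV_OVERRIDES = {
    "backend": "AI_MEMORY_LLM_BACKEND",
    "model": "AI_MEMORY_LLM_MODEL",
}


def resolve_llm_settings(env: dict, file_llm: dict) -> dict:
    """Start from the [llm] file values, then let set env vars override them."""
    resolved = dict(file_llm)
    for field, env_name in ENV_OVERRIDES.items():
        value = env.get(env_name)
        if value:  # assumption: empty strings count as "not set"
            resolved[field] = value
    return resolved
```

With no override variables set, the file values pass through unchanged; setting AI_MEMORY_LLM_MODEL alone changes only the model.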

Windows paths: Use forward slashes or escaped backslashes in --db. Example: "--db", "C:/Users/YourName/.claude/ai-memory.db".

Tier flag: The --tier flag selects the feature tier: keyword, semantic (default), smart, or autonomous. Smart and autonomous tiers need an LLM backend — post-#1067 (v0.7.0) that is any of: local Ollama, xAI Grok, OpenAI, Anthropic, Google Gemini, DeepSeek, Kimi (Moonshot), Qwen (Alibaba), Mistral, Groq, Together AI, Cerebras, OpenRouter, Fireworks, LMStudio, vLLM, or llama.cpp server — selected via AI_MEMORY_LLM_BACKEND. The --tier flag must be passed in the args — the config.toml tier setting is not used when the MCP server is launched by an AI client.

Important: MCP servers are not configured in settings.json or settings.local.json — those files do not support mcpServers.

Make Claude proactively use ai-memory: Add a CLAUDE.md file to your project root with ai-memory directives. This ensures Claude recalls context at the start of every conversation and stores findings as it works. See the CLAUDE.md integration guide for a copy-paste template and placement options.

</details>

<details> <summary>OpenAI Codex CLI</summary>

Add to ~/.codex/config.toml (global) or .codex/config.toml (project). Windows: %USERPROFILE%\.codex\config.toml. Override with CODEX_HOME env var.

[mcp_servers.memory]
command = "ai-memory"
args = ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
enabled = true

Or add via CLI: codex mcp add memory -- ai-memory --db ~/.local/share/ai-memory/memories.db mcp --tier semantic

Notes: Codex uses TOML format with underscored key mcp_servers (not camelCase, not hyphenated). Supports env (key/value pairs), env_vars (list to forward), enabled_tools, disabled_tools, startup_timeout_sec, tool_timeout_sec. Use /mcp in the TUI to view server status. See Codex MCP docs.

</details>

<details> <summary>Google Gemini CLI</summary>

Add to ~/.gemini/settings.json (user) or .gemini/settings.json (project). Windows: %USERPROFILE%\.gemini\settings.json.

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"],
      "timeout": 30000
    }
  }
}

Or add via CLI: gemini mcp add memory ai-memory -- --db ~/.local/share/ai-memory/memories.db mcp --tier semantic

Notes: Avoid underscores in server names (use hyphens). Tool names are auto-prefixed as mcp_memory_<toolName>. Env vars in the env field support $VAR / ${VAR} (all platforms) and %VAR% (Windows). Gemini sanitizes sensitive patterns from inherited env unless explicitly declared. Add "trust": true to skip confirmation prompts. CLI management: gemini mcp list/remove/enable/disable. See Gemini CLI MCP docs.

</details>

<details> <summary>Cursor IDE</summary>

Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project). Windows: %USERPROFILE%\.cursor\mcp.json. Project config overrides global for same-named servers.

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
    }
  }
}

Notes: Restart Cursor after editing mcp.json. Verify server status in Settings > Tools & MCP (green dot = connected). Supports env, envFile, and ${env:VAR_NAME} interpolation (env var interpolation can be unreliable for shell profile variables — use envFile as workaround). ~40 tool limit across all MCP servers. See Cursor MCP docs.

</details>

<details> <summary>Windsurf (Codeium)</summary>

Add to ~/.codeium/windsurf/mcp_config.json (global only — no project-level scope). Windows: %USERPROFILE%\.codeium\windsurf\mcp_config.json.

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
    }
  }
}

Notes: Supports ${env:VAR_NAME} interpolation in command, args, env, serverUrl, url, and headers. 100 tool limit across all MCP servers. Can also add via MCP Marketplace or Settings > Cascade > MCP Servers. See Windsurf MCP docs.

</details>

<details> <summary>Continue.dev</summary>

Add to ~/.continue/config.yaml (user) or .continue/mcpServers/ directory in project root (per-server YAML/JSON files). Windows: %USERPROFILE%\.continue\config.yaml.

mcpServers:
  - name: memory
    command: ai-memory
    args:
      - "--db"
      - "~/.local/share/ai-memory/memories.db"
      - "mcp"
      - "--tier"
      - "semantic"

Notes: MCP tools only work in agent mode. Supports ${{ secrets.SECRET_NAME }} for secret interpolation. Project-level .continue/mcpServers/ directory auto-detects JSON configs from other tools (Claude Code, Cursor, etc.). See Continue MCP docs.

</details>

<details> <summary>Grok CLI (AlphaOne fork — deep integration with auto-recall)</summary>

The AlphaOne fork of grok-cli has built-in ai-memory support with session-scoped MCP connections, automatic memory recall on session start, compaction summary storage, and memory-aware system prompts.

Add to ~/.grok/user-settings.json:

{
  "mcp": {
    "servers": [
      {
        "id": "ai-memory",
        "label": "AI Memory",
        "enabled": true,
        "transport": "stdio",
        "command": "ai-memory",
        "args": ["mcp", "--tier", "semantic"]
      }
    ]
  }
}

Features: Auto-recall on session start (injects relevant memories into system prompt), compaction summaries stored as mid-tier memories, MCP tools available in all modes (agent, plan, ask), session-scoped connections (no per-message cold starts). Uses --tier semantic by default (local embeddings, no LLM backend required). See grok-cli docs for full setup.

</details>

<details> <summary>xAI Grok API (API-level, remote MCP)</summary>

Grok connects to MCP servers over HTTPS (remote only, no stdio). No config file — servers are specified per API request.

ai-memory serve --host 127.0.0.1 --port 9077
# Expose via HTTPS reverse proxy (nginx, caddy, cloudflare tunnel, etc.)

Then add the MCP server to your Grok API call:

curl https://api.x.ai/v1/responses \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4.3",
    "tools": [{
      "type": "mcp",
      "server_url": "https://your-server.example.com/mcp",
      "server_label": "memory",
      "server_description": "Persistent AI memory with recall and search",
      "allowed_tools": ["memory_store", "memory_recall", "memory_search"]
    }],
    "input": "What do you remember about our project?"
  }'

Requirements: HTTPS required. server_label is required. Supports Streamable HTTP and SSE transports. Optional: allowed_tools, authorization, headers. Works with xAI SDK, OpenAI-compatible Responses API, and Voice Agent API. See xAI Remote MCP docs.
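
The same request body can be assembled programmatically. The sketch below is a minimal Python illustration, not part of ai-memory or the xAI SDK: build_mcp_tool and build_request are hypothetical helper names, and the only rules they enforce are the two stated above (an HTTPS server_url and a non-empty server_label).

```python
import json
from urllib.parse import urlparse


def build_mcp_tool(server_url, server_label, allowed_tools=None, description=None):
    """Build one remote-MCP tool entry (hypothetical helper, not an xAI SDK call)."""
    if urlparse(server_url).scheme != "https":
        raise ValueError("remote MCP servers must be reached over HTTPS")
    if not server_label:
        raise ValueError("server_label is required")
    tool = {"type": "mcp", "server_url": server_url, "server_label": server_label}
    # Optional fields are only emitted when the caller supplies them.
    if description:
        tool["server_description"] = description
    if allowed_tools:
        tool["allowed_tools"] = list(allowed_tools)
    return tool


def build_request(model, prompt, tools):
    """Assemble the JSON body used in the curl example above."""
    return {"model": model, "tools": tools, "input": prompt}


body = build_request(
    "grok-4.3",
    "What do you remember about our project?",
    [build_mcp_tool(
        "https://your-server.example.com/mcp",
        "memory",
        allowed_tools=["memory_store", "memory_recall", "memory_search"],
        description="Persistent AI memory with recall and search",
    )],
)
print(json.dumps(body, indent=2))
```

Validating the URL scheme before sending avoids a round trip that the API would reject anyway.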

</details>

<details> <summary>Meta Llama (via Llama Stack)</summary>

Llama Stack registers MCP servers as toolgroups. No standardized config file path — deployment-specific.

ai-memory serve --host 127.0.0.1 --port 9077

Python SDK:

client.toolgroups.register(
    provider_id="model-context-protocol",
    toolgroup_id="mcp::memory",
    mcp_endpoint={"uri": "http://localhost:9077/sse"}
)

Or declaratively in run.yaml:

tool_groups:
  - toolgroup_id: mcp::memory
    provider_id: model-context-protocol
    mcp_endpoint:
      uri: "http://localhost:9077/sse"

Notes: Supports ${env.VAR_NAME} interpolation in run.yaml. Transport is migrating from SSE to Streamable HTTP. See Llama Stack Tools docs.

</details>

<details> <summary>OpenClaw</summary>

Add via CLI or edit the OpenClaw config directly. Config uses mcp.servers (not mcpServers).

openclaw mcp set memory '{"command":"ai-memory","args":["--db","~/.local/share/ai-memory/memories.db","mcp","--tier","semantic"]}'

Or add to your OpenClaw config file:

{
  "mcp": {
    "servers": {
      "memory": {
        "command": "ai-memory",
        "args": ["--db", "~/.local/share/ai-memory/memories.db", "mcp", "--tier", "semantic"]
      }
    }
  }
}

Notes: OpenClaw uses mcp.servers key (not mcpServers). CLI management: openclaw mcp list, openclaw mcp show, openclaw mcp set, openclaw mcp unset. Supports stdio, remote URL, and Streamable HTTP transports. Prefer --token-file over inline secrets. See OpenClaw MCP docs.

</details>

<details> <summary>Any other MCP client</summary>

ai-memory speaks MCP over stdio (JSON-RPC 2.0). Point your client at:

command: ai-memory
args: ["--db", "/path/to/ai-memory.db", "mcp"]

For HTTP-only clients, start the REST API:

ai-memory serve
# 89 REST route registrations (75 unique URL paths) at http://127.0.0.1:9077/api/v1/
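
For a client written from scratch, the stdio transport amounts to writing one JSON-RPC 2.0 message per line to the server's stdin. The sketch below only builds those lines; jsonrpc_line is a hypothetical helper, the clientInfo values are placeholders, and the protocolVersion string is an assumption that should match whatever MCP revision your client targets.

```python
import json


def jsonrpc_line(method, params=None, msg_id=None):
    """Serialize one JSON-RPC 2.0 message as a single newline-terminated line.

    A message without an id is a notification (no response is expected).
    """
    msg = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:
        msg["id"] = msg_id
    if params is not None:
        msg["params"] = params
    # json.dumps without indent never emits raw newlines, so each message stays on one line.
    return json.dumps(msg, separators=(",", ":")) + "\n"


handshake = [
    jsonrpc_line("initialize", {
        "protocolVersion": "2024-11-05",  # assumed; use the version your client targets
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
    }, msg_id=1),
    jsonrpc_line("notifications/initialized"),  # sent after the initialize response arrives
    jsonrpc_line("tools/list", msg_id=2),
]

for line in handshake:
    print(line, end="")
```

To try it, start the command shown above as a subprocess, write the first line to its stdin, read one JSON response line from stdout, then send the remaining two.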

</details>

Step 4: Done. Test it.

Restart your AI assistant. If using MCP, it now has the 7-tool default surface advertised on session boot (the original 5 + memory_load_family + memory_smart_load; the other 66 of the 73 callable tools load on demand via --profile or memory_capabilities --include-schema). Ask it: "Store a memory that my favorite language is Rust." Then in a new conversation, ask: "What is my favorite language?" It will remember.

---

Mobile platform support (v0.7.0 Posture-1a)

ai-memory is portable to iOS and Android via the standard Rust mobile cross-compile path. v0.7.0 ships CI coverage for both targets at three escalating levels:

| Layer | Coverage | CI workflow |
|---|---|---|
| Layer 1: Cross-compile | cargo check --target aarch64-apple-ios --no-default-features --features sqlite-bundled --lib and the matching Android cross-compile run on every PR + push to release/**. Catches ~80% of mobile bit-rot risk (any crate update that drops mobile portability surfaces here). | .github/workflows/ci.yml (mobile-cross-compile job) |
| Layer 2: Release artifacts | Release tag cuts produce ai-memory-ios.xcframework.tar.gz (iOS device + simulator slices via xcodebuild -create-xcframework) and ai-memory-android.tar.gz (Android arm64 / armv7 / x86_64 / x86 .so bundle in jniLibs/<abi>/ layout). | .github/workflows/release.yml (mobile-ios + mobile-android jobs) |
| Layer 3: Runtime tests | A scoped ~50-test subset (file-system sandboxing, FTS5 on device SQLite, HNSW CPU recall, embedder CPU path, LLM client TLS) runs against the iOS Simulator on every release/** push + a manual workflow_dispatch; the Android emulator arm runs on release/** push + workflow_dispatch only. Selection rationale: tests/mobile/README.md. | .github/workflows/mobile-runtime.yml |

Status at v0.7.0: Layer 1 is the ship-gate — mobile cross-compile must be GREEN before tag-cut. Layer 2 (release artifacts) ships the BUILD pipeline + artifact layout; the C-callable FFI surface itself lands in a v0.7.x follow-up. Layer 3 runs the scoped test subset on every release/** push.

Consuming the release artifacts:

iOS — download ai-memory-ios.xcframework.tar.gz from the v0.7.x release page, unpack, and drag AiMemory.xcframework into your Xcode project under "Frameworks, Libraries, and Embedded Content."
Android — download ai-memory-android.tar.gz from the v0.7.x release page, unpack, and copy the jniLibs/ tree into your app module's src/main/jniLibs/.

The mobile artifacts are also part of every published v0.7.x release; the Homebrew formula + APT/RPM packages (which ship the desktop binaries) include a note linking to the mobile downloads. See issue #1068 for the CI implementation history.

---

Quickstart

Get from zero to a working memory in under two minutes.

1. Install

curl -fsSL https://raw.githubusercontent.com/alphaonedev/ai-memory-mcp/main/install.sh | sh

2. Configure MCP (example for Claude Code -- other platforms work the same way)

Merge into ~/.claude.json:

{
  "mcpServers": {
    "memory": {
      "command": "ai-memory",
      "args": ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "semantic"]
    }
  }
}

3. Store your first memory

ai-memory store -T "Project uses PostgreSQL 15" -c "Main DB is PG 15 with pgvector." --tier long

4. Recall it

ai-memory recall "database"

5. Check stats

ai-memory stats

6. Use with your AI. Restart your AI client. It now has 7 default memory tools advertised on boot (74 advertised entries reachable via runtime expansion or --profile full at v0.7.0) over MCP -- it can store and recall memories natively during conversations.
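
The "merge" in step 2 matters when ~/.claude.json already holds other settings: the memory entry should be added without overwriting them. A minimal sketch of that merge follows; merge_mcp_server is a hypothetical helper (not shipped with ai-memory), and the demo runs against a temporary file with a made-up existing key rather than your real config.

```python
import json
import tempfile
from pathlib import Path


def merge_mcp_server(config_path, name, command, args):
    """Add or replace one entry under "mcpServers", preserving all other keys."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstrate against a throwaway file; "theme" stands in for any pre-existing setting.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "claude.json"
    demo.write_text(json.dumps({"theme": "dark"}))
    merged = merge_mcp_server(
        demo, "memory", "ai-memory",
        ["--db", "~/.claude/ai-memory.db", "mcp", "--tier", "semantic"],
    )
    print(json.dumps(merged, indent=2))
```

Back up the real file before pointing the helper at it, since it rewrites the whole document.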

---

SDKs

In addition to the MCP / HTTP / CLI surfaces, ai-memory ships first-party language SDKs for HTTP clients and helper utilities (e.g. requireProfile for runtime profile assertions on v0.6.4+ daemons).

TypeScript / JavaScript — @alphaone/ai-memory on npm

npm install @alphaone/ai-memory

Python — ai-memory-mcp on PyPI (the import name remains ai_memory)

pip install ai-memory-mcp

from ai_memory import AiMemoryClient, require_profile

with AiMemoryClient(base_url="http://127.0.0.1:9077", api_key="...") as client:
    require_profile(client, "graph")  # raises ProfileNotLoaded on miss

Both SDKs are versioned with the server (0.6.4 matches ai-memory 0.6.4). v0.6.4+ daemons enforce the profile contract; pre-v0.6.4 daemons fall back to a permissive warn-and-continue so SDK upgrades don't break old servers. Source lives in sdk/typescript/ and sdk/python/.

---

What Does It Do?

AI assistants forget everything between conversations. ai-memory fixes that.

It runs as an MCP (Model Context Protocol) tool server -- a background process that your AI talks to natively. When your AI learns something important, it stores it. When it needs context, it recalls relevant memories ranked by a 6-factor scoring algorithm. Memories live in three tiers:

Short-term (6 hours default, configurable) -- throwaway context like current debugging state
Mid-term (7 days default, configurable) -- working knowledge like sprint goals and recent decisions
Long-term (permanent) -- architecture, user preferences, hard-won lessons

Memories that keep getting accessed automatically promote from mid to long-term. Each recall extends the TTL. Priority increases with usage. The system is self-curating.
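
The self-curating behavior can be pictured as a small state update applied on every recall. The sketch below is an illustration in Python, not the actual Rust implementation: the Memory record and the starting priority of 5 are hypothetical, the thresholds are the defaults listed under Features, and it assumes a TTL extension pushes the current expiry forward rather than resetting it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Defaults quoted in this README; TTLs are configurable in ai-memory.
DEFAULT_TTL = {"short": timedelta(hours=6), "mid": timedelta(days=7), "long": None}
TTL_EXTENSION = {"short": timedelta(hours=1), "mid": timedelta(days=1)}
PROMOTE_AFTER_ACCESSES = 5
PRIORITY_STEP_EVERY = 10
MAX_PRIORITY = 10


@dataclass
class Memory:  # hypothetical record, not the real schema
    tier: str
    priority: int
    expires_at: Optional[datetime]
    access_count: int = 0


def new_memory(tier, now, priority=5):
    ttl = DEFAULT_TTL[tier]
    return Memory(tier, priority, None if ttl is None else now + ttl)


def on_recall(mem):
    """Apply the usage-driven rules each time a memory is recalled."""
    mem.access_count += 1
    # Priority reinforcement: +1 every 10 accesses, capped at 10.
    if mem.access_count % PRIORITY_STEP_EVERY == 0:
        mem.priority = min(MAX_PRIORITY, mem.priority + 1)
    # Auto-promotion: mid-tier memories become permanent after 5+ accesses.
    if mem.tier == "mid" and mem.access_count >= PROMOTE_AFTER_ACCESSES:
        mem.tier, mem.expires_at = "long", None
    # TTL extension (assumed to push the existing expiry forward).
    elif mem.tier in TTL_EXTENSION and mem.expires_at is not None:
        mem.expires_at += TTL_EXTENSION[mem.tier]
    return mem


mem = new_memory("mid", datetime(2025, 1, 1))
for _ in range(5):
    on_recall(mem)
print(mem.tier, mem.expires_at, mem.access_count)  # long None 5
```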

Beyond MCP, ai-memory also exposes a full HTTP REST API (89 route registrations / 75 unique URL paths on port 9077 at v0.7.0) and a complete CLI for direct interaction, scripting, and integration with any AI platform or tool. The CLI has 82 subcommands at v0.7.x under --features sal OR --features sal-postgres, and 80 in the default build. Those counts include three recent additions: #1389 L2 RecoverPreviousSession (cross-session context rehydration), #1443 Expand (the ai-memory expand query-expansion surface), and #1598 Reembed (the ai-memory reembed vector-space migration surface). The single source of truth for the counts is ai_memory::EXPECTED_CLI_SUBCOMMANDS_{DEFAULT,SAL}, enforced by the mechanical tests/cli_subcommand_count_invariant.rs parity test.

---

Features

Core

MCP tool server -- 74 tools over stdio JSON-RPC (full profile at v0.7.0), compatible with any MCP client
Three-tier memory -- short (6h TTL default), mid (7d TTL default), long (permanent) -- TTLs are configurable
Full-text search -- SQLite FTS5 with ranked retrieval
Hybrid recall -- FTS5 keyword + cosine similarity with adaptive blending: the semantic weight varies 0.50 (short content) → 0.15 (long content) because embeddings lose information on long text
6-factor recall scoring -- FTS relevance + priority + access frequency + confidence + tier boost + recency decay
Auto-promotion -- memories accessed 5+ times promote from mid to long
TTL extension -- each recall extends expiry (short +1h, mid +1d)
Priority reinforcement -- +1 every 10 accesses (max 10)
Contradiction detection -- warns when storing memories that conflict with existing ones
Deduplication -- upsert on title+namespace, tier never downgrades
Confidence scoring -- 0.0-1.0 certainty factored into ranking
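
To make the adaptive blending in hybrid recall concrete, here is a small sketch. Only the two endpoint weights (0.50 and 0.15) come from this README; the character breakpoints, the linear interpolation between them, and the function names are assumptions chosen for illustration, and the real blending curve may differ.

```python
SEMANTIC_WEIGHT_SHORT = 0.50  # from this README: weight for short content
SEMANTIC_WEIGHT_LONG = 0.15   # from this README: weight for long content
SHORT_CHARS = 200             # hypothetical breakpoint
LONG_CHARS = 2000             # hypothetical breakpoint


def semantic_weight(content_length):
    """Interpolate the semantic weight linearly between the two breakpoints."""
    if content_length <= SHORT_CHARS:
        return SEMANTIC_WEIGHT_SHORT
    if content_length >= LONG_CHARS:
        return SEMANTIC_WEIGHT_LONG
    t = (content_length - SHORT_CHARS) / (LONG_CHARS - SHORT_CHARS)
    return SEMANTIC_WEIGHT_SHORT + t * (SEMANTIC_WEIGHT_LONG - SEMANTIC_WEIGHT_SHORT)


def hybrid_score(keyword_score, cosine_similarity, content_length):
    """Blend a normalized keyword score with cosine similarity (both in [0, 1])."""
    w = semantic_weight(content_length)
    return w * cosine_similarity + (1.0 - w) * keyword_score


for length in (50, 1100, 5000):
    print(length, round(semantic_weight(length), 3),
          round(hybrid_score(0.8, 0.4, length), 3))
```

The longer the stored content, the more the keyword score dominates, which matches the stated rationale that embeddings lose information on long text.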

Organization

Namespaces -- isolate memories per project (auto-detected from git remote)
Memory linking -- typed relations: related_to, supersedes, contradicts, derived_from, reflects_on (recursive-learning Task 1/8), derives_from (WT-1-A atomisation) -- six variants at v0.7.0
Consolidation -- merge multiple memories into a single long-term summary
Auto-consolidation -- group by namespace+tag, auto-merge groups above threshold
Contradiction resolution -- mark one memory as superseding another, demote the loser
Forget by pattern -- bulk delete by namespace + FTS pattern + tier
Source tracking -- tracks origin: user, claude, hook, api, cli, import, consolidation, system
Agent identity (NHI) -- every memory carries metadata.agent_id (claimed identity) with defense-in-depth immutability across update/dedup/import/sync/consolidate; filter list/search by agent
Tagging -- comma-separated tags with filter support

Interfaces

89 HTTP routes (75 unique paths) -- full REST API on 127.0.0.1:9077 (works with any AI or tool)
82 CLI subcommands at v0.7.x under --features sal OR --features sal-postgres (80 in the default build) -- complete CLI with identical capabilities
74 MCP tools at full profile (7 default at v0.7.0; verified against Profile::full().expected_tool_count()) -- native integration for any MCP-compatible AI
Interactive REPL shell -- recall, search, list, get, stats, namespaces, delete with color output
JSON output -- --json flag on all CLI commands

Operations

Multi-node sync -- pull, push, or bidirectional merge between database files
Import/Export -- full JSON roundtrip preserving memory links
Garbage collection -- automatic background expiry every 30 minutes
Graceful shutdown -- SIGTERM/SIGINT checkpoints WAL for clean exit
Deep health check -- verifies DB accessibility and FTS5 integrity
Shell completions -- bash, zsh, fish
Man page -- ai-memory man generates roff to stdout
Time filters -- --since/--until on list and search
Human-readable ages -- "2h ago", "3d ago" in CLI output
Color CLI output -- ANSI tier labels (red/yellow/green), priority bars, bold titles, cyan namespaces

Quality

~2,400 tests across the full surface -- 1,960 lib + 211 integration + 16 mcp_integration + 4 webhook_http_parity (new in v0.6.4) + 16 recipe_contract + ~150 across other binary targets. Line coverage is held at or above the 92% project bar; net-new v0.6.4 modules are at 100% (sizes.rs), 99.50% (profile.rs), 97.58% (cli/audit.rs), 97.05% (cli/doctor.rs), 92.56% (handlers.rs), 92.26% (cli/install.rs). v0.6.3.x baselines (1,809 / 93.08% and 1,886 / 93.84%) remain frozen on the evidence page; v0.6.4 metrics are in the release notes and on the test-hub campaign. Empirical NHI discovery acceptance is proven separately by the Discovery Gate (T1–T4 matrix vs. live xAI Grok 4.3, 6/6 PASS, GATE GREEN).
LongMemEval benchmark -- 97.8% R@5 (489/500), 99.0% R@10, and 99.8% R@20 (499/500) on the ICLR 2025 LongMemEval-S dataset. Pure FTS5 keyword search achieves 97.0% R@5 in 2.2 seconds (232 q/s); LLM query expansion raises that to 97.8% R@5. Zero cloud API costs. See benchmark details.
MCP Prompts -- recall-first and memory-workflow prompts teach AI clients to use memory proactively
TOON-default -- recall/list/search responses use TOON compact by default (79% smaller than JSON)
Criterion benchmarks -- insert, recall, search at 1K scale
GitHub Actions CI/CD -- fmt, clippy, test, build on Ubuntu + macOS, release on tag

Coverage Floor (hard CI gate)

The Code Coverage job is a required status check. CI re-asserts two invariants on every PR: an absolute floor of >= 90% lines (catastrophic-regression backstop, set at the current measurement rounded down to the nearest 5%), and a ratchet against the value pinned in .coverage-baseline with a 0.5% slack window (the day-to-day enforcement). PRs that raise coverage should bump the baseline file in the same commit so future PRs benefit from the new floor; PRs that regress more than 0.5% are blocked from merging. Current measurement: 93.13% lines.
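
The two invariants reduce to a pair of comparisons. The following sketch shows the logic only; it is not the CI job itself, check_coverage is a hypothetical name, and it assumes the 0.5% slack means 0.5 percentage points below the pinned baseline.

```python
ABSOLUTE_FLOOR = 90.0  # percent of lines; catastrophic-regression backstop
RATCHET_SLACK = 0.5    # assumed to be percentage points below the baseline


def check_coverage(measured, baseline):
    """Return failure messages; an empty list means the gate passes."""
    failures = []
    if measured < ABSOLUTE_FLOOR:
        failures.append(f"{measured:.2f}% is below the {ABSOLUTE_FLOOR:.0f}% floor")
    if measured < baseline - RATCHET_SLACK:
        failures.append(
            f"{measured:.2f}% regresses more than {RATCHET_SLACK} points "
            f"from the {baseline:.2f}% baseline"
        )
    return failures


# The baseline here reuses the current measurement quoted above, purely for illustration.
for measured in (93.13, 92.70, 92.50, 89.00):
    print(measured, check_coverage(measured, baseline=93.13) or "ok")
```

A small drop such as 92.70% passes, 92.50% trips the ratchet, and 89.00% trips both checks.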

Token-Budget Gate (hard CI gate, v0.7 C5)

The token-budget workflow is a required status check. It enforces three cl100k_base-measured invariants on every PR:

Per-tool ceiling of 1500 tokens -- no single MCP tool's serialized schema (name + description + inputSchema) may exceed 1500 cl100k_base tokens.
Full-profile honest range (5K-8K) -- the v0.6.4 backstop, kept in place to detect pathological shrinkage (accidentally dropping tools).
Full-profile hard ceiling (v0.7 C5, raised post-D1.6/D1.7) -- the trimmed tools/list payload under --profile full may not exceed 11,000 cl100k_base tokens (TRIMMED_FULL_PROFILE_CEILING_TOKENS in tests/token_budget_guard.rs; the original C5 target was 3500 against the pre-D1.6 hand-coded schemas — the schemars-derived D1.6/D1.7 expansion raised the pinned ceiling). C2 (split docs field), C3 (collapse repeated schema boilerplate), and C4 (hide rarely-used optional params) drove the original compaction; this gate forces future PRs that grow the surface to claw back budget elsewhere. Inspect ai-memory doctor --tokens --raw-table to see per-tool costs. See .github/workflows/token-budget.yml and docs/v0.7/schema-compaction-audit.md.
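
As a rough picture of what the gate checks, the sketch below applies the three limits to token counts that have already been measured. It is not the code in tests/token_budget_guard.rs: check_token_budget is a hypothetical name, the sample counts are invented, and the honest range is treated only as a lower bound because it is described above as a shrinkage detector (an assumption about how the range and the hard ceiling interact).

```python
PER_TOOL_CEILING = 1_500       # per-tool serialized schema ceiling
SHRINKAGE_FLOOR = 5_000        # lower end of the 5K-8K honest range
FULL_PROFILE_CEILING = 11_000  # trimmed tools/list ceiling under --profile full


def check_token_budget(per_tool_tokens, payload_tokens):
    """Return failure messages for one measured tools/list payload."""
    failures = []
    for name, tokens in sorted(per_tool_tokens.items()):
        if tokens > PER_TOOL_CEILING:
            failures.append(f"{name}: {tokens} tokens > per-tool ceiling {PER_TOOL_CEILING}")
    if payload_tokens < SHRINKAGE_FLOOR:
        failures.append(f"payload {payload_tokens} tokens < {SHRINKAGE_FLOOR} (tools dropped?)")
    if payload_tokens > FULL_PROFILE_CEILING:
        failures.append(f"payload {payload_tokens} tokens > hard ceiling {FULL_PROFILE_CEILING}")
    return failures


# Illustrative counts only; real values come from a cl100k_base tokenizer.
sample = {"memory_store": 410, "memory_recall": 1_620, "memory_search": 280}
for problem in check_token_budget(sample, payload_tokens=9_800):
    print(problem)
```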

ML and LLM Dependencies (semantic tier+)

candle-core, candle-nn, candle-transformers -- Hugging Face Candle ML framework for native Rust inference
hf-hub -- download models from Hugging Face Hub
tokenizers -- Hugging Face tokenizers for text preprocessing
instant-distance -- approximate nearest neighbor search
reqwest -- HTTP client for LLM-backend communication (smart/autonomous tiers — any provider per #1067: Ollama, xAI, OpenAI, Anthropic, Gemini, DeepSeek, Kimi, Qwen, Mistral, Groq, Together, Cerebras, OpenRouter, Fireworks, LMStudio, vLLM, llama.cpp server)

---

Architecture

---

Benchmark

Evaluated on the ICLR 2025 LongMemEval-S dataset (500 questions, 6 categories). Pure FTS5 keyword tier achieves 97.0% R@5 in 2.2 seconds. LLM query expansion (smart tier) pushes to 97.8% R@5. All inference runs locally — zero cloud API calls, zero cost.

| Tier | R@5 | Speed | Dependencies |
|------|-----|-------|-------------|
| keyword | 97.0% | 232 q/s | None |
| semantic | 97.4% | 45 q/s | Embedding model (~100MB) |
| smart | 97.8% | 12 q/s | Any LLM backend (e.g. local Ollama + Gemma 3 4B; or xAI Grok 4.3, OpenAI gpt-5, Anthropic Claude Opus 4.7, Gemini, DeepSeek, etc. post-#1067) |
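
For readers unfamiliar with the R@k column, the sketch below shows one common way to compute it. It assumes a question counts as a hit when any gold item appears in the top k results; the actual benchmark harness may define the metric differently, and the toy data is invented.

```python
def recall_at_k(ranked_ids_per_question, gold_ids_per_question, k):
    """Fraction of questions with at least one gold item in the top-k results."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_ids_per_question, gold_ids_per_question)
        if set(ranked[:k]) & set(gold)
    )
    return hits / len(gold_ids_per_question)


# Toy data: four questions, three ranked results each.
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["j", "k", "l"]]
gold = [["a"], ["f"], ["z"], ["k"]]
for k in (1, 2, 3):
    print(f"R@{k} = {recall_at_k(ranked, gold, k):.2f}")

# The headline figure is the same ratio at benchmark scale: 489 hits out of 500 questions.
print(f"{489 / 500:.1%}")  # 97.8%
```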

Performance Budgets (v0.6.4)

Every release ships with published p95/p99 budgets for hot-path operations and a CI gate that fails any PR whose measured p95 exceeds the budget by more than 10%. Targets are calibrated for M4 reference hardware; full table and methodology in PERFORMANCE.md.

| Operation | Target p95 | Target p99 |
|---|---|---|
| memory_session_start (Claude Code hook) | < 100 ms | < 200 ms |
| memory_store (no embedding) | < 20 ms | < 50 ms |
| memory_search (FTS5) | < 100 ms | < 250 ms |
| memory_recall (hot, depth=1) | < 50 ms | < 150 ms |
| memory_kg_query (depth ≤ 3) | < 100 ms | < 250 ms |
| memory_kg_query (depth ≤ 5) | < 250 ms | < 500 ms |
| memory_kg_timeline | < 100 ms | < 250 ms |
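
The gate rule (fail when measured p95 exceeds the budget by more than 10%) can be sketched as follows. This is an illustration rather than the project's CI code: the nearest-rank percentile method, the function names, and the sample latencies are assumptions, while the three budgets are taken from the table.

```python
P95_BUDGET_MS = {"memory_store": 20.0, "memory_search": 100.0, "memory_recall": 50.0}
TOLERANCE = 0.10  # a PR fails only when p95 exceeds the budget by more than 10%


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def over_budget(latencies_ms):
    """Return {operation: measured p95} for operations that breach budget + 10%."""
    breaches = {}
    for op, samples in latencies_ms.items():
        p95 = percentile(samples, 95)
        if p95 > P95_BUDGET_MS[op] * (1 + TOLERANCE):
            breaches[op] = p95
    return breaches


# Invented latencies: one slow outlier is tolerated, two out of twenty are not.
measured = {
    "memory_store": [12.0] * 19 + [40.0],
    "memory_recall": [30.0] * 18 + [60.0] * 2,
}
print(over_budget(measured))  # {'memory_recall': 60.0}
```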

Run the same workload locally:

ai-memory bench                      # human-readable table
ai-memory bench --json               # machine-parseable

Substrate is unchanged across v0.6.3.x → v0.6.4 (the quiet-tools release ships a smaller default tool surface, not a different hot-path). p99 targets here remain informational pending the next dedicated soak window; latest soak evidence is on the test hub.

---

Integration Methods

MCP (Primary -- for MCP-compatible AI platforms)

MCP is the
