openclaw-brain

An engineering knowledge-graph + memory system — the memory and guardrails for an AI circuit-design mentor.

openclaw-brain ingests semiconductor PDFs (textbooks, papers), extracts concepts / equations / typed relationships via LLMs, and stores them in a Neo4j knowledge graph. It is exposed as an MCP server so any MCP-compatible agent (OpenClaw, Claude Code, …) can query domain knowledge, verify it against the original source text, and write its own design reasoning back into the graph.

What it does

Ingest: PDF → typed knowledge graph through an 11-stage pipeline (parse → figures → chunk → extract → ground → match → reason → reconcile → commit → embed → summarize). Only two stages (extract and reason) do the heavy "understanding" with an LLM; the rest are mechanical.

Serve: exposes ~24 MCP tools, including query_knowledge, get_evidence, answer_question, record_hypothesis / record_decision / record_bench_result, merge_concepts, retract_node, …

Ground: every node is named, typed, confidence-scored, and traceable to the exact source chunk; a grounding stage drops claims the chunk text doesn't support.
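
The grounding idea can be sketched as a filter over extracted claims. This is a minimal illustration, not the project's implementation: the `Claim` type, the `is_supported` and `ground` helpers, the `chunk_a` id, and the token-overlap test with its 0.6 threshold are all assumptions made for the example. The real grounding stage may use a different check.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical extracted claim tied to the chunk it came from."""
    text: str
    chunk_id: str


def _tokens(text: str) -> set[str]:
    # Lowercased word tokens; a deliberately crude stand-in for real matching.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(claim: Claim, chunk_text: str, threshold: float = 0.6) -> bool:
    """Treat a claim as supported when enough of its tokens occur in the chunk."""
    claim_tokens = _tokens(claim.text)
    if not claim_tokens:
        return False
    overlap = len(claim_tokens & _tokens(chunk_text)) / len(claim_tokens)
    return overlap >= threshold


def ground(claims: list[Claim], chunks: dict[str, str]) -> list[Claim]:
    """Keep only claims whose source chunk exists and supports them."""
    return [
        c for c in claims
        if c.chunk_id in chunks and is_supported(c, chunks[c.chunk_id])
    ]


# Hypothetical data: one supported claim, one the chunk text does not back up.
chunks = {"chunk_a": "Cascode devices raise the output resistance of a current source."}
claims = [
    Claim("cascode devices raise output resistance", "chunk_a"),
    Claim("cascode devices reduce thermal noise by half", "chunk_a"),
]
kept = ground(claims, chunks)
```

Here only the first claim survives, because the second shares too few words with its source chunk. Lexical overlap is a blunt test and is used here only to keep the sketch self-contained.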

A single unit of the graph looks like this — a real node and a real typed edge, exactly as they sit in the graph:

(Cascode Device) ──[ SOLVES_PROBLEM ]──> (Power Supply Rejection)
  confidence 0.70 · layer L2 (analog/EDA) · evidence: chunk_c635e958d19e
  rationale: "cascode devices raise effective output resistance, improving supply rejection (PSRR)…"
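
For readers who think in code, the same unit could be modeled roughly as below. The `TypedEdge` class, its field names, and the assumption that confidence lies in [0, 1] are hypothetical, chosen to mirror the fields shown above; they are not taken from the project's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedEdge:
    """Hypothetical in-memory mirror of one typed edge and its provenance."""
    source: str          # name of the source node
    relation: str        # edge type, e.g. SOLVES_PROBLEM
    target: str          # name of the target node
    confidence: float    # assumed to be a score in [0, 1]
    layer: str           # knowledge layer label
    evidence_chunk: str  # id of the chunk the claim is traceable to
    rationale: str       # natural-language justification

    def __post_init__(self) -> None:
        # Reject scores outside the assumed range early.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")

    def render(self) -> str:
        """One-line ASCII rendering of the edge."""
        return f"({self.source}) --[{self.relation}]--> ({self.target})"


edge = TypedEdge(
    source="Cascode Device",
    relation="SOLVES_PROBLEM",
    target="Power Supply Rejection",
    confidence=0.70,
    layer="L2",
    evidence_chunk="chunk_c635e958d19e",
    rationale="cascode devices raise effective output resistance, "
              "improving supply rejection (PSRR)…",
)
print(edge.render())
```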

The honest bottom line

The project started with one bet, "make a cheap local model reason like an expensive one", and measured it to be false. Because the failure was measured cleanly, two things that genuinely ship came out of it: (1) a grounding / fabrication-control mechanism that drops source-unsupported claims, with measured fabrication near zero on the evaluation arms (fabrication on the production graph has not yet been measured separately), and (2) a debugging discipline that catches when the measurement instrument itself is lying. The full development log, including the dead ends and the numbers, is in docs/DEVLOG.md.

Quickstart

Requires Python 3.11+ and Neo4j 5.

python -m venv .venv && .venv/bin/pip install -e .
docker compose up -d                        # Neo4j on :7687
.venv/bin/openclaw-brain apply-schema       # constraints + vector indexes

.venv/bin/openclaw-brain serve              # MCP server (stdio — used by the agent)
.venv/bin/openclaw-brain status             # Neo4j health + node counts
.venv/bin/openclaw-brain export-obsidian    # graph → browsable Obsidian vault (~/Semiconductor)

Ingesting a PDF and asking questions both happen through the agent calling MCP tools (ingest_pdf(file_path=…), query_knowledge(query=…)); the full tool list is in src/openclaw_brain/server/mcp_server.py.
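
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below only builds that request for the two tools named above; the `tool_call` helper is hypothetical, the argument values (`paper.pdf`, the question text) are placeholders, and an agent's MCP client would normally construct and send these messages itself.

```python
import json
from itertools import count

_ids = count(1)  # each request in a session gets its own id


def tool_call(name: str, **arguments: object) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Placeholder arguments; real values depend on your files and questions.
ingest = tool_call("ingest_pdf", file_path="paper.pdf")
query = tool_call("query_knowledge", query="Why does a cascode improve PSRR?")
print(json.dumps(query, indent=2))
```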

Architecture

src/openclaw_brain/
├── agent.py             # BrainAgent — the single public API (all MCP tools delegate here)
├── knowledge/           # pipeline · extraction · reasoning · graph store (Neo4j)
├── memory/              # episodic / semantic / procedural memory + promotion
├── llm/                 # provider (model catalog) + resilience (retry / fallback)
└── server/mcp_server.py # FastMCP server exposing BrainAgent as MCP tools

Routing is local-first: shallow stages run on local/cheap models, the depth-bearing extract and reason stages run on a cheap hosted model (deepseek-v4-flash), and frontier models (Opus / Codex) are used only as the teacher/ceiling. The authoritative stage→model config lives in config/default.toml. See CLAUDE.md for the full module map and docs/DECISIONS.md for the architecture decision records.
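
The routing policy can be pictured as a stage-to-tier lookup. This is a sketch only: the tier names and the `model_for` helper are made up for illustration, `local-model` is a placeholder, and mapping every shallow stage to a model is a simplification, since some stages may not call a model at all. The real mapping is whatever config/default.toml says.

```python
STAGES = [
    "parse", "figures", "chunk", "extract", "ground", "match",
    "reason", "reconcile", "commit", "embed", "summarize",
]

# Stages the README calls depth-bearing; they go to the cheap hosted model.
DEPTH_STAGES = {"extract", "reason"}

# Hypothetical tier -> model table; only deepseek-v4-flash is named in the README.
MODELS = {"local": "local-model", "hosted": "deepseek-v4-flash"}


def model_for(stage: str) -> str:
    """Return the model a stage would run on under a local-first policy."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    tier = "hosted" if stage in DEPTH_STAGES else "local"
    return MODELS[tier]


routing = {stage: model_for(stage) for stage in STAGES}
```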

Status

Production graph rebuilt clean on deepseek-v4-flash: 5 sources (Razavi textbook + 4 CIS papers) → 4,336 concepts, 2,268 circuit topologies, 581 equations. Knowledge is stored as natural language (concept descriptions + ~19k typed-edge rationales + a verbatim EvidenceVault); embeddings are a rebuildable index, not the asset of record.

.venv/bin/python3 -m pytest tests/ -q         # Neo4j-backed tests auto-skip without a DB
