korg

Summary

Wrap any korg:introspect@v1-aware binary as a Claude Code tool, honoring declared side-effects.

A causally-ordered, rewindable event-ledger for autonomous AI agents. Every step your AI agent takes, recorded in a hash-chained ledger you can independently verify — tamper-evident, zero trust, no blockchain.

[CI](https://github.com/New1Direction/korg/actions/workflows/ci.yml) · [License: MIT OR Apache-2.0](https://opensource.org/licenses/MIT) · [Rust 2021](https://www.rust-lang.org) · [Tests](https://github.com/New1Direction/korg)

<p align="center"> <b>English</b> · <a href="README.zh-CN.md">简体中文</a> · <a href="README.zh-TW.md">繁體中文</a> </p>

---

[Figure: korg demo showing how to record, verify, and rewind an AI agent session as a hash-chained ledger]

---

AI agents are black boxes. When they fail, you can't debug. When they succeed, you can't reproduce it. When they do something wrong, you can't undo it. Korg fixes this.

---

What Korg Does

> [!NOTE]
> Universal Ingestion Integration Mode: Korg v1 is an MCP-callable audit sink. Any MCP-compatible coding agent (Claude Code, Codex, etc.) can call korg's tools to record its session as a causally-linked, replayable, rewindable ledger. The agent must be instructed to log its actions, typically via system prompt or MCP server configuration. Fully passive auditing without agent cooperation is on the roadmap for future versions.
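As a rough illustration of what "calling korg's tools" looks like on the wire, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) as a single line suitable for a stdio transport. The tool name `korg_record_event` and its argument fields are hypothetical placeholders, not names taken from this repository; consult the server's own tool listing for the real ones.

```python
import json


def tool_call(request_id, tool_name, arguments):
    """Build one MCP `tools/call` request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # Compact separators keep the message on one line for stdio framing.
    return json.dumps(message, separators=(",", ":"))


# Hypothetical tool name and argument fields, for illustration only.
line = tool_call(
    1,
    "korg_record_event",
    {"tool": "Edit", "file": "math_utils.py", "triggered_by": 391},
)
print(line)
```

An agent that has been told to log its actions would emit one such call per step, which is why agent cooperation is required in v1.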

> [!WARNING]
> Trust Boundary & Deployment Scope: Korg v1 is designed strictly for local, single-user workspaces. Multi-tenant and networked deployments require cryptographic authentication and permission bounds that are not yet shipped. Running the server on an untrusted or public network exposes workspace read/write access.

Korg is a cognitive hypervisor — a runtime layer that sits beneath your AI agents and governs every decision they make.

It doesn't replace your LLM. It governs what the LLM does.

Foundation Model          →  predicts, suggests, generates
────────────────────────────────────────────────────────────
Korg Cognitive Runtime    →  schedules, validates, isolates,
                             reconciles, replays, heals, governs

Every agent action is:

  • Appended to an immutable, cryptographically-signed ledger
  • Ordered with Hybrid Logical Clocks (causal, deterministic, globally consistent)
  • Replayable — rebuild exact state at any point in history
  • Reversible — rewind the ledger to any prior sequence point
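To make the "append-only, hash-chained" idea concrete, here is a minimal sketch of a tamper-evident ledger. It is not Korg's implementation or the korg-ledger@v1 encoding: the canonical-JSON hashing, the all-zero genesis value, and the exact field set are assumptions chosen for illustration, and signatures are left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first entry's prev_hash


def entry_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(ledger: list, actor: str, tool: str, triggered_by=None) -> dict:
    """Append one event, linking it to the previous entry's hash."""
    body = {
        "seq": len(ledger),
        "actor": actor,
        "tool": tool,
        "triggered_by": triggered_by,
        "prev_hash": ledger[-1]["entry_hash"] if ledger else GENESIS,
    }
    entry = dict(body, entry_hash=entry_hash(body))
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["prev_hash"] != prev or entry_hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


ledger = []
append(ledger, "agent:claude-code", "user_prompt")
append(ledger, "agent:claude-code", "Read", triggered_by=0)
append(ledger, "agent:claude-code", "Edit", triggered_by=1)
print(verify_chain(ledger))   # True
ledger[1]["tool"] = "Bash"    # tamper with a sealed entry
print(verify_chain(ledger))   # False
```

Because each entry commits to the hash of the one before it, editing any past entry breaks every later link, which is what lets a third party check the ledger without trusting its producer.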

---

Try the Time-Travel Demo

You can run the built-in sandbox demo to see cognitive time-travel in action. The demo sets up a temporary workspace with a buggy Python script, lets a simulated coding agent make a wrong edit, catches the test failure, rewinds the workspace and ledger to before the edit, and speculatively commits the correct fix:

cargo run -- demo

You will see the complete, colorized time-travel sequence:

⚡ STARTING KORG COGNITIVE TIME-TRAVEL DEMO ⚡
────────────────────────────────────────────────────────────────────────────────
[korg] Initializing sandboxed demo environment...
[korg] Created temporary workspace with math_utils.py (subtraction bug present).

🚀 PHASE 1: AGENT INITIATES RUN (WRONG PATH)
  [seq 390] actor: agent:claude-code@0.2.29 | tool: user_prompt | prompt: "Fix subtraction bug and verify tests pass"
  [seq 391] actor: agent:claude-code@0.2.29 | tool: Read        | file: math_utils.py
  [seq 392] actor: agent:claude-code@0.2.29 | tool: Edit        | result: "Modified return a + b (wrong fix)"
  [seq 393] actor: agent:claude-code@0.2.29 | tool: Bash        | command: "pytest" -> ❌ FAILED (2 tests failed)

📊 LEDGER STATE (BEFORE REWIND):
  Before rewind: events 390-393 (prompt, read, edit-wrong, test-failed)
    ├── seq 390 (user_prompt) -> triggered_by: None
    ├── seq 391 (Read) -> triggered_by: Some(390)
    ├── seq 392 (Edit) -> triggered_by: Some(391)
    ├── seq 393 (Bash) -> triggered_by: Some(392)

⏳ PHASE 2: INITIATING REVERSIBLE REWIND TO SEQ 391
  [korg] Truncating journal ledger to sequence ID 391...
  [korg] Restoring workspace snapshot via git read-tree (O(1))...
  [korg] Reset math_utils.py file state back to sequence 391 bug state.
  [korg] Rebuilding 3 read-model projections...

📊 LEDGER STATE (AFTER REWIND):
  After rewind:  events 390-391 (prompt, read)
    ├── seq 390 (user_prompt) -> triggered_by: None
    ├── seq 391 (Read) -> triggered_by: Some(390)

🚀 PHASE 3: AGENT DIVERGES DOWN CORRECT PATH (SPECULATIVE REPLAY)
  [seq 392] actor: agent:claude-code@0.2.29 | tool: Edit        | result: "Modified return a - b (correct fix)"
  [seq 393] actor: agent:claude-code@0.2.29 | tool: Bash        | command: "pytest" -> ✓ PASSED (2 passed)

📊 LEDGER STATE (AFTER DIVERGENT RUN):
  After new run: events 390-393 (prompt, read, edit-right, test-passed)
    ├── seq 390 (user_prompt) -> triggered_by: None
    ├── seq 391 (Read) -> triggered_by: Some(390)
    ├── seq 392 (Edit) -> triggered_by: Some(391)
    ├── seq 393 (Bash) -> triggered_by: Some(392)

✓ DEMO COMPLETE: Time-travel execution succeeded!
  Ledger truncated, workspace rolled back, and a different future was successfully committed.

No other AI agent runtime lets you do this.
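The ledger side of the demo can be modeled in a few lines. The sketch below is an assumption-laden toy, not the code behind `korg rewind`: it keeps events in a plain list, truncates to a target sequence number, and then records a divergent path. The `record` helper and its fields are hypothetical, and workspace restoration is not modeled.

```python
def rewind(ledger: list, to_seq: int) -> list:
    """Keep events with seq <= to_seq; return the discarded tail."""
    discarded = [e for e in ledger if e["seq"] > to_seq]
    ledger[:] = [e for e in ledger if e["seq"] <= to_seq]
    return discarded


def record(ledger: list, tool: str, result=None, first_seq=0) -> dict:
    """Append an event caused by the current tip (hypothetical helper)."""
    tip = ledger[-1]["seq"] if ledger else None
    seq = first_seq if tip is None else tip + 1
    event = {"seq": seq, "tool": tool, "triggered_by": tip, "result": result}
    ledger.append(event)
    return event


ledger = []
record(ledger, "user_prompt", first_seq=390)
record(ledger, "Read")
record(ledger, "Edit", "return a + b (wrong fix)")
record(ledger, "Bash", "pytest FAILED")

dropped = rewind(ledger, 391)                  # back to just after the Read
record(ledger, "Edit", "return a - b (correct fix)")
record(ledger, "Bash", "pytest PASSED")

print([e["seq"] for e in dropped])             # [392, 393]
print([(e["seq"], e["result"]) for e in ledger])
```

The sequence numbers 392 and 393 are reused by the new path, matching the before and after listings in the demo output.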

---

Core Architecture

Korg is built on the same theoretical foundations that make databases and operating systems reliable — applied to AI cognition for the first time.

| Invariant | What it means |
|:---|:---|
| Append-only WAL | Every cognitive event is a ledger entry. Nothing is mutated, only appended. Like a database WAL, but for AI thought. |
| HLC Causal Ordering | Hybrid Logical Clocks guarantee globally consistent, causally ordered event streams, even across distributed swarm workers. |
| Deterministic Replay | Any campaign can be replayed byte-for-byte from the ledger. Same inputs, same outputs, every time. |
| Speculative Branches | Fork execution into parallel hypothetical paths. Preview before committing. Discard freely. |
| Execution Checkpoints | Snapshot the entire runtime state: ledger offset, projection views, lease maps, workspace tree. Restore in O(1). |
| Micro-Healing | Transient failures (lock conflicts, stale state) are automatically healed at the effect level, with full retry audit trails. |
| Semantic Governance | Swarm actions are validated against BERT embedding cosine similarity: semantic alignment, not keyword matching. |
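For readers unfamiliar with Hybrid Logical Clocks, the sketch below follows the standard HLC update rules: a timestamp is a `(wall, counter)` pair that tracks physical time when it can and falls back to a logical counter when it cannot. The class and method names are illustrative assumptions and are not Korg's API; the physical clock is injectable so the example is deterministic.

```python
import time


class HLC:
    """Minimal Hybrid Logical Clock; timestamps are (wall_ms, counter)."""

    def __init__(self, physical_ms=None):
        self._physical_ms = physical_ms or (lambda: int(time.time() * 1000))
        self.wall = 0
        self.counter = 0

    def tick(self):
        """Timestamp a local or send event."""
        pt = self._physical_ms()
        if pt > self.wall:
            self.wall, self.counter = pt, 0
        else:
            self.counter += 1
        return (self.wall, self.counter)

    def observe(self, remote):
        """Merge a timestamp received from another node."""
        r_wall, r_counter = remote
        pt = self._physical_ms()
        new_wall = max(self.wall, r_wall, pt)
        if new_wall == self.wall and new_wall == r_wall:
            self.counter = max(self.counter, r_counter) + 1
        elif new_wall == self.wall:
            self.counter += 1
        elif new_wall == r_wall:
            self.counter = r_counter + 1
        else:
            self.counter = 0
        self.wall = new_wall
        return (self.wall, self.counter)


a = HLC(lambda: 100)      # node A's physical clock reads 100 ms
b = HLC(lambda: 90)       # node B's physical clock lags behind A
t1 = a.tick()             # (100, 0)
t2 = b.observe(t1)        # (100, 1): ordered after t1 despite B's slow clock
t3 = b.tick()             # (100, 2)
print(t1 < t2 < t3)       # True
```

Comparing the tuples lexicographically gives an order that never places an effect before its cause, even when the nodes' wall clocks disagree.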

┌────────────────────────────────────────────────────────────────┐
│  korg v0.1.0  │  session: 019e5333-efc9-7c70  │  ● ACTIVE      │
├───────────────────────────────┬────────────────────────────────┤
│  SWARM PLAN                   │  LIVE MERKLE LEDGER            │
│  ├─ [●] Captain  [PLANNING]   │  (tx_00)→(tx_01)→[tx_02]→...  │
│  ├─ [●] Harper   [RESEARCH]   │                                │
│  ├─ [●] Benjamin [SYNTHESIS]  │  TELEMETRY                     │
│  └─ [○] Lucas    [IDLE]       │  ├─ Velocity  85.2 t/s  ▇▆▄▂█  │
│                               │  ├─ Entropy    0.451     ▄▃▂▃▄  │
│  GOVERNANCE GATES             │  └─ Progress  68.7 %    ▂▃▄▅▆▇  │
│  ├─ 🟡 Amber Security [IDLE]  │                                │
│  ├─ 🟢 Consensus     [ACTIVE] │  LEDGER STREAM                 │
│  └─ 🔵 Steering Fork [IDLE]   │  [tx_03] Benjamin: patch auth  │
└───────────────────────────────┴────────────────────────────────┘

---

Quick Start

Build from source

The crate is not yet published to crates.io; install from source:

git clone https://github.com/New1Direction/korg
cd korg
cargo build --release
./target/release/korg --help

Python bridge (for korgex / korgchat)

cd crates/korg-bridge
maturin develop  # builds the PyO3 extension into the active venv
python3 -c "import korg_bridge; print(korg_bridge.__version__)"

Run your first campaign

# Interactive TUI dashboard
korg campaign --tui --prompt "Refactor the auth layer to use JWTs"

# Web cockpit at localhost:8080
korg campaign --web --prompt "Optimize the database connection pool"

# Pure autonomous goal mode (--goal is a top-level flag)
korg --goal "Write and validate a full test suite for src/parser.rs"

# Run the full multi-persona swarm on a REAL local model — every persona
# (Captain, Harper, Benjamin, Lucas, Evaluator) runs as a real worker
# subprocess doing real, measured, attested work. Defaults to a hermetic
# deterministic provider; `--provider ollama` makes it live.
korg --goal "Fix the failing test in src/lib.rs" --provider ollama --model qwen2.5:7b

# Preview without committing (dry-run; --preview is a top-level flag)
korg --preview "Refactor the main event loop"

Rewind & Verify

# Rewind the capability journal to a specific ledger sequence point
korg rewind --seq 4

# Drive the honest pipeline on a fixture and emit a verifiable ledger
korg run-once "Fix the add function in src/lib.rs so it adds"

# Same pipeline, but with a REAL local model (ollama) on an arbitrary task —
# the model writes the patch, Korg applies it, measures the real git diff +
# `cargo check`, and attests only what actually changed.
korg run-once "Fix the bug in src/lib.rs: max() returns the minimum.
Output the COMPLETE corrected src/lib.rs:
\`\`\`rust
$(cat your-repo/src/lib.rs)
\`\`\`" --repo your-repo --provider ollama --model qwen2.5:7b

# Independently verify any korg-ledger@v1 journal (no trust in the producer)
korg-verify <path-to-ledger.jsonl>

Honest by construction, with any model. The default provider is a hermetic deterministic stub (fixture-only, zero dependencies). --provider ollama runs a real local model on arbitrary tasks — Korg asks OpenAI-compatible providers for strictly valid JSON (response_format: json_object), so even a small (7B) local model lands a real patch reliably (measured 5/5 with qwen2.5:7b). Either way the attestation is measured, never fabricated: when the model produces a patch, the ledger attests the real git diff file count and changed paths; if it declines or writes a non-compiling change, Korg reports it honestly (an honest null — zero changed, zero attested — or a failed cargo check). The pipeline cannot attest a number the worktree does not actually show — that is the guarantee, independent of model quality.
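The guarantee above can be pictured as "derive the attestation from the measurement, never from the model's claim". The sketch below is a simplified assumption of that shape, not Korg's pipeline: it takes text in the format printed by `git diff --name-only` and builds a record whose count can only equal the number of paths actually listed. The field names are hypothetical.

```python
def attest(diff_name_only: str, check_passed: bool) -> dict:
    """Build an attestation from measured diff output only."""
    paths = sorted(
        {line.strip() for line in diff_name_only.splitlines() if line.strip()}
    )
    return {
        "changed_paths": paths,
        "mutation_count": len(paths),     # always derived, never supplied
        "check_passed": check_passed,
        "honest_null": len(paths) == 0,   # nothing changed, nothing attested
    }


def matches_claim(attestation: dict, claimed_count: int) -> bool:
    """Compare a model's self-reported count against the measurement."""
    return attestation["mutation_count"] == claimed_count


real = attest("src/lib.rs\n", check_passed=True)
null = attest("", check_passed=True)
print(real["mutation_count"], null["honest_null"])   # 1 True
print(matches_claim(null, 3))                        # False
```

In a real run the input text would come from the worktree itself, so a model that claims three edits but changes nothing still yields a zero count.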

Verify it in your browser; nothing is sent. Zero-install, client-side verifiers (Web Crypto) handle any korg-ledger@v1 journal or Certificate: verify a session · verify a Certificate · time-travel explorer. They verify the hash chain, check the causal DAG, validate Ed25519 signatures, and re-derive the human summary from the events, all locally.
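Of the checks listed above, the causal DAG check is the easiest to show in isolation. The sketch below is an assumed, simplified version: it reads JSONL events and confirms that every `triggered_by` refers to a sequence number that already appeared. It is not the shipped `korg-verify` logic, and it skips hash-chain and signature validation.

```python
import json


def verify_dag(events) -> bool:
    """Return True if every triggered_by points at an earlier, existing seq."""
    seen = set()
    for event in events:
        seq, cause = event["seq"], event.get("triggered_by")
        if seq in seen:
            return False          # duplicate sequence number
        if cause is not None and cause not in seen:
            return False          # cause is missing or appears later
        seen.add(seq)
    return True


JOURNAL = """\
{"seq": 390, "tool": "user_prompt", "triggered_by": null}
{"seq": 391, "tool": "Read", "triggered_by": 390}
{"seq": 392, "tool": "Edit", "triggered_by": 391}
"""

events = [json.loads(line) for line in JOURNAL.splitlines() if line.strip()]
print(verify_dag(events))            # True
events[1]["triggered_by"] = 392      # now points forward in time
print(verify_dag(events))            # False
```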

Speculative branch/fork and named checkpoints (korg fork, korg checkpoints list|restore) are planned, not yet shipped. The reversibility surface today is korg rewind.

---

Cognition Modes

Korg adapts its intelligence tier based on task complexity. Modes are governed exclusively through the capability resolver — every switch is ledger-logged.

| Mode | Best for |
|:---|:---|
| instant | Ultra-low latency. Bypasses negotiation. Optimistic execution. |
| balanced | Default. Structured multi-round contract negotiation. |
| heavy | Deep multi-agent deliberation. Multiple evaluation rounds. |
| research | Wide divergent exploration. Semantic index scanning across all crates. |
| recovery | Safe rollback mode. Creates checkpoints before every mutation. |
| autonomous | Full goal-mode. Self-steering with automatic re-planning. |
| heavy-consciousness | Maximum depth. Full HeavyConsciousness context injection. |

korg --mode research "Explore alternative approaches to the rate limiter"
korg --mode recovery "Carefully migrate the database schema"

---

Why Korg Exists

Current AI coding agents are probabilistic black boxes. They:

  • Can't be replayed — same prompt, different output, every time
  • Can't be rewound — one wrong action and you're manually diffing git history
  • Can't be audited — no record of what the agent decided and why
  • Can't be governed — no way to set policy boundaries at runtime

Korg treats AI cognition the same way a hypervisor treats compute and Git treats code:

If it's not in the ledger, it didn't happen.

---

Comparison

| Capability | Korg | LangChain / LangGraph | CrewAI | Standard CLI Agents |
|:---|:---:|:---:|:---:|:---:|
| Deterministic replay | ✅ | ❌ | ❌ | ❌ |
| Causal HLC ordering | ✅ | ❌ | ❌ | ❌ |
| Rewind execution | ✅ | ❌ | ❌ | ❌ |
| Speculative branches | 🚧 planned | ❌ | ❌ | ❌ |
| Execution checkpoints | 🚧 planned | ❌ | ❌ | ❌ |
| Cryptographic audit trail | ✅ | ❌ | ❌ | ❌ |
| Independently-verifiable Certificate | ✅ | ❌ | ❌ | ❌ |
| Honest attestation (real diff, never fabricated) | ✅ | ❌ | ❌ | ❌ |
| Micro-healing | ✅ | ❌ | ❌ | ❌ |
| Model-agnostic | ✅ | ✅ | ✅ | ✅ |

Korg is not an agent framework. It's the governance kernel that runs beneath all of them.

---

Technical Stack

| Component | Technology |
|:---|:---|
| Core runtime | Rust 2021, Tokio async |
| Ledger ordering | Hybrid Logical Clocks (HLC) |
| Workspace snapshots | Git Merkle tree (O(1) restore via write-tree / read-tree) |
| Cryptographic attestation | Ed25519 (ed25519-dalek) |
| Semantic governance | BERT cosine similarity via the optional candle feature (Hugging Face); a deterministic embedding fallback runs when candle is not built |
| TUI dashboard | Ratatui + Crossterm |
| Web cockpit | Axum + SSE |
| Syntax highlighting | Syntect + tree-sitter |
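The semantic-governance row boils down to a cosine-similarity gate between a goal and a proposed action. The sketch below shows that gate with a toy hashed bag-of-words embedding standing in for BERT. The embedding, the `0.5` threshold, and the function names are all assumptions for illustration; this is not Korg's deterministic fallback.

```python
import hashlib
import math


def embed(text: str, dims: int = 64) -> list:
    """Toy deterministic embedding: hash each token into one of `dims` buckets."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    return vec


def cosine(a: list, b: list) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def allowed(goal: str, action: str, threshold: float = 0.5) -> bool:
    """Gate an action on its similarity to the stated goal."""
    return cosine(embed(goal), embed(action)) >= threshold


goal = "refactor the auth layer to use JWTs"
print(allowed(goal, "refactor the auth layer to use JWTs"))
print(cosine(embed(goal), embed("")))
```

A real embedding model captures meaning rather than shared tokens, which is the point of "semantic alignment, not keyword matching"; the toy version only demonstrates the gating arithmetic.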

---

Architecture Deep Dive

Read the full technical write-up

Real-World Audit Ledger Example

You can inspect a real-world cognitive audit ledger produced by Korg. This NDJSON file records a live session where Claude Code was prompted to call Korg's MCP tools to refactor a function and rename all call sites, capturing the full HLC causal graph and actor_id recorder metadata.

The short version:

  1. CapabilityResolver — the single authority for all runtime state. All reads and writes flow through it. No secondary state stores.
  2. CapabilityJournal — the append-only WAL. Every cognitive event is sealed here with an HLC timestamp, causation chain, and cryptographic signature.
  3. ProjectionEngine — pure state folds over the journal. Any read model can be rebuilt deterministically from the raw event stream.
  4. ExecutionCheckpoint — snapshot of {ledger_offset, projection_state, lease_map, workspace_tree_hash}. Restores full runtime state in O(1) without replaying the entire event stream.
  5. CapabilityExecutor — executes the physical effect DAG. Failures trigger automatic micro-healing before escalating.
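The relationship between the journal, projections, and checkpoints in the list above can be sketched as a pure fold plus a saved offset. This is a toy model under stated assumptions: the read model (a per-tool counter), the `apply`, `project`, `checkpoint`, and `restore` names, and the two-field checkpoint are hypothetical and much smaller than the `{ledger_offset, projection_state, lease_map, workspace_tree_hash}` snapshot described.

```python
from functools import reduce

EMPTY = {"last_seq": None, "tool_counts": {}}


def apply(state: dict, event: dict) -> dict:
    """Pure fold step: returns a new state and never mutates the old one."""
    counts = dict(state["tool_counts"])
    counts[event["tool"]] = counts.get(event["tool"], 0) + 1
    return {"last_seq": event["seq"], "tool_counts": counts}


def project(journal: list) -> dict:
    """Rebuild the read model from the raw event stream."""
    return reduce(apply, journal, EMPTY)


def checkpoint(journal: list) -> dict:
    """Snapshot the projection together with the ledger offset it covers."""
    return {"ledger_offset": len(journal), "projection_state": project(journal)}


def restore(journal: list, cp: dict) -> dict:
    """Resume from the snapshot, folding only events after its offset."""
    tail = journal[cp["ledger_offset"]:]
    return reduce(apply, tail, cp["projection_state"])


journal = [
    {"seq": 0, "tool": "user_prompt"},
    {"seq": 1, "tool": "Read"},
    {"seq": 2, "tool": "Edit"},
    {"seq": 3, "tool": "Bash"},
]
cp = checkpoint(journal[:2])                     # taken after the Read
print(restore(journal, cp) == project(journal))  # True
```

Because `apply` is pure, folding from the checkpoint and folding from the start give the same state, so a checkpoint avoids replaying the whole stream without changing the result.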

System overview

flowchart TD
    Agent["MCP-compatible agent<br/>(Claude Code, Codex, korgex)"]

    subgraph Ingest["Ingestion paths"]
        MCP["mcp_server.py<br/>(MCP / JSON-RPC stdio sink)"]
        Bridge["korg-bridge<br/>(PyO3 in-process writer)"]
        Server["korg-server<br/>(Axum HTTP + SSE)"]
    end

    Journal["korg-registry · CapabilityJournal<br/>append-only WAL · HLC order · triggered_by DAG"]
    Chain["korg-ledger@v1<br/>hash-chain: prev_hash to entry_hash<br/>SHA-256 / HMAC + Ed25519"]
    Projection["ProjectionEngine<br/>pure folds to read models"]
    Rewind["rewind / rewind_with_seal<br/>truncate to seq + LedgerRewind tip"]
    Verify["korg-verify (+ Python / JS)<br/>verify_chain · verify_dag · sig · Certificate"]
    Runtime["korg-runtime<br/>multi-persona swarm · git-worktree sandbox<br/>arena · evaluator · run_once"]

    Agent --> MCP --> Journal
    Agent --> Bridge --> Journal
    Agent --> Server --> Journal
    Runtime --> Journal
    Journal --> Chain
    Journal --> Projection
    Journal --> Rewind
    Rewind --> Projection
    Chain --> Verify

---

Status

Korg is in active development, built on a frozen korg-ledger@v1 spec with cross-language conformance (Rust + Python + JS). Test footprint: 300+ Rust tests across the workspace plus Python/JS conformance suites, CI-gated (build · tests · cross-language oracle · differential fuzz) and green on main.

Shipped:

  • [x] Append-only, hash-chained cognitive ledger with HLC ordering
  • [x] Deterministic replay and projection rebuilds
  • [x] Reversible execution — rewind the ledger to any prior sequence point (tamper-evident LedgerRewind)
  • [x] Per-event Ed25519 signatures + structural anchoring (korg-ledger@v1 §8)
  • [x] Certificate (korgcert@v1) — a public, independently-verifiable certificate of agent work, with zero-install in-browser verifiers
  • [x] Honest pipeline (korg run-once) — real patch → real cargo check → an attested mutation count that equals the real git diff; never fabricates (reports an honest null instead)
  • [x] Live local model (--provider ollama) — real per-persona work on arbitrary tasks
  • [x] Multi-agent swarm (Captain, Harper, Benjamin, Lucas, Evaluator) — genuine worker subprocesses doing real, measured, attested work with DAG data-flow between personas
  • [x] Zero-config Claude Code capture (PostToolUse/Stop hooks → verifiable per-session ledgers)
  • [x] Micro-healing effect layer · TUI dashboard + Web cockpit
  • [x] Cryptographic provenance attestation · single-authority CognitionMode governance
  • [x] Preview / dry-run mode (--preview)

Planned / not yet shipped:

  • [ ] Speculative branches / fork + execution-checkpoint restore CLI (primitives exist; CLI planned)
  • [ ] cargo install korg on crates.io · npm-published verifier
  • [ ] Live network anchoring resolver (trusted-time witness — the remaining honest limit)
  • [ ] Remote swarm workers · WASM backends · IDE language-server integration · distributed checkpoint sync
  • [ ] Fully passive capture without agent cooperation

---

License

Licensed under either of MIT or Apache-2.0 at your option.

---

<p align="center"> <sub>Built with Rust. Governed by invariants. No black boxes.</sub> </p>
