AXON Epistemic MCP

Bemarking/axon-lang

Summary

Official ℰMCP server for AXON — exposes 45 primitives, 33 templates, 17 examples to AI agents.

README.md

<p align="center"> <strong>AXON</strong> <em>v2.3.0</em><br> The first formal cognitive language for AI — a 100% Rust + C23 native runtime with <strong>Cognitive I/O</strong>, real-time streaming, first-class HTTP endpoints, a four-pillar cognitive data plane, and <strong>session-typed WebSocket dialogue</strong> as a cognitive primitive (Caires–Pfenning linear-logic Curry–Howard). </p>

<p align="center"> <!-- Cognition primitives --> <code>persona</code> · <code>intent</code> · <code>flow</code> · <code>reason</code> · <code>anchor</code> · <code>refine</code> · <code>memory</code> · <code>tool</code> · <code>probe</code> · <code>weave</code> · <code>validate</code> · <code>context</code><br> <code>know</code> · <code>believe</code> · <code>speculate</code> · <code>doubt</code> · <code>par</code> · <code>hibernate</code><br> <code>dataspace</code> · <code>ingest</code> · <code>focus</code> · <code>associate</code> · <code>aggregate</code> · <code>explore</code><br> <code>deliberate</code> · <code>consensus</code> · <code>forge</code> · <code>agent</code> · <code>shield</code><br> <code>stream</code> · <code>effects</code> · <code>@contract_tool</code> · <code>@csp_tool</code><br> <code>pix</code> · <code>navigate</code> · <code>drill</code> · <code>trail</code> · <code>corpus</code><br> <code>psyche</code> · <code>ots</code><br> <code>mcp</code> · <code>taint</code> · <code>mandate</code> · <code>lambda</code><br> <code>compute</code> · <code>logic</code><br> <code>daemon</code> · <code>listen</code><br> <code>axonendpoint</code> · <code>axpoint</code> · <code>axonstore</code> · <code>apx</code><br> <!-- Cognitive I/O & compliance (λ-L-E calculus, Phases 1–9) --> <strong>Cognitive I/O:</strong> <code>resource</code> · <code>fabric</code> · <code>manifest</code> · <code>observe</code> · <code>reconcile</code> · <code>lease</code> · <code>ensemble</code><br> <code>topology</code> · <code>session</code> · <code>send</code> · <code>receive</code> · <code>select</code> · <code>branch</code> · <code>immune</code> · <code>reflex</code> · <code>heal</code> · <code>compliance</code><br> <code>component</code> · <code>view</code><br> <!-- NEW — Session-typed WebSocket dialogue (Fase 41, v2.3.0) --> <strong>Session types (v2.3.0):</strong> <code>socket</code> · <code>send T</code> · <code>receive T</code> · <code>select {ℓᵢ:…}</code> · <code>branch 
{ℓᵢ:…}</code> · <code>backpressure: credit(k)</code> · <code>reconnect: cognitive_state</code> </p>

<p align="center"> <img src="https://img.shields.io/badge/version-v2.3.0-informational" alt="Version"> <img src="https://img.shields.io/badge/status-production-brightgreen" alt="Status: Production"> <img src="https://img.shields.io/badge/runtime-100%25%20Rust%20%2B%20C23-orange" alt="100% Rust + C23"> <img src="https://img.shields.io/badge/streaming-SSE%20%7C%20NDJSON%20%7C%20WebSocket-brightgreen" alt="Streaming"> <img src="https://img.shields.io/badge/realtime-session--typed-purple" alt="Session types"> <img src="https://img.shields.io/badge/tests-2245%20axon--lang%20%2B%20535%20frontend%20%2B%2013%20csys-brightgreen" alt="Tests"> <img src="https://img.shields.io/badge/compliance-HIPAA%20%7C%20PCI__DSS%20%7C%20GDPR%20%7C%20SOX%20%7C%20SOC2%20%7C%20ISO27001%20%7C%20FIPS%20%7C%20CC%20EAL4%2B-blueviolet" alt="Compliance"> <img src="https://img.shields.io/badge/persistence-postgresql-blue" alt="PostgreSQL"> <img src="https://img.shields.io/badge/observability-tracing-green" alt="Tracing"> <img src="https://img.shields.io/badge/license-AGPL--3.0--or--later-lightgrey" alt="License"> </p>

---

Two repositories, two version lines. This repo (axon-lang, AGPL-3.0-or-later, public) ships the language + runtime + compiler + 7 LLM backends + Cognitive I/O + WebSocket session types — currently v2.3.0. The commercial control plane (axon-enterprise, EULA, private) layers multi-tenant identity / RBAC / SSO / metering / audit / vertical compliance on top of this language via a pinned Cargo dependency — currently v3.0.7. The version numbers diverge by design (enterprise iterates on the SaaS surface independently of the language). If you don't run a commercial Axon deployment, this repo is all you need; the badge above is the only version that matters for you.

---

What is AXON?

AXON is a compiled language that targets LLMs instead of CPUs. It has a formal EBNF grammar, a lexer, parser, AST, intermediate representation, seven native Rust LLM backends (Anthropic, OpenAI, Gemini, Kimi, GLM, Ollama, OpenRouter), and a 100% Rust + C23 native runtime with semantic type checking, an algebraic-effects execution engine, real-time SSE / NDJSON / WebSocket session-typed streaming, retry + circuit-breaker resilience, and execution tracing. The FIPS-routable cryptographic + tokenisation kernels live in axon-csys as standalone C23 (no opaque C bindings — every kernel is a _Generic-dispatched, [[nodiscard]]-annotated, sanitizer-clean C23 source file with a Rust wrapper).

Beyond cognition, AXON ships Cognitive I/O — a λ-calculus-based infrastructure layer where resources, control loops, observability, security kernels, and UI components carry their regulatory class (HIPAA / PCI_DSS / GDPR / SOX / SOC 2 / ISO 27001 / FIPS / CC EAL 4+) as a compile-time type. Programs that fail coverage are rejected before they run. No other programming language does this.

v2.3.0 (Fase 41) adds the first session-typed real-time dialogue primitive in any production language: declare a session (the bidirectional protocol), bind it to a socket (the WebSocket transport with credit-refined backpressure), and the compiler proves the two endpoints are duals (Caires–Pfenning linear-logic Curry–Howard); the runtime enforces every step, seals the residual cursor on disconnect for typed reconnection, and projects to W3C Server-Sent Events when the protocol is single-polarity. See docs/paper_websocket_cognitive_primitive.md.

It is not a Python library, a LangChain wrapper, a YAML DSL, or a Terraform replacement. It is a new kind of calculus — see docs/paper_lambda_lineal_epistemico.md for the formal semantics (Cálculo Lambda Lineal Epistémico).

---

Cognitive I/O — Build Infrastructure with Compile-Time Compliance

The major differentiator added in v1.0: ten new top-level declarations that make AXON the only language where "does this app leak PHI?" is a type error, not a post-mortem finding.

| Primitive | What it is | Formal backing |
|---|---|---|
| resource | Infrastructure token (DB, cache, bucket, GPU) with linear / affine / persistent lifetime | Linear Logic (Girard 1987) |
| fabric | Topological substrate (VPC, cluster, namespace) | Separation Logic (O'Hearn–Reynolds) |
| manifest | Declarative "belief" about desired infrastructure shape, with κ (regulatory class) annotations | Epistemic Logic (Fagin–Halpern) |
| observe | Quorum-gated snapshot of real state, producing a ΛD envelope ⟨c, τ, ρ, δ⟩ | Decision D4: partition ≡ void, never doubt |
| reconcile | Active-Inference control loop: observe → drift → shield → act | Free Energy Principle (Friston) |
| lease | τ-decaying affine capability; post-expiry use is a CT-2 Anchor Breach | Hybrid affine + revocation (D2) |
| ensemble | Byzantine quorum aggregator over N observations with common-knowledge fusion | Fagin–Halpern |
| topology + session | Typed directed graph over declared entities with Honda–Vasconcelos duality + deadlock detection | π-calculus binary sessions |
| socket (v2.3.0) | Session-typed WebSocket transport with credit-refined backpressure, typed reconnection via cognitive_states, SSE-as-fragment projection | Caires–Pfenning Curry–Howard + Rast credit-refined types (paper_websocket_cognitive_primitive.md) |
| immune + reflex + heal | KL-divergence anomaly sensor + O(1) signed-trace motor response + Linear-Logic one-shot patch FSM | Cognitive Immune System (paper_immune_v2.md) |
| component + view | Declarative UI with the same compile-time κ coverage rule: regulated types need a covering shield or the compiler rejects | Regulatory Type Theory (Fase 9) |

Hard differentiators vs. Terraform / Pulumi / Kubernetes manifests

  1. Compile-time compliance. shield<HIPAA> / type PatientRecord compliance [HIPAA, GDPR] are types. A .axon program that sends PHI to an unshielded endpoint fails axon check — same exit code as a syntax error.
  2. Blame Calculus (Findler–Felleisen). Every error is classified as CT-1 (axon/runtime bug), CT-2 (program author: anchor breach, expired lease), or CT-3 (infrastructure: partition, missing credential, provider quota). No silent downgrades.
  3. Audit-ready artefacts. axon dossier + axon sbom + axon audit --framework {soc2,iso27001,fips,cc,all} + axon evidence-package produce byte-identical, deterministic JSON/ZIP — the SHA-256 of every output is a contract against your release.
  4. 100% Rust + C23 runtime, no interpreter. The whole stack — lexer, parser, type-checker, IR, the algebraic-effects execution engine, the HTTP server, the seven LLM backends, the streaming wire, the session-typed WebSocket driver — is a single native Rust binary; the FIPS-routable cryptographic + tokeniser kernels live in axon-csys as standalone C23 (no unsafe glue: _Generic-dispatched headers, [[nodiscard]] everywhere, sanitizer-clean, valgrind-clean). Download a prebuilt or cargo build --release. No GC, no interpreter, no runtime dependency.
  5. Cognitive immune system. immune + reflex + heal is a first-class language primitive, not a plug-in. Signed HMAC traces per firing, three compliance modes (audit_only / human_in_loop / adversarial), Linear-Logic patch FSM preventing double-application.
  6. Post-Quantum-ready ESK. HMAC-SHA256 baseline + Ed25519 + ML-DSA-65 (NIST FIPS 204 Dilithium) + Hybrid signer (NIST SP 800-208 transition posture). Feature-gated; no silent classical fallbacks.
  7. Persistence is a typed cognitive primitive. A database in AXON is an axonstore, not an ORM bolt-on: retrieved rows are born epistemically Untrusted and a confidence_floor is enforced at read and write; every mutation appends to an HMAC-Merkle audit chain; retrieve is a bounded, back-pressured Stream<Row>; and store access is capability-typed and checked at compile time. No other language treats stored data this way.
  8. Real-time dialogue is a typed cognitive primitive (v2.3.0). A WebSocket in AXON is a socket over a declared session, not a JSON envelope over bytes: the compiler proves the two endpoints are duals (Caires–Pfenning intuitionistic linear logic, S̄ ≡ S⊥), so the connection is deadlock-free and protocol-conformant by construction; credit-refined backpressure (backpressure: credit(k)) is decidable in Presburger arithmetic at compile time; a mid-protocol disconnect seals the residual session-type cursor + credit window into an AAD-bound cognitive_states snapshot, and the typed ?resume= resume restores it under tenant + flow_id binding; a single-polarity protocol's socket ALSO speaks W3C Server-Sent Events byte-compat with Fase 33's existing SSE pipeline (S_SSE = Π_↓(S_WS)).
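
The duality claim in item 8 can be pictured with a small structural check. The sketch below is illustrative TypeScript, not the AXON compiler's code: the `Session` encoding, `dualOf`, and `sameSession` are hypothetical names invented for this example, and it covers only finite protocols with no recursion and no credit annotations.

```typescript
// Illustrative session-type duality check. The Session encoding is a
// hypothetical simplification, not AXON's internal representation.
type Session =
  | { kind: "end" }
  | { kind: "send"; payload: string; next: Session }
  | { kind: "receive"; payload: string; next: Session }
  | { kind: "select"; branches: Record<string, Session> }
  | { kind: "branch"; branches: Record<string, Session> };

// Swap send/receive and select/branch at every step of the protocol.
function dualOf(s: Session): Session {
  switch (s.kind) {
    case "end":
      return s;
    case "send":
      return { kind: "receive", payload: s.payload, next: dualOf(s.next) };
    case "receive":
      return { kind: "send", payload: s.payload, next: dualOf(s.next) };
    case "select":
      return { kind: "branch", branches: dualBranches(s.branches) };
    case "branch":
      return { kind: "select", branches: dualBranches(s.branches) };
  }
}

function dualBranches(bs: Record<string, Session>): Record<string, Session> {
  const out: Record<string, Session> = {};
  for (const label of Object.keys(bs)) out[label] = dualOf(bs[label]);
  return out;
}

// Structural equality on session types.
function sameSession(a: Session, b: Session): boolean {
  switch (a.kind) {
    case "end":
      return b.kind === "end";
    case "send":
      return b.kind === "send" && a.payload === b.payload && sameSession(a.next, b.next);
    case "receive":
      return b.kind === "receive" && a.payload === b.payload && sameSession(a.next, b.next);
    case "select":
      return b.kind === "select" && sameBranches(a.branches, b.branches);
    case "branch":
      return b.kind === "branch" && sameBranches(a.branches, b.branches);
  }
}

function sameBranches(x: Record<string, Session>, y: Record<string, Session>): boolean {
  const lx = Object.keys(x).sort();
  const ly = Object.keys(y).sort();
  return lx.join("|") === ly.join("|") && lx.every((l) => sameSession(x[l], y[l]));
}

const END: Session = { kind: "end" };
// Client sends a Query, then waits for the server to pick a branch.
const client: Session = {
  kind: "send", payload: "Query",
  next: { kind: "branch", branches: {
    answer: { kind: "receive", payload: "Answer", next: END },
    failed: END,
  } },
};
// Server receives the Query, then selects one of the same labels.
const server: Session = {
  kind: "receive", payload: "Query",
  next: { kind: "select", branches: {
    answer: { kind: "send", payload: "Answer", next: END },
    failed: END,
  } },
};

console.log(sameSession(dualOf(client), server)); // true: the endpoints are duals
```

Two endpoints are accepted only when one is exactly the dual of the other; applying `dualOf` twice returns the original protocol, which is the involution property duality is expected to have.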

External audit readiness

The audit engine ships 108 mapped controls across the four major external frameworks (SOC 2, ISO 27001, FIPS, and Common Criteria, matching the axon audit --framework options).

Each framework has an operational runbook (docs/compliance/runbook_*.md) and a CI workflow (.github/workflows/audit_evidence.yml) that emits the evidence ZIP on every release.

Try it in 30 seconds

pip install axon-lang           # or: download the Rust binary from Releases
echo 'type PatientRecord compliance [HIPAA, GDPR] { ssn: String }
shield PHIShield { scan: [pii_leak] on_breach: halt severity: critical
                   compliance: [HIPAA, GDPR] }
axonendpoint Api { method: POST path: "/p" body: PatientRecord
                   execute: F output: PatientRecord shield: PHIShield
                   compliance: [HIPAA, GDPR] }
flow F(r: PatientRecord) -> PatientRecord {
  step R { ask: "summarize" output: PatientRecord } }' > app.axon
axon check   app.axon   # compile-time compliance verification
axon dossier app.axon   # regulatory posture JSON
axon audit   app.axon --framework all   # per-framework gap analysis

Remove the shield line and axon check fails with "endpoint 'Api' sends regulated type '{HIPAA, GDPR}' without a covering shield — ESK Fase 6.1 coverage rule". That failure is a type error, not a lint warning.
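
The coverage rule behind that error can be read as a set-inclusion check. The sketch below is illustrative TypeScript rather than the compiler's implementation: `ShieldDecl`, `EndpointDecl`, and `uncoveredClasses` are hypothetical names, and it assumes "covered" means every regulatory class on the endpoint's body type also appears in the shield's compliance list.

```typescript
// Illustrative coverage check; every name here is hypothetical and is not
// part of the AXON compiler's API.
type RegClass = "HIPAA" | "PCI_DSS" | "GDPR" | "SOX";

interface ShieldDecl {
  name: string;
  compliance: RegClass[];
}

interface EndpointDecl {
  name: string;
  bodyCompliance: RegClass[]; // classes carried by the body type
  shield?: ShieldDecl;        // undefined when no shield is declared
}

// Returns the classes that no shield covers; an empty result means "accepted".
function uncoveredClasses(ep: EndpointDecl): RegClass[] {
  const covered: RegClass[] = ep.shield ? ep.shield.compliance : [];
  return ep.bodyCompliance.filter((k) => covered.indexOf(k) === -1);
}

const phiShield: ShieldDecl = { name: "PHIShield", compliance: ["HIPAA", "GDPR"] };
const shielded: EndpointDecl = { name: "Api", bodyCompliance: ["HIPAA", "GDPR"], shield: phiShield };
const unshielded: EndpointDecl = { name: "Api", bodyCompliance: ["HIPAA", "GDPR"] };

console.log(uncoveredClasses(shielded).length);      // 0: endpoint is covered
console.log(uncoveredClasses(unshielded).join(",")); // HIPAA,GDPR: would be rejected
```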

---

Production Status

AXON v2.3.0 is production-ready. The full stack is cross-validated, 100% Rust + C23:

  • ✅ 65+ cognitive + Cognitive-I/O primitives wired to the native runtime
  • ✅ 285 HTTP routes tested end-to-end
  • ✅ Seven native Rust LLM backends (Anthropic, OpenAI, Gemini, Kimi, GLM, Ollama, OpenRouter) with full async streaming
  • ✅ Real-time streaming wire — SSE + NDJSON, type-driven transport inference, per-chunk algebraic-effect dispatch
  • Session-typed WebSocket dialogue (v2.3.0) — declared session + socket, statically-checked duality (Caires–Pfenning), credit-refined backpressure (Presburger discharge), typed reconnection via AAD-bound cognitive_states snapshots, SSE-as-fragment unification
  • Multiparty projection (v2.3.0): GlobalType + project_all (Honda–Yoshida–Carbone safe-realizability gate) for n-agent skill/tool topologies
  • axonendpoint as a first-class HTTP REST primitive — typed routes, body + output schema validation, Idempotency-Key, auth scopes
  • axonstore cognitive data plane — epistemically typed rows, HMAC-Merkle audit chains, Stream<Row>, capability-typed access
  • ✅ Compile-time regulatory compliance for HIPAA / PCI_DSS / GDPR / SOX / SOC 2 / ISO 27001 / FIPS / CC EAL 4+
  • ✅ Cognitive immune system (anomaly detection + reflex + heal) paper-faithful
  • ✅ Post-Quantum signatures: HMAC-SHA256 baseline + Ed25519 + ML-DSA-65 + Hybrid (NIST SP 800-208)
  • axon-csys C23 kernels — FIPS-routable SHA-256 / HMAC-SHA256 / SIMD G.711 / BPE tokeniser / FSM dispatch (computed gotos) / buffer pool (207× faster than Vec<u8>) — standalone C23 with sanitizer-clean + valgrind-clean CI lanes
  • ✅ PostgreSQL persistence with migrations and health checks
  • ✅ Structured observability (JSON logging + request tracing)
  • ✅ LLM call resilience (retry + circuit breaker + fallback)
  • 2,245 axon-lang + 535 axon-frontend + 13 axon-csys = 2,793 Rust tests; cross-stack zero-regression discipline. Python side is now a thin PyPI wrapper that downloads + invokes the native Rust binary (the language interpreter is 100% Rust/C23 — the Python suite was retired in Fase 40's Pure Silicon pivot).
  • ✅ Zero "for now", zero "bare minimum": production-complete

Designed for cognitive AI applications that require formal semantics, reliability, epistemic rigor, and provable regulatory coverage.

persona LegalExpert {
    domain: ["contract law", "IP", "corporate"]
    tone: precise
    confidence_threshold: 0.85
    refuse_if: [speculation, unverifiable_claim]
}

anchor NoHallucination {
    require: source_citation
    confidence_floor: 0.75
    unknown_response: "Insufficient information"
}

⚠️ enforce is the behavioral carrier in anchors. It is the ONLY anchor field injected as a direct behavioral directive to the LLM. require/reject are post-generation validation constraints. description is metadata-only — it does NOT reach the model. Use enforce for text that must shape the model's behavior.

flow AnalyzeContract(doc: Document) -> StructuredReport {
    step Extract {
        probe doc for [parties, obligations, dates, penalties]
        output: EntityMap
    }
    step Assess {
        reason {
            chain_of_thought: enabled
            given: Extract.output
            ask: "Are there ambiguous or risky clauses?"
            depth: 3
        }
        output: RiskAnalysis
    }
    step Check {
        validate Assess.output against: ContractSchema
        if confidence < 0.8 -> refine(max_attempts: 2)
        output: ValidatedAnalysis
    }
    step Report {
        weave [Extract.output, Check.output]
        format: StructuredReport
        include: [summary, risks, recommendations]
    }
}

---

Native Rust + C23 Runtime

AXON v2.3.0 ships a production-hardened 100% Rust + C23 native runtime server with 285+ HTTP routes, 65+ primitives wired to runtime, an algebraic-effects execution engine, a real-time SSE / NDJSON / session-typed WebSocket streaming wire, a full ℰMCP (Epistemic Model Context Protocol) implementation, PostgreSQL persistence, structured observability via tracing, LLM call resilience (retry + circuit breaker + fallback chains across seven native backends), and a complete native CLI (check, compile, run, serve, parse, dossier, sbom, audit, evidence-package, and more).

The Rust + C23 stack is the canonical implementation of the language. Fase 40 (Pure Silicon, v2.0.0) retired the Python interpreter — the original pip install axon-lang package is now a thin wrapper that downloads + invokes the native Rust binary. The FIPS-routable cryptographic + tokeniser kernels live in axon-csys as standalone C23 (no unsafe glue: _Generic-dispatched headers, [[nodiscard]] everywhere, sanitizer-clean + valgrind-clean CI lanes).

Production Foundation (Phase K):

  • Observability: JSON structured logging with request tracing, daily log rotation, configurable levels
  • Resilience: Exponential backoff retry, per-provider circuit breakers, configurable fallback chains across 7 LLM backends
  • Persistence: Full PostgreSQL integration with embedded migrations, JSONB storage, in-memory fallback for development

Quickstart

# Build the native runtime
cd axon-rs
cargo build --release

# Start the server with default in-memory storage
cargo run --release -- --port 3000

# Or with PostgreSQL persistence + structured logging
# (export first, so "$DATABASE_URL" is already set when the shell expands it below)
export DATABASE_URL="postgresql://user:pass@localhost/axon"
cargo run --release -- \
  --port 3000 \
  --log-format json \
  --log-file ./logs \
  --database-url "$DATABASE_URL"

# Deploy a flow
curl -X POST http://localhost:3000/v1/deploy \
  -H "Content-Type: application/json" \
  -d '{"source": "flow analyze { step reason { prompt: \"Analyze the input\" } }", "backend": "stub"}'

# Execute
curl -X POST http://localhost:3000/v1/execute/analyze

# MCP endpoint (JSON-RPC 2.0)
curl -X POST http://localhost:3000/v1/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Phase K — Production Hardening (v1.0.0 foundation)

AXON v1.0.0 launched with three production-critical systems that remain the foundation of every subsequent minor release:

1. Observability (K1)

  • Structured logging via tracing crate with JSON output
  • Request tracing with UUID correlation (x-request-id header)
  • Daily log rotation with configurable directory
  • Configurable levels via AXON_LOG env or --log-level CLI
  • Instrumentation on all LLM calls: backend, model, latency_ms, tokens_in/out

2. Resilience (K2)

  • Exponential backoff retry (500ms base, 2.0x multiplier, 30s max, jitter)
  • Per-provider circuit breaker (5 failures → Open, 30s cooldown → HalfOpen, 2 successes → Closed)
  • Honors the Retry-After header as a rate-limit hint
  • Fallback chains (e.g., anthropic → openrouter → ollama)
  • Error classification (retryable vs. terminal)
  • Covers all 7 LLM backends: Anthropic, OpenAI, Gemini, Kimi, GLM, OpenRouter, Ollama
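
The numbers in the K2 list can be exercised in a few lines. This is an illustrative TypeScript sketch, not the runtime's Rust code: `backoffDelayMs` and `CircuitBreaker` are hypothetical names, jitter is omitted so the delays stay deterministic, and two behaviours are assumptions not stated above (failures are counted consecutively, and any failure while HalfOpen reopens the breaker).

```typescript
// Illustrative sketch of the K2 parameters; names are hypothetical.
// Delay before retry number `attempt` (0-based): 500 ms base, 2.0x, 30 s cap.
function backoffDelayMs(attempt: number): number {
  const baseMs = 500;
  const multiplier = 2.0;
  const maxMs = 30000;
  return Math.min(baseMs * Math.pow(multiplier, attempt), maxMs);
}

type BreakerState = "Closed" | "Open" | "HalfOpen";

class CircuitBreaker {
  state: BreakerState = "Closed";
  private failures = 0;
  private successes = 0;
  private openedAtMs = 0;

  // nowMs is passed in explicitly so the example needs no real clock.
  allowRequest(nowMs: number): boolean {
    if (this.state === "Open" && nowMs - this.openedAtMs >= 30000) {
      this.state = "HalfOpen"; // cooldown elapsed: let a probe through
      this.successes = 0;
    }
    return this.state !== "Open";
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    // Assumption: a failure during HalfOpen reopens immediately.
    if (this.state === "HalfOpen" || this.failures >= 5) {
      this.state = "Open";
      this.openedAtMs = nowMs;
      this.failures = 0;
    }
  }

  recordSuccess(): void {
    if (this.state === "HalfOpen") {
      this.successes += 1;
      if (this.successes >= 2) this.state = "Closed";
    } else {
      this.failures = 0; // assumption: only consecutive failures count
    }
  }
}

console.log([0, 1, 2, 6].map(backoffDelayMs).join(", ")); // 500, 1000, 2000, 30000

const breaker = new CircuitBreaker();
for (let i = 0; i < 5; i++) breaker.recordFailure(0); // five failures at t=0 open it
console.log(breaker.allowRequest(29999)); // false: still cooling down
console.log(breaker.allowRequest(30000)); // true: HalfOpen probe allowed
breaker.recordSuccess();
breaker.recordSuccess();                  // second success closes the breaker
console.log(breaker.state);
```

A fallback chain would sit one level above this: when a provider's breaker refuses the request, the caller moves on to the next backend in the chain.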

3. Persistence (K3-K4)

  • PostgreSQL backend with full ACID semantics
  • 12 domain tables (traces, sessions, daemons, audit_log, axon_stores, dataspaces, hibernations, event_history, execution_cache, cost_tracking, schedules, backend_registry)
  • 15 performance indexes for query optimization
  • Embedded migrations (zero external DB setup for development)
  • UPSERT semantics for idempotent writes
  • JSONB storage for nested structures
  • In-memory fallback when DATABASE_URL unset (perfect for development & CI/CD)

Write-through pattern ensures all state mutations (flows, sessions, daemons, hibernations) persist to PostgreSQL while maintaining fast in-process reads.

Architectural Decisions

Storage Pattern: StorageDispatcher Enum

  • Uses concrete dispatch via StorageDispatcher enum instead of dyn Trait
  • Enables zero-cost abstraction: calls to PostgresBackend or InMemoryBackend are resolved by a match on the enum variant, with no vtable lookup
  • No runtime trait object overhead, full optimization from compiler

Async/Await Safety

  • All storage operations are async, and mutex guards are never held across await points
  • Mutex locks are released before database I/O
  • Prevents deadlocks and enables high concurrency

Graceful Fallback

  • Database connection failures don't crash the server
  • Automatic fallback to InMemoryBackend (with logging)
  • State persists in-memory for the process lifetime
  • Clients experience no service interruption

---

Runtime Surface

| Surface | Count |
|---------|-------|
| HTTP API routes | 285 |
| Language primitives | 65+ (cognitive + Cognitive I/O) |
| LLM backends | 7 (anthropic, openai, gemini, kimi, glm, openrouter, ollama); native async, all streaming |
| MCP tool types | 8 (flow, dataspace, axonstore, shield, corpus, compute, mandate, forge) |
| MCP resource types | 10 (traces, metrics, backends, flows, dataspaces, axonstores, shields, corpora, mandates, forges) |
| MCP workflow prompts | 5 (research, decide, secure_transfer, reflect, analyze_image) |
| axon-lang (axon-rs) lib tests | 2,245 |
| axon-frontend lib tests | 535 (incl. §41.a-c algebra + §41.h multiparty) |
| axon-csys lib tests | 13 (Rust wrapper; C23 kernels exercised by cargo test + sanitizers/valgrind in CI) |
| Rust workspace total | 2,793 (zero regressions) |
| Python wrapper tests | 16 (the thin PyPI wrapper that downloads + invokes the native binary; the interpreter was retired in Fase 40) |
| SQL tables | 12 (traces, sessions, daemons, audit_log, axon_stores, dataspaces, hibernations, event_history, execution_cache, cost_tracking, schedules, backend_registry) |
| Performance indexes | 15 |

ΛD (Lambda Data) — Epistemic Guarantees

Every AXON operation carries a formal epistemic envelope ψ = ⟨T, V, E=⟨c, τ, ρ, δ⟩⟩:

  • Theorem 5.1: Only raw data may carry certainty c=1.0; all derived operations cap at c≤0.99
  • Epistemic Lattice: ⊥ ⊑ doubt ⊑ speculate ⊑ believe ⊑ know
  • Blame Calculus: CT-2 (caller) / CT-3 (server) / Network attribution on every error
  • CSP §5.3: MCP tools carry constraint satisfaction schemas
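
Theorem 5.1 and the lattice order are simple enough to sketch directly. The TypeScript below is illustrative only: `Envelope`, `rawValue`, `derive`, and `leq` are hypothetical names, only the certainty component c of the envelope is modelled, and the rule that a derived value is no more certain than its weakest input is an assumption made for the example (the text above only states the 0.99 cap).

```typescript
// Illustrative sketch; names are hypothetical and only certainty (c) is modelled.
const LATTICE = ["bottom", "doubt", "speculate", "believe", "know"] as const;
type Level = typeof LATTICE[number];

// a ⊑ b when a sits at or below b in the chain ⊥ ⊑ doubt ⊑ ... ⊑ know.
function leq(a: Level, b: Level): boolean {
  return LATTICE.indexOf(a) <= LATTICE.indexOf(b);
}

interface Envelope {
  certainty: number;
  derived: boolean;
}

// Raw data may carry any certainty up to 1.0.
function rawValue(certainty: number): Envelope {
  return { certainty: Math.min(Math.max(certainty, 0), 1), derived: false };
}

// Derived results are capped at 0.99 (Theorem 5.1).
// Assumption: the result is also bounded by the weakest input.
function derive(inputs: Envelope[]): Envelope {
  const weakest = inputs.reduce((m, e) => Math.min(m, e.certainty), 1.0);
  return { certainty: Math.min(weakest, 0.99), derived: true };
}

console.log(derive([rawValue(1.0), rawValue(1.0)]).certainty); // 0.99
console.log(derive([rawValue(1.0), rawValue(0.6)]).certainty); // 0.6
console.log(leq("doubt", "know"), leq("know", "believe"));     // true false
```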

TypeScript SDK

import { AxonClient } from "@axon/mcp-client";

const client = new AxonClient({ baseUrl: "http://localhost:3000" });
await client.initialize();

// Discover and call tools
const tools = await client.listTools();
const result = await client.callTool("axon_compute_evaluate", { expression: "pi * 2" });
const envelope = AxonClient.extractEnvelope(result);
console.log(envelope?.certainty); // 0.99 (transcendental → derived)

// Read resources
const backends = await client.readResource("axon://backends");

// Get workflow prompts
const prompt = await client.getPrompt("workflow:research", { question: "How does attention work?" });

Full language specification: docs/axon_language_specification.md

---

Paradigm Shifts

AXON's compiler-level paradigm shifts elevate the language from prompt compilation to a Cognitive Operating System.

I. Formal Model — Epistemic Constraint Calculus

Each program P in AXON operates over a typed epistemic lattice (T, ≤) where the compiler enforces semantic constraints at compile time. The paradigm shifts extend this with three new formal mechanisms:

Epistemic Scoping Function. Given an epistemic mode m ∈ {know, believe, speculate, doubt}, the compiler applies a constraint function C(m) that maps to a tuple of LLM parameters and auto-injected anchors:

C : Mode → (τ, p, A)
where
  τ ∈ [0,1]    — temperature override
  p ∈ [0,1]    — nucleus sampling (top_p)
  A ⊆ Anchors  — auto-injected constraint set

C(know)      = (0.1, 0.3, {RequiresCitation, NoHallucination})
C(believe)   = (0.3, 0.5, {NoHallucination})
C(speculate) = (0.9, 0.95, ∅)
C(doubt)     = (0.2, 0.4, {RequiresCitation, SyllogismChecker})

This is calculated at compile time — the IR carries the resolved constraint set, so the executor applies them as zero-cost runtime overrides.
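
The scoping function C is a finite lookup, so it can be sketched as a table. The TypeScript below is an illustration, not the compiler's code: `SCOPING` and `resolveConstraint` are hypothetical names, the numeric values are copied from the definition above, and merging auto-injected anchors with user-declared ones (deduplicated, auto-injected first) is an assumption about how the two sets combine.

```typescript
// Illustrative encoding of C : Mode → (τ, p, A); names are hypothetical.
type EpistemicMode = "know" | "believe" | "speculate" | "doubt";

interface Constraint {
  temperature: number; // τ
  topP: number;        // p
  anchors: string[];   // A
}

const SCOPING: Record<EpistemicMode, Constraint> = {
  know:      { temperature: 0.1, topP: 0.3,  anchors: ["RequiresCitation", "NoHallucination"] },
  believe:   { temperature: 0.3, topP: 0.5,  anchors: ["NoHallucination"] },
  speculate: { temperature: 0.9, topP: 0.95, anchors: [] },
  doubt:     { temperature: 0.2, topP: 0.4,  anchors: ["RequiresCitation", "SyllogismChecker"] },
};

// Resolve the constraint once, so the result can be stored on an IR node.
// Assumption: user-declared anchors are appended after the auto-injected ones.
function resolveConstraint(mode: EpistemicMode, userAnchors: string[]): Constraint {
  const base = SCOPING[mode];
  const extra = userAnchors.filter((a) => base.anchors.indexOf(a) === -1);
  return { temperature: base.temperature, topP: base.topP, anchors: base.anchors.concat(extra) };
}

const resolved = resolveConstraint("know", ["NoHallucination", "LegalTone"]);
console.log(resolved.temperature);        // 0.1
console.log(resolved.anchors.join(", ")); // RequiresCitation, NoHallucination, LegalTone
```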

Parallel DAG Scheduling. A par block B = {b₁, ..., bₙ} where n ≥ 2 is verified at compile time to have no data dependencies between branches:

∀ bᵢ, bⱼ ∈ B, i ≠ j : deps(bᵢ) ∩ outputs(bⱼ) = ∅

At runtime, branches execute via asyncio.gather, achieving O(max(tᵢ)) latency instead of O(Σtᵢ) for sequential chains.
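
The independence condition can be checked with a pairwise scan. The following TypeScript is a minimal sketch under assumptions: `ParBranch` and `parConflicts` are hypothetical names, and dependencies and outputs are modelled as plain string identifiers rather than IR nodes.

```typescript
// Illustrative check of: ∀ bᵢ, bⱼ ∈ B, i ≠ j : deps(bᵢ) ∩ outputs(bⱼ) = ∅
// Names and the string-based encoding are hypothetical.
interface ParBranch {
  id: string;
  deps: string[];    // values this branch reads
  outputs: string[]; // values this branch produces
}

// Returns every [reader, writer] pair that violates the condition.
function parConflicts(branches: ParBranch[]): Array<[string, string]> {
  if (branches.length < 2) throw new RangeError("a par block needs at least 2 branches");
  const conflicts: Array<[string, string]> = [];
  for (const bi of branches) {
    for (const bj of branches) {
      if (bi === bj) continue;
      if (bi.deps.some((d) => bj.outputs.indexOf(d) !== -1)) conflicts.push([bi.id, bj.id]);
    }
  }
  return conflicts;
}

const financial: ParBranch = { id: "Financial", deps: ["contract"], outputs: ["Financial.output"] };
const regulatory: ParBranch = { id: "Regulatory", deps: ["contract"], outputs: ["Regulatory.output"] };
// This branch reads another branch's output, so it cannot run in the same par block.
const dependent: ParBranch = { id: "Precedent", deps: ["Financial.output"], outputs: ["Precedent.output"] };

console.log(parConflicts([financial, regulatory]).length); // 0: safe to run in parallel
console.log(parConflicts([financial, dependent]).length);  // 1: Precedent depends on Financial
```

Only when the conflict list is empty can the branches be dispatched together, which is what makes the O(max(tᵢ)) latency bound reachable.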

CPS Continuation Points. A hibernate node generates a deterministic continuation ID via SHA-256(flow_name ∥ event_name ∥ source_position). The executor serializes the full ExecutionState (call stack, step results, context variables) and halts. On resume(continuation_id), the state is deserialized and execution continues from the exact IR node — implementing Continuation-Passing Style at the language level.

II. Design Philosophy — Programming Epistemic States

Traditional LLM frameworks treat every model call identically — the same temperature, the same constraints, the same trust level. This is the equivalent of asking a human to treat brainstorming and sworn testimony with the same cognitive rigor.

AXON rejects this flat model. Epistemic Directives make the confidence state of the AI a first-class construct in the language:

know {
    flow ExtractFacts(doc: Document) -> CitedFact {
        step Verify { ask: "Extract only verifiable facts" output: CitedFact }
    }
}

speculate {
    flow Brainstorm(topic: String) -> Opinion {
        step Imagine { ask: "What could be possible?" output: Opinion }
    }
}

The compiler does not merely label these blocks — it structurally transforms them. A know block injects citation anchors and drops temperature to 0.1, making hallucination a compile-time constraint violation. A speculate block removes all constraints and raises temperature to 0.9, liberating the model.

Parallel Cognitive Dispatch mirrors how human organizations work: delegate independent analyses to specialists concurrently, then synthesize.

Dynamic State Yielding transforms agents from expensive while True loops into event-driven processes that can sleep for days, weeks, or months — then resume with full context. The language handles the serialization; the developer writes hibernate until "event_name" and moves on.

III. Real-World Use Cases

Use Case 1: Legal Document Analysis Pipeline

A law firm needs to analyze contracts with maximum factual rigor, while also exploring creative legal strategies. AXON separates these cognitive modes at the language level:

know {
    flow ExtractClauses(contract: Document) -> ClauseMap {
        step Parse { probe contract for [parties, obligations, penalties] output: ClauseMap }
    }
}

flow AnalyzeRisk(contract: Document) -> StructuredReport {
    par {
        step Financial { ask: "Analyze financial exposure" output: RiskScore }
        step Regulatory { ask: "Check regulatory compliance" output: ComplianceReport }
        step Precedent { ask: "Find relevant case law" output: CaseList }
    }
    weave [Financial, Regulatory, Precedent] into Report { format: StructuredReport }
}

speculate {
    flow ExploreStrategies(report: StructuredReport) -> Opinion {
        step Creative { ask: "What unconventional strategies could mitigate these risks?" output: Opinion }
    }
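}

  • know guarantees citation-backed extraction (temperature 0.1)
  • par runs the 3 analyses concurrently, so latency approaches that of the slowest step (up to ~3x faster than running them in sequence when the steps take similar time)
  • speculate explicitly relaxes constraints for creative strategy exploration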

Use Case 2: Multi-Agent Research & Intelligence System

A BI platform deploys autonomous research agents that run for weeks, hibernating between data collection phases:

flow MarketIntelligence(sector: String) -> Report {
    know {
        flow GatherData(sector: String) -> DataSet {
            step Collect { ask: "Gather verified market data" output: DataSet }
        }
    }

    par {
        step Trends { ask: "Identify emerging trends" output: TrendAnalysis }
        step Competitors { ask: "Map competitor landscape" output: CompetitorMap }
    }

    hibernate until "quarterly_data_available"

    doubt {
        flow ValidateFindings(data: DataSet) -> ValidatedReport {
            step CrossCheck { ask: "Challenge every assumption with evidence" output: ValidatedReport }
        }
    }

    weave [Trends, Competitors] into Final { format: Report }
}

  • Agent hibernates after initial analysis, costing $0 while waiting
  • Resumes automatically when quarterly data arrives (webhook/cron)
  • doubt mode forces adversarial validation with syllogism checking

Use Case 3: Autonomous Customer Support with Escalation

A SaaS platform handles support tickets with different confidence requirements and automatic escalation via hibernate:

persona SupportAgent {
    domain: ["product knowledge", "troubleshooting"]
    tone: empathetic
    confidence_threshold: 0.8
}

flow HandleTicket(ticket: String) -> Resolution {
    know {
        flow DiagnoseIssue(ticket: String) -> Diagnosis {
            step Classify { ask: "Classify the issue type and severity" output: Diagnosis }
        }
    }

    believe {
        flow SuggestSolution(diagnosis: Diagnosis) -> Solution {
            step Solve { ask: "Propose a solution based on known patterns" output: Solution }
        }
    }

    if confidence < 0.7 -> hibernate until "human_review_complete"

    step Respond { ask: "Draft customer response" output: Resolution }
}

  • know classifies with strict accuracy (no guessing on severity)
  • believe suggests solutions with moderate confidence
  • Low confidence triggers hibernate — agent sleeps until a human reviews
  • Zero compute cost during human review; resumes with full context

IV. Directed Creative Synthesis — the forge Primitive

AXON introduces a sixth paradigm shift: mathematical formalization of the creative process inside LLMs.

The industry suffers from a structural limitation: LLMs can interpolate, but they struggle to _create_. forge addresses this by implementing a compiler-level Poincaré pipeline — the same 4-phase process mathematicians and scientists use when producing genuinely novel work.

Poincaré-Hadamard Creative Pipeline. A forge block orchestrates four sequential phases, each mapped to a distinct LLM configuration:

forge(seed, mode, novelty, depth, branches) → result

Phase 1: PREPARATION   — Expand the seed via context probing
Phase 2: INCUBATION    — Speculative exploration (depth iterations)
Phase 3: ILLUMINATION  — Best-of-N consensus crystallization
Phase 4: VERIFICATION  — Adversarial doubt + anchor validation

Boden Creativity Taxonomy. The mode parameter maps Margaret Boden's three creativity types to concrete LLM parameter overrides at compile time:

B : Mode → (τ, freedom, rule_flexibility)

B(combinatory)      = (0.9,  0.8, 0.3)   — novel recombination of known ideas
B(exploratory)      = (0.7,  0.6, 0.5)   — structured navigation of possibility spaces
B(transformational) = (1.2,  1.0, 0.9)   — rule-breaking synthesis, new paradigms

Novelty Operator K(x|K). The novelty parameter (0.0–1.0) controls the Kolmogorov-inspired tradeoff between utility and surprise. It blends into the effective temperature used during incubation:

τ_eff = τ_base × (0.5 + 0.5 × novelty)

novelty = 0.0 → τ_eff = 0.5 × τ_base  (conservative, high utility)
novelty = 1.0 → τ_eff = 1.0 × τ_base  (maximum divergence, high surprise)
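
The Boden table and the novelty blend combine into a one-line computation. The TypeScript below is a small sketch of that arithmetic: `BASE_TEMPERATURE` and `effectiveTemperature` are hypothetical names, only the τ column of B is modelled, and the range check on novelty is an assumption based on the stated 0.0–1.0 interval.

```typescript
// Illustrative computation of τ_eff = τ_base × (0.5 + 0.5 × novelty).
// Names are hypothetical; only the τ column of B(mode) is used.
type ForgeMode = "combinatory" | "exploratory" | "transformational";

const BASE_TEMPERATURE: Record<ForgeMode, number> = {
  combinatory: 0.9,
  exploratory: 0.7,
  transformational: 1.2,
};

function effectiveTemperature(mode: ForgeMode, novelty: number): number {
  if (novelty < 0 || novelty > 1) throw new RangeError("novelty must be in [0, 1]");
  return BASE_TEMPERATURE[mode] * (0.5 + 0.5 * novelty);
}

console.log(effectiveTemperature("combinatory", 0.0).toFixed(2));       // 0.45
console.log(effectiveTemperature("exploratory", 1.0).toFixed(2));       // 0.70
console.log(effectiveTemperature("transformational", 0.85).toFixed(2)); // 1.11
```

Because the blend factor ranges over [0.5, 1.0], novelty can only lower the mode's base temperature, never raise it above τ_base.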

Usage example — Directed Creative Synthesis:

anchor GoldenRatio {
    require: aesthetic_harmony
    confidence_floor: 0.70
}

flow CreateVisualConcept(brief: String) -> Visual {
    forge Artwork(seed: "aurora borealis over ancient ruins") -> Visual {
        mode:        transformational
        novelty:     0.85
        constraints: GoldenRatio
        depth:       4
        branches:    7
    }
}

run CreateVisualConcept("Create a visual concept for a film poster")

What the compiler does:

  1. Preparation: expands "aurora borealis over ancient ruins" into a rich conceptual foundation via context probing
  2. Incubation: runs 4 iterations of speculative exploration at τ_eff = 1.2 × 0.925 = 1.11, pushing beyond obvious associations
  3. Illumination: launches 7 parallel branches, each crystallizing the incubated ideas, then selects the most coherent output (Best-of-N)
  4. Verification: applies adversarial doubt against the GoldenRatio anchor, validating that the result is genuinely novel (K(x|K) > 0) and aesthetically balanced

This is not a prompt template. The forge primitive compiles to structured IR metadata that the runtime executes as an orchestrated pipeline — the same precision AXON applies to every other cognitive primitive.

V. Autonomous Goal-Seeking — the agent Primitive

AXON introduces a seventh paradigm shift: compiler-verified autonomous agents grounded in the Belief-Desire-Intention (BDI) architecture, epistemic logic, and coinductive semantics.

Every existing LLM framework implements agents as Python classes with ad-hoc while-loops, hidden state machines, and zero formal guarantees. LangChain's AgentExecutor is a runtime artifact: it cannot be statically analyzed, type-checked, or budget-bounded at compile time. AXON's agent primitive makes autonomous goal-seeking a first-class compiled construct with mathematical semantics.

BDI Coinductive Semantics. An agent declaration compiles to a coinductive BDI system — a state machine whose behavior is defined by an infinite observation/transition pair over the epistemic lattice:

Agent ≅ ν X. (S × (Action → X))

where
  S        = Beliefs × Goals × Plans    — cognitive state
  Action   = Observe | Deliberate | Act | Reflect
  ν        = greatest fixpoint (coinduction — runs indefinitely)

The ν (nu) operator is the key: unlike inductive data (finite trees), a coinductive agent is a potentially infinite stream of state transitions, terminating only when the goal is achieved or a budget is exhausted. This formalization is not decorative — it determines the compiler's verification strategy and the executor's loop semantics.

Epistemic Lattice Convergence. At each BDI cycle, the agent's epistemic state is projected onto the same lattice (T, ≤) used by epistemic directives. The deliberation phase produces a state σ ∈ {know, believe, speculate, doubt} and a boolean goal_achieved. The convergence criterion is:

Converge(σ, g) = g = true ∧ σ ≥ believe

Diverge(σ, i, n) = σ = doubt ∧ Δσ = 0 ∧ i ≥ n
  where
    Δσ       = σᵢ - σᵢ₋₁   — epistemic progress between cycles
    i        = current iteration
    n        = stuck_window  — consecutive stagnation threshold

When Converge fires, the agent terminates successfully. When Diverge fires, the on_stuck recovery policy activates — escalate raises AgentStuckError, forge triggers creative re-seeding via the Poincaré pipeline, retry resets and re-attempts.
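
The two predicates can be written down directly. The following Python sketch is a reading of the formulas above, not AXON's implementation. It assumes the lattice order doubt < speculate < believe < know, and it interprets i as the number of cycles completed so far. The names `Epistemic`, `converge`, and `diverge` are hypothetical.

```python
from enum import IntEnum
from typing import List


class Epistemic(IntEnum):
    """Epistemic levels, ordered from weakest to strongest (assumed order)."""
    DOUBT = 0
    SPECULATE = 1
    BELIEVE = 2
    KNOW = 3


def converge(sigma: Epistemic, goal_achieved: bool) -> bool:
    """Converge(sigma, g) = (g = true) and (sigma >= believe)."""
    return goal_achieved and sigma >= Epistemic.BELIEVE


def diverge(history: List[Epistemic], stuck_window: int) -> bool:
    """Diverge = (sigma = doubt) and (delta sigma = 0) and (i >= n).

    history holds one level per completed cycle, so i = len(history).
    """
    i = len(history)
    if i < 2:
        return False  # no previous cycle, so delta sigma is undefined
    sigma, previous = history[-1], history[-2]
    delta = sigma - previous
    return sigma == Epistemic.DOUBT and delta == 0 and i >= stuck_window
```

An executor following this reading would test `converge` first after each deliberation and only apply the on_stuck policy once `diverge` holds.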

Budget Composition. Budget constraints compose from the IR into the runtime as a 4-tuple verified at compile time:

B(agent) = (max_iter, max_tokens, max_time, max_cost)

Terminate when: ∃ b ∈ B(agent) : consumed(b) ≥ limit(b)

The compiler rejects agents with unbounded budgets (max_iterations = 0 without an explicit on_stuck policy), preventing runaway execution by construction.
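
The termination rule is an "any limit reached" check over the 4-tuple. A minimal Python sketch follows, assuming that an undeclared limit is modeled as `None` (the text does not say how AXON represents a missing limit). `Budget`, `Consumed`, and `should_terminate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Budget:
    """Declared limits; None means the limit was not declared (assumption)."""
    max_iter: Optional[int] = None
    max_tokens: Optional[int] = None
    max_time: Optional[float] = None  # seconds
    max_cost: Optional[float] = None  # currency units


@dataclass
class Consumed:
    """Resources used so far by one agent run."""
    iterations: int = 0
    tokens: int = 0
    seconds: float = 0.0
    cost: float = 0.0


def should_terminate(budget: Budget, used: Consumed) -> bool:
    """True when some declared limit b satisfies consumed(b) >= limit(b)."""
    pairs = [
        (used.iterations, budget.max_iter),
        (used.tokens, budget.max_tokens),
        (used.seconds, budget.max_time),
        (used.cost, budget.max_cost),
    ]
    return any(limit is not None and spent >= limit for spent, limit in pairs)
```

With the limits used by the MarketResearcher agent below (15 iterations, 50000 tokens, 2.50 cost), the check stays false until any one of the three counters reaches its limit.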

Strategy Dispatch. The strategy parameter selects the BDI loop variant at compile time. Each strategy maps to a specific deliberation/action sequence:

Λ : Strategy → CycleShape

Λ(react)            = Deliberate → Act → Observe
Λ(reflexion)        = Deliberate → Act → Observe → Reflect
Λ(plan_and_execute) = Plan → (Act → Observe)* → Verify
Λ(custom)           = user-defined step sequence
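
One way to picture the dispatch is a table from strategy name to an ordered tuple of phases. The Python sketch below covers only the two fixed-shape strategies. plan_and_execute (which repeats Act → Observe) and custom are left out for brevity. `CYCLE_SHAPES`, `run_cycle`, and `make_recorder` are hypothetical names, and the handlers are stand-ins that only record the phase order.

```python
from typing import Any, Callable, Dict, Tuple

# Fixed-shape strategies only; phase names follow the table above.
CYCLE_SHAPES: Dict[str, Tuple[str, ...]] = {
    "react": ("deliberate", "act", "observe"),
    "reflexion": ("deliberate", "act", "observe", "reflect"),
}


def run_cycle(strategy: str, handlers: Dict[str, Callable[[Any], Any]], state: Any) -> Any:
    """Run one cycle by threading the state through each phase in order."""
    for phase in CYCLE_SHAPES[strategy]:
        state = handlers[phase](state)
    return state


def make_recorder(phase: str) -> Callable[[list], list]:
    """Build a stand-in handler that appends its phase name to the state."""
    return lambda state: state + [phase]


RECORDERS = {
    phase: make_recorder(phase)
    for phase in ("deliberate", "act", "observe", "reflect")
}

if __name__ == "__main__":
    print(run_cycle("reflexion", RECORDERS, []))
```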

Usage example — Autonomous Research Agent:

persona ResearchAnalyst {
    domain: ["market research", "competitive analysis"]
    tone: analytical
    confidence_threshold: 0.85
}

tool WebSearch {
    provider: serper
    timeout: 10s
}

tool DataAnalyzer {
    provider: internal
    timeout: 30s
}

agent MarketResearcher {
    goal: "Produce a comprehensive competitive analysis report
           with verified data from at least 5 sources"
    tools: [WebSearch, DataAnalyzer]
    strategy: react
    max_iterations: 15
    max_tokens: 50000
    max_cost: 2.50
    on_stuck: forge
    return: CompetitiveReport
}

flow CompetitiveIntelligence(sector: String) -> CompetitiveReport {
    step Research {
        MarketResearcher(sector)
        output: CompetitiveReport
    }
}

run CompetitiveIntelligence("electric vehicles")
    with ResearchAnalyst

What the compiler does:

  1. IR Generation: the agent block compiles to an IRAgent node containing goal, tools, budget (15 iter / 50k tokens / $2.50), strategy (react), and recovery policy (forge). The IRAgent is embedded as a step inside IRFlow, preserving compositional semantics.
  2. Backend Compilation: the backend (Anthropic, Gemini) generates a CompiledStep with step_name: "agent:MarketResearcher" and full agent metadata in its metadata["agent"] dictionary. The system prompt includes persona traits, tool availability, and epistemic constraints.
  3. Runtime Execution: the executor detects the agent: prefix and dispatches to the BDI loop. Each cycle: deliberate (epistemic assessment via JSON), act (execute step or invoke tool), observe (update beliefs). The loop respects the budget 4-tuple and applies on_stuck when Diverge fires.
  4. Trace Events: every BDI cycle emits STEP_START, MODEL_CALL, and STEP_END trace events, giving full observability into the agent's reasoning trajectory.

Why this matters: The agent is not a Python class that wraps while True. It is a compiled cognitive primitive — the compiler verifies its budget boundedness, the type checker validates its return type, the backend generates strategy-specific prompts, and the runtime executes a formally-defined BDI loop with epistemic convergence criteria. This is the difference between duct-taping an LLM into a loop and engineering an autonomous system with mathematical guarantees.

Agent Use Case 1: Autonomous Legal Research Agent

A law firm deploys an agent that autonomously researches case law until it finds sufficient precedent — or exhausts its budget and escalates to a human attorney:

agent CaseLawResearcher {
    goal: "Find 3+ relevant precedents for the contract dispute
           with verified court citations"
    tools: [WebSearch, PDFExtractor]
    strategy: reflexion
    max_iterations: 20
    max_cost: 5.00
    on_stuck: escalate
    return: CaseLawReport
}
  • reflexion strategy adds self-critique after each cycle: the agent evaluates whether its found precedents are truly relevant, not just keyword matches
  • on_stuck: escalate means if the agent doubts its findings after 20 cycles, it raises AgentStuckError with full context, so the human reviews exactly where the agent got stuck
  • Budget cap of $5.00 prevents runaway API costs; the compiler guarantees termination

Agent Use Case 2: Multi-Agent Data Pipeline

A BI platform chains two agents: one gathers data, the other analyzes it. Both execute within the same compiled flow:

agent DataGatherer {
    goal: "Collect quarterly revenue data from public filings"
    tools: [WebSearch, FileReader]
    strategy: react
    max_iterations: 10
    on_stuck: retry
    return: DataSet
}

agent TrendAnalyzer {
    goal: "Identify year-over-year growth patterns and anomalies"
    tools: [Calculator, DataAnalyzer]
    strategy: plan_and_execute
    max_iterations: 8
    on_stuck: forge
    return: TrendReport
}

flow QuarterlyIntelligence(sector: String) -> TrendReport {
    step Gather { DataGatherer(sector) output: DataSet }
    step Analyze { TrendAnalyzer(Gather.output) output: TrendReport }
}
  • Two agents, two strategies: react for data gathering (fast, tool-heavy), plan_and_execute for analysis (structured, plan-then-verify)
  • Each agent has independent budget tracking: if DataGatherer costs $0.50, TrendAnalyzer still has its full budget
  • If TrendAnalyzer gets stuck, forge triggers creative re-seeding via the Poincaré pipeline, generating novel analytical angles

Agent Use Case 3: Customer Onboarding Agent with Dynamic Recovery

A SaaS platform uses an agent to guide new customers through a personalized onboarding flow, adapting when it gets stuck:

persona OnboardingSpecialist {
    domain: ["product knowledge", "user experience"]
    tone: warm
    confidence_threshold: 0.80
}

agent OnboardingGuide {
    goal: "Complete the customer's onboarding checklist with
           personalized recommendations for their industry"
    tools: [APICall, Calculator]
    strategy: custom
    max_iterations: 12
    max_tokens: 30000
    on_stuck: forge
    return: OnboardingReport

    step Greet { ask: "Welcome the user and assess their goals" }
    step Configure { ask: "Recommend workspace configuration" }
    step Train { ask: "Generate personalized tutorial sequence" }
}
  • custom strategy: the agent follows a user-defined step sequence (Greet → Configure → Train), not a generic loop
  • on_stuck: forge means that if the agent can't personalize recommendations (e.g., unknown industry), it triggers creative synthesis to propose novel onboarding paths instead of failing
  • The return: OnboardingReport type is validated by the semantic type checker, so the agent must produce a structurally valid report, not just free text

VI. Compile-Time Security — the shield Primitive

AXON introduces an eighth paradigm shift: Information Flow Control (IFC) as a first-class compiled construct, providing compile-time security guarantees against LLM-specific attack vectors.

Every LLM framework treats security as an afterthought — runtime guardrails bolted on top of applications. AXON's shield primitive makes security a compiler-verified property of your program, grounded in taint analysis and Information Flow Control theory.

Trust Lattice (Denning-style IFC). The shield system operates over a trust lattice where data flows from untrusted sources through shield application points to trusted sinks. The compiler statically verifies that every path from an untrusted source to a trusted sink passes through at least one shield:

U : DataLabel → TrustLevel

TrustLevel = Untrusted < Scanned < Sanitized < Trusted

∀ path(source, sink) ∈ Flow :
  label(source) = Untrusted ∧ label(sink) = Trusted
  → ∃ shield ∈ path : label(shield.output) ≥ Sanitized
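
The path condition can be checked with a depth-first walk over the data-flow graph. The Python sketch below is a simplified model, not AXON's taint analyzer. It assumes the flow graph is given as an adjacency dictionary, that data nodes carry a trust label, and that shield nodes carry an output label. `Trust` and `unshielded_paths` are hypothetical names.

```python
from enum import IntEnum
from typing import Dict, List


class Trust(IntEnum):
    """Trust levels in the order given above."""
    UNTRUSTED = 0
    SCANNED = 1
    SANITIZED = 2
    TRUSTED = 3


def unshielded_paths(
    edges: Dict[str, List[str]],
    labels: Dict[str, Trust],
    shield_outputs: Dict[str, Trust],
) -> List[List[str]]:
    """Return each Untrusted-to-Trusted path with no shield output >= Sanitized."""
    violations: List[List[str]] = []

    def walk(node: str, path: List[str], shielded: bool) -> None:
        path = path + [node]
        if shield_outputs.get(node, Trust.UNTRUSTED) >= Trust.SANITIZED:
            shielded = True
        if labels.get(node) == Trust.TRUSTED and not shielded:
            violations.append(path)
        for successor in edges.get(node, []):
            if successor not in path:  # guard against cycles
                walk(successor, path, shielded)

    for node, label in labels.items():
        if label == Trust.UNTRUSTED:
            walk(node, [], False)
    return violations
```

An empty result means every path satisfies the condition. Each returned list is one offending path that a compiler could report as a diagnostic.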

Threat Taxonomy. The scan field declares which threats the shield detects, drawn from a formal taxonomy of 11 LLM attack categories:

T = { prompt_injection, jailbreak, data_exfil, pii_leak, toxicity,
      bias, hallucination, code_injection, social_engineering,
      model_theft, training_poisoning }

Detection Strategies. The strategy parameter selects the detection mechanism, each with different cost/accuracy tradeoffs:

Σ : Strategy → (Cost, Accuracy, Latency)

Σ(pattern)     = (low,    medium, fast)     — regex/heuristic scan
Σ(classifier)  = (medium, high,   medium)   — fine-tuned classifier (Llama Guard)
Σ(dual_llm)    = (high,   highest, slow)    — privileged/quarantined model pair
Σ(canary)      = (low,    medium, fast)     — traceable token injection
Σ(perplexity)  = (medium, high,   medium)   — statistical anomaly detection
Σ(ensemble)    = (high,   highest, slow)    — majority voting across multiple strategies

Capability Enforcement. The compiler statically verifies that agent tool access is a subset of the shield's allow list — preventing privilege escalation at compile time:

∀ agent A with shield S :
  tools(A) ⊆ allow_tools(S)    — verified at compile time
  tools(A) ∩ deny_tools(S) = ∅  — also verified
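
Both conditions are plain set operations. A minimal Python sketch of the check follows, assuming tools are identified by name. `check_capabilities` is a hypothetical helper, not an AXON API.

```python
from typing import Iterable, List


def check_capabilities(
    agent_tools: Iterable[str],
    allow: Iterable[str],
    deny: Iterable[str],
) -> List[str]:
    """Return a list of violations; an empty list means the agent passes."""
    tools, allowed, denied = set(agent_tools), set(allow), set(deny)
    errors: List[str] = []
    not_allowed = tools - allowed  # violates tools(A) ⊆ allow_tools(S)
    if not_allowed:
        errors.append("not in allow list: " + ", ".join(sorted(not_allowed)))
    forbidden = tools & denied     # violates tools(A) ∩ deny_tools(S) = ∅
    if forbidden:
        errors.append("in deny list: " + ", ".join(sorted(forbidden)))
    return errors
```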

Usage example — LLM Input Shield:

shield InputGuard {
    scan: [prompt_injection, jailbreak, pii_leak]
    strategy: dual_llm
    on_breach: halt
    severity: critical
    allow: [web_search, calculator]
    deny: [code_executor]
    sandbox: true
    redact: [email, phone]
    confidence_threshold: 0.85
}

persona SecureAssistant {
    domain: ["customer support"]
    tone: professional
    confidence_threshold: 0.80
}

agent SecureBot {
    goal: "Answer customer queries safely"
    tools: [web_search, calculator]
    shield: InputGuard
    strategy: react
    max_iterations: 10
    return: SafeResponse
}

flow SecureSupport(query: String) -> SafeResponse {
    shield InputGuard on query -> SanitizedQuery
    step Process {
        SecureBot(SanitizedQuery)
        output: SafeResponse
    }
}

run SecureSupport("Help me with my account")
    with SecureAssistant

What the compiler does:

  1. Type Checking: validates all scan categories, strategies, breach policies, severity levels, and confidence thresholds. Detects allow/deny overlaps and invalid configurations at compile time.
  2. Capability Enforcement: verifies that SecureBot only uses [web_search, calculator], which are in InputGuard.allow, and that neither appears in deny. If SecureBot tried to use code_executor, the compiler would reject the program.
  3. Taint Analysis: verifies that query (untrusted) passes through shield InputGuard on query before reaching the agent's trusted context.
  4. Runtime Execution: the shield step emits SHIELD_SCAN_START, scans for prompt injection/jailbreak/PII, and either passes (SHIELD_SCAN_PASS) or raises ShieldBreachError (SHIELD_SCAN_BREACH).

Shield Use Case 1: Financial Data Pipeline with PII Redaction

shield DataShield {
    scan: [pii_leak, data_exfil]
    strategy: classifier
    on_breach: sanitize_and_retry
    max_retries: 3
    severity: high
    redact: [ssn, credit_card, bank_account]
}

flow ProcessFinancialQuery(input: String) -> Report {
    shield DataShield on input -> CleanInput
    step Analyze {
        given: CleanInput
        ask: "Analyze the financial data"
        output: Report
    }
}
  • PII fields (SSN, credit card, bank account) are auto-redacted before the LLM sees the data
  • sanitize_and_retry means detected threats are cleaned and re-scanned up to 3 times, not just blocked

  • The compiler guarantees the LLM never processes raw PII

Shield Use Case 2: Multi-Agent System with Capability Isolation

shield ResearchShield {
    scan: [data_exfil, model_theft]
    strategy: ensemble
    on_breach: quarantine
    allow: [web_search, file_reader]
    deny: [code_executor, api_call]
    sandbox: true
}

agent Researcher {
    goal: "Gather market intelligence from public sources"
    tools: [web_search, file_reader]
    shield: ResearchShield
    strategy: reflexion
    max_iterations: 15
    return: IntelligenceReport
}
  • ensemble strategy runs multiple detectors with majority voting, giving the highest accuracy for sensitive operations
  • sandbox: true runs tool execution in an isolated environment
  • Capability enforcement: the compiler rejects any agent that tries to use code_executor or api_call, preventing privilege escalation by design
  • quarantine breach policy isolates suspicious data for human review instead of blocking operations

VII. Epistemic Tool Fortification — Streaming, Effects & Blame Semantics

AXON introduces a ninth paradigm shift: formal epistemic control over tool invocations, streaming outputs, and foreign-function interfaces — backed by algebraic effect theory, coinductive stream semantics, and Findler-Felleisen blame calculus. The stream primitive decouples pure deliberation from the I/O mechanism.

The Hard Argument (Computational Decoupling)

In everyday software engineering, Python generators (yield and async for) have become the standard for data streaming. From the standpoint of formal language theory and category theory, however, this approach has a structural flaw: it couples "deliberation" (data generation) to the "I/O mechanism" (transmission). AXON resolves this by applying Algebraic Effects and Handlers to streaming. The stream primitive no longer executes I/O; it yields a pure effect (YieldChunk(data)), suspending the continuation k. An external Handler (e.g., SSEHandler) intercepts the effect, executes the I/O side effect, and then resumes k. This decoupling keeps the generative core functionally pure and independently testable.
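
The effect-and-handler split can be emulated in a few lines of Python. This is an illustration of the idea only, not AXON's runtime. It uses a generator as a one-shot continuation: the generator never performs I/O and only yields effect values for a separate driver to interpret. `deliberate`, `run_with_handler`, and this `MockHandler` class are hypothetical; `YieldChunk` borrows the effect name from the paragraph above.

```python
from dataclasses import dataclass
from typing import Callable, Generator, Iterable, List


@dataclass(frozen=True)
class YieldChunk:
    """A pure description of 'emit this chunk'; performs no I/O itself."""
    data: str


def deliberate(tokens: Iterable[str]) -> Generator[YieldChunk, None, int]:
    """Generative core: yields effects and returns the number of chunks."""
    count = 0
    for token in tokens:
        yield YieldChunk(token)  # suspend here until the driver resumes us
        count += 1
    return count


def run_with_handler(
    computation: Generator[YieldChunk, None, int],
    handle: Callable[[YieldChunk], None],
) -> int:
    """Drive the computation: pass each effect to the handler, then resume."""
    try:
        effect = next(computation)
        while True:
            handle(effect)                   # the only place I/O may happen
            effect = computation.send(None)  # resume the continuation
    except StopIteration as stop:
        return stop.value


class MockHandler:
    """In-memory handler for tests: accumulates chunks instead of sending them."""

    def __init__(self) -> None:
        self.chunks: List[str] = []

    def __call__(self, effect: YieldChunk) -> None:
        self.chunks.append(effect.data)
```

Swapping `MockHandler` for a handler that writes to a socket or a file changes where the chunks go without touching `deliberate`.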

The Sweet Argument (Why it's awesome)

Imagine writing streaming logic without ever worrying about the HTTP connection! With the renewed stream primitive, your AI agents don't "push bytes"—they express pure conceptual intentions. You just write your LLM generation logic in the cleanest way possible. Want to switch from Server-Sent Events (SSE) to WebSockets, or maybe just log to a file? The agent code doesn't change a single character! You simply swap the Handler. Your codebase becomes incredibly pristine, blazingly fast to test, and theoretically invincible. It makes streaming feel like pure magic backed by hardcore category theory.

Real-World Use Cases

  1. Agentic Server-Sent Events (SSE): Stream an agent's intermediate "thoughts" and reasoning steps directly to a React frontend in real-time. If the client drops the connection, the handler manages the disconnection gracefully without crashing the agent's pure deliberation cycle.
  2. Multi-Channel Orchestration: A single stream computation can be intercepted by a composite handler that simultaneously prints chunks to a CLI, broadcasts to an SSE channel, and persists the flow to a Redis database—all while the business logic remains fully unaware of these I/O burdens.
  3. Deterministic Testing Pipelines: In your CI/CD pipelines, the I/O handler can be instantly swapped out for a MockHandler that accumulates chunks synchronously in memory. This eliminates flaky network-bound streaming tests entirely, allowing you to test complex LLM streaming flows in microseconds.

Every LLM framework treats tool calls as black boxes: a function returns a string, and the framework trusts it unconditionally. Streaming is even worse — partial tokens arrive without any notion of confidence, reliability, or epistemic state. AXON solves this by making every interaction with the external world subject to formal epistemic tracking.

Formal Model — Four Convergence Theorems

CT-1: Coinductive Semantic Streaming. A streaming response is a coinductive process — an infinite observation/transition pair that monotonically accumulates epistemic confidence as chunks arrive:

Stream(τ) = νX. (StreamChunk × EpistemicState × X)

where
  StreamChunk    = (content: String, index: ℕ, timestamp: ℝ)
  EpistemicState = (level ∈ {doubt, speculate, believe, know}, confidence ∈ [0,1])
  ν              = greatest fixpoint (coinduction — process unfolds indefinitely)

Monotonicity invariant:
  ∀ i < j : gradient(chunkᵢ) ⊑ gradient(chunkⱼ)
  (epistemic level can only rise, never degrade during streaming)

Streaming in AXON is not "tokens arriving". It is a formal epistemic process: each chunk carries its position on the lattice, and the system guarantees that confidence can only increase monotonically until convergence.

CT-2: Algebraic Effect Rows. Every tool declares its computational effects using Plotkin & Pretnar's algebraic effect theory. The compiler statically verifies effect compatibility:

EffectRow(tool) = ⟨ε₁, ε₂, ..., εₙ, epistemic:level⟩

where
  εᵢ ∈ {pure, io, network, storage, random}
  level ∈ {know, believe, speculate, doubt}

Composition rule:
  EffectRow(A ∘ B) = EffectRow(A) ∪ EffectRow(B)
  epistemic(A ∘ B) = min(epistemic(A), epistemic(B))   — meet on lattice

The composition rule means: if you chain a network + speculate tool with a pure + know tool, the combined effect is network + speculate — the system automatically tracks the least trustworthy component.
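
A literal Python reading of the composition rule is sketched below. It assumes the level order doubt < speculate < believe < know. `EffectRow` and `compose` are hypothetical names. Because the rule is a plain union, the composed row in this sketch also keeps the `pure` label. Whether a real implementation drops `pure` once other effects are present is not stated in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet

LEVELS = ("doubt", "speculate", "believe", "know")  # weakest to strongest


@dataclass(frozen=True)
class EffectRow:
    """Effect labels of a tool plus its epistemic level."""
    effects: FrozenSet[str]
    epistemic: str


def compose(a: EffectRow, b: EffectRow) -> EffectRow:
    """Union the effect labels and take the lattice meet of the levels."""
    weakest = min(a.epistemic, b.epistemic, key=LEVELS.index)
    return EffectRow(a.effects | b.effects, weakest)


if __name__ == "__main__":
    web = EffectRow(frozenset({"network"}), "speculate")
    calc = EffectRow(frozenset({"pure"}), "know")
    print(compose(web, calc).epistemic)
```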

CT-3: Blame Semantics for FFI. External tool calls are wrapped in Findler-Felleisen contract monitors that assign blame when pre/postconditions fail:

ContractMonitor(tool) = (Pre, Post, Blame)

where
  Pre  : Input → Bool         — caller's obligation
  Post : Output → Bool        — server's obligation
  Blame : {CALLER, SERVER}    — who violated the contract

Blame assignment:
  ¬Pre(input)   → Blame = CALLER   (you sent bad data)
  ¬Post(output) → Blame = SERVER   (tool returned bad data)

This is not error handling — this is formal accountability. When a tool fails, AXON tells you who broke the contract, not just that it broke.
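
The blame rule can be sketched as a small decorator. This is not AXON's `@contract_tool`. `monitored`, `ContractViolation`, and the two example tools are hypothetical and exist only to show how a failed precondition is attributed to the caller and a failed postcondition to the server.

```python
import functools
import math
from enum import Enum


class Blame(Enum):
    CALLER = "caller"
    SERVER = "server"


class ContractViolation(Exception):
    """Raised when a pre- or postcondition fails; carries the blamed party."""

    def __init__(self, blame: Blame, detail: str) -> None:
        super().__init__(f"{blame.name}: {detail}")
        self.blame = blame


def monitored(pre, post):
    """Wrap a tool so that contract failures are attributed to a party."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            if not pre(*args, **kwargs):
                raise ContractViolation(Blame.CALLER, "precondition failed")
            result = tool(*args, **kwargs)
            if not post(result):
                raise ContractViolation(Blame.SERVER, "postcondition failed")
            return result
        return wrapper
    return decorator


@monitored(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def square_root(x: float) -> float:
    return math.sqrt(x)


@monitored(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def broken_root(x: float) -> float:
    return -1.0  # deliberately violates the postcondition
```

Calling `square_root(-1.0)` raises with `Blame.CALLER`, while `broken_root(4.0)` raises with `Blame.SERVER`, even though both calls fail.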

CT-4: Epistemic Inference via CSP. The @csp_tool decorator automatically infers the epistemic level of any Python function by analyzing its effect footprint using a constraint-satisfaction heuristic:

Infer(f) : Function → EpistemicLevel

  If ∄ io/network/random ∈ effects(f) → know
  If ∃ network ∈ effects(f)           → speculate
  If ∃ random ∈ effects(f)            → doubt
  Otherwise                           → believe
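
Read top to bottom, the rules form a first-match decision list. The Python sketch below assumes exactly that evaluation order and takes the effect footprint as an already computed set of labels. How `@csp_tool` derives the footprint is not shown here. `infer_level` is a hypothetical name.

```python
from typing import Iterable


def infer_level(effects: Iterable[str]) -> str:
    """Apply the four inference rules in the order they are listed."""
    footprint = set(effects)
    if not footprint & {"io", "network", "random"}:
        return "know"
    if "network" in footprint:
        return "speculate"
    if "random" in footprint:
        return "doubt"
    return "believe"
```

Under this ordering a footprint containing both network and random resolves to speculate because the network rule matches first. A more conservative reading would return doubt for that case.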

What Makes This Revolutionary

No LLM framework in existence tracks what a tool does to your epistemic state. LangChain, CrewAI, AutoGen — they all treat tool results as trusted strings. This means:

  • A web search result (unreliable) gets the same trust as a database query (reliable)
  • A streaming response's first token gets the same trust as the final, validated output
  • When a tool fails, you don't know if your input was wrong or the tool was broken

AXON solves all three. The compiler guarantees that:

  1. Every tool call is tagged with its effect signature and epistemic level
  2. Streaming outputs start at doubt and can only ascend monotonically
  3. Tool failures carry blame labels that identify the responsible party
  4. Data crossing the FFI boundary is automatically tainted: it cannot reach know level without passing through a shield or anchor

Use Case 1: Real-Time Financial Streaming with Epistemic Gradient

A trading desk receives streaming market data and needs to distinguish between real-time quotes (speculative) and confirmed trades (factual):

tool MarketFeed {
    provider: bloomberg
    timeout: 5s
    effects: <io, network, epistemic:speculate>
}

flow MonitorMarket(sector: String) -> MarketReport {
    step Stream {
        stream<QuoteData> {
            on_chunk: {
                probe chunk for [symbol, price, volume]
                output: QuoteSnapshot
            }
            on_complete: {
                validate QuoteSnapshot against: MarketSchema
                output: VerifiedQuote
            }
        }
    }
    step Analyze {
        reason {
            given: Stream.output
            ask: "Identify anomalous price movements"
            depth: 2
        }
        output: MarketReport
    }
}
  • Each streaming chunk starts at doubt: the system treats partial data as unreliable by default
  • on_complete handler validates and promotes to believe: only complete, schema-validated data upgrades
  • The effects: <io, network, epistemic:speculate> declaration means the compiler knows this tool is never factual, preventing accidental know-level assertions from market data

Use Case 2: Multi-Tool Research Agent with Blame Tracking

A research agent uses multiple tools with different reliability levels. When something fails, the system identifies exactly who broke the contract:

tool WebSearch {
    provider: serper
    timeout: 10s
    effects: <network, epistemic:speculate>
}

tool DatabaseQuery {
    provider: internal
    timeout: 30s
    effects: <io, epistemic:believe>
}

tool Calculator {
    provider: stdlib
    effects: <pure, epistemic:know>
}

flow DeepResearch(question: String) -> ResearchReport {
    par {
        step Web {
            use_tool WebSearch with query: question
            output: WebResults
        }
        step DB {
            use_tool DatabaseQuery with query: question
            output: DBResults
        }
    }
    step Synthesize {
        weave [Web.output, DB.output]
        output: ResearchReport
    }
}
  • WebSearch is epistemic:speculate, so the compiler knows web results are unreliable and automatically taints downstream data
  • DatabaseQuery is epistemic:believe, which is more reliable but still not know because external I/O is involved
  • Calculator is pure + epistemic:know (no side effects, deterministic, fully trustworthy)
  • When weave combines them, the result's epistemic level is min(speculate, believe) = speculate; the weakest link determines trust
  • If WebSearch returns garbage, the ContractMonitor issues Blame = SERVER with full diagnostic context

Use Case 3: Safe External API Integration with @contract_tool

A production system integrates a third-party payment API. The @contract_tool decorator wraps it with pre/postcondition contracts and automatic epistemic downgrade:

from axon.runtime.tools import contract_tool

@contract_tool(
    pre=lambda amount, currency: amount >
