ControlKeel
     
Turn the way your team works into enforceable memory for AI agents. - @arya_minus
ControlKeel is an agent control plane for day-to-day governed engineering. Through observation, findings and evaluation, it learns your intent rules, review taste and delivery habits, turning them into typed memory, policy checks and proof bundles. CK sits between your coding agents and production as a portable "company brain": comparing intended delivery against actual delivery and turning raw agent intent into policy-validated tasks.
If you're using an AI agent today, you probably have an .md file telling it how to behave. But a rules/specs file is just a promise made to the model. ControlKeel enforces the output. Beyond just catching bugs, CK solves the "Unknown Unknowns" problem: having to re-explain your domain knowledge in every single session.
Product loop
- Capture intent and policy — scope, risk, budget, domain pack, and human taste become CK state.
- Validate agent output — deterministic checks and optional advisory review produce findings before risky work reaches main.
- Gate only when needed — humans approve high-impact actions when intent, risk, or policy requires it.
- Persist evidence — findings, reviews, proofs, memory, cost, and task outcomes survive host switches.
- Improve with evals — traces and recurring failures become bounded regression evidence for specific suites and subjects.
ControlKeel transforms your domain knowledge from "raw" intent and "shelfware" documentation into a living system that remembers, enforces, and evolves.
Quick start
One-line setup via your agent
Copy/paste this into your agent (OpenCode, Codex, Claude, or another supported host):
Set up ControlKeel for this repository. Read and follow https://raw.githubusercontent.com/aryaminus/controlkeel/main/README.md, https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/getting-started.md, https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/support-matrix.md, and https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/agent-integrations.md. Install ControlKeel if missing, run `controlkeel setup`, detect this agent host, attach the strongest supported path with `controlkeel attach <host>`, then run `controlkeel attach doctor`, `controlkeel provider doctor`, `controlkeel status`, `controlkeel findings`, and the host-native MCP check. If CK is available only as MCP, call `ck_attach` for this host. Apply only safe local fixes and redact secrets from logs. Pause and ask before continuing if the host needs workspace trust, manual provider configuration, a restart after attach/plugin changes, or a plan-review approval that cannot auto-wait. Ensure the project is trusted and restart the host after attach/plugin changes.
CLI install
Install the CLI:
brew tap aryaminus/controlkeel && brew install controlkeel
# or
npm i -g @aryaminus/controlkeel
# or
curl -fsSL https://github.com/aryaminus/controlkeel/releases/latest/download/install.sh | sh
Windows PowerShell:
irm https://github.com/aryaminus/controlkeel/releases/latest/download/install.ps1 | iex
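If you script the install on macOS or Linux, a small helper can print which of the three commands above applies to the current machine. This is only a sketch: the function name `ck_install_hint` and the brew, npm, curl preference order are assumptions made for illustration, not something ControlKeel prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not part of ControlKeel): print the install command
# from this README that matches the first installer tool found on PATH.
# The brew > npm > curl order is an assumption made for this sketch.
ck_install_hint() {
  if command -v brew >/dev/null 2>&1; then
    echo 'brew tap aryaminus/controlkeel && brew install controlkeel'
  elif command -v npm >/dev/null 2>&1; then
    echo 'npm i -g @aryaminus/controlkeel'
  elif command -v curl >/dev/null 2>&1; then
    echo 'curl -fsSL https://github.com/aryaminus/controlkeel/releases/latest/download/install.sh | sh'
  else
    echo 'no supported installer (brew, npm, curl) found on PATH' >&2
    return 1
  fi
}
```

The helper only prints a command. Review the output and run it yourself, which also gives you a chance to inspect a piped install script before executing it.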
First governed run:
controlkeel
controlkeel setup
controlkeel attach opencode # or another supported host
controlkeel attach doctor
controlkeel provider doctor
controlkeel status
controlkeel findings
For the complete first-run path, see docs/getting-started.md. For the authoritative host support status, see docs/support-matrix.md and docs/agent-integrations.md.
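The first-run commands above can also be wrapped so that a failing step stops the sequence. The following is a minimal sketch under stated assumptions: `ck_first_run` and the `CK_BIN` override are hypothetical names introduced here, and only subcommands already listed in this README are used. The bare `controlkeel` invocation is left out because its behavior in a non-interactive script is not described here.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with ControlKeel): run the first-run
# commands from this README in order and stop at the first one that fails.
# CK_BIN is an assumed override for dry runs; it defaults to controlkeel.
ck_first_run() {
  host="${1:-opencode}"          # host name passed to `controlkeel attach`
  ck="${CK_BIN:-controlkeel}"

  if ! command -v "$ck" >/dev/null 2>&1; then
    echo "$ck not found on PATH; install the CLI first" >&2
    return 127
  fi

  for step in "setup" "attach $host" "attach doctor" \
              "provider doctor" "status" "findings"; do
    echo "==> $ck $step" >&2
    # $step is left unquoted on purpose so "attach doctor" becomes two arguments.
    "$ck" $step || { echo "failed: $ck $step" >&2; return 1; }
  done
}
```

After sourcing the file, call `ck_first_run codex` (or another supported host). Setting `CK_BIN=echo` before the call prints each step instead of running it, which is a cheap way to check the sequence.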
Benchmark-backed evidence
ControlKeel includes a persisted benchmark engine. Current user-facing evidence is bounded to the named suite, subject, and scoring definition below; docs/benchmarks.md is the canonical reference for full tables, caveats, JSON exports, and agent-host protocols.
Verified with-vs-without-CK baseline (host_comparison_v1, 12 risky scenarios)
Verified with ControlKeel 0.3.45:
- Risky suite host_comparison_v1: ungoverned_baseline caught 0/12; controlkeel_validate caught 12/12, blocked 9/12, and hit expected rules 9/12, with median deterministic validation time 52 ms, 0 provider tokens.
- Paired benign suite benign_baseline_v1: controlkeel_validate produced 0/10 catches, 0/10 blocks, FPR 0.000, median deterministic validation time 42 ms, 0 provider tokens.
Read the numbers precisely: deterministic scanner evidence is not the same as model-backed agent-host evidence. Reproduction commands and the OpenCode/Copilot/Claude/Codex comparison protocol live in docs/benchmarks.md.
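To make the quoted ratios easy to check, the raw counts above can be turned into rates with a few lines of shell. This is a sketch only: the `rate` helper is a hypothetical name, and treating FPR as "catches on the benign suite divided by benign scenarios" is an assumption. docs/benchmarks.md remains the canonical scoring definition.

```shell
#!/bin/sh
# Hypothetical helper: rate NUM DEN prints NUM/DEN with three decimals,
# or "n/a" when DEN is 0.
rate() {
  awk -v n="$1" -v d="$2" 'BEGIN {
    if (d == 0) { print "n/a"; exit }
    printf "%.3f\n", n / d
  }'
}

# Counts copied from the host_comparison_v1 and benign_baseline_v1 results.
echo "risky catch rate (ungoverned_baseline):  $(rate 0 12)"
echo "risky catch rate (controlkeel_validate): $(rate 12 12)"
echo "risky block rate (controlkeel_validate): $(rate 9 12)"
# Assumption: FPR = catches on the benign suite / benign scenarios.
echo "benign FPR (controlkeel_validate):       $(rate 0 10)"
```

Under that assumption the benign line reproduces the reported FPR of 0.000, and the block rate of 9/12 comes out as 0.750, which makes the gap between "caught" and "blocked" explicit.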
What ships today
- Local governance: CLI, stdio MCP, project binding, host attach/export bundles, scanner validation, findings, reviews, proof bundles, budgets, and typed memory.
- Host and runtime support: native attach for supported hosts, runtime exports for headless/outer-loop systems, hosted MCP/minimal A2A, and fallback validation/proxy paths.
- Team/project operations: org membership, invitations, OIDC/SAML auth surfaces, workspace GitHub repo bindings, service accounts, webhooks, workspace tool policy, and policy-set APIs.
- Cloud evidence paths: opt-in cloud telemetry, workspace keys, cloud run packages, runtime callbacks, and dormant-until-configured bidirectional sync for findings, reviews, digests, and memory records.
- Observability loop: timelines, memory quality, costs, trends, problem clusters, eval candidates, benchmark drafts/history, and promotion advisories.
Docs map
- docs/README.md — documentation map by job
- docs/getting-started.md — install to first finding
- docs/support-matrix.md — canonical host/protocol inventory
- docs/agent-integrations.md — integration mechanisms and support tiers
- docs/benchmarks.md — benchmark scoring, metadata, and claim discipline
- docs/observability-feedback-loop.md — local evidence-to-regression loop
- docs/control-plane-claim-matrix.md — README claim-to-test matrix for governance, memory, cloud sync, and human gates
- docs/api-reference.md and docs/cli-reference.md — code-aligned surfaces
- docs/packages.md — package and distribution catalog
- docs/self-hosting.md — self-host deployment guidance
Development
mix setup
mix phx.server
mix test
mix precommit
Phoenix + Ecto on SQLite. Uses Req for HTTP. Single-binary builds ship through Burrito and GitHub Releases.






