Boyce: Semantic Protocol & Safety Layer for Agentic Database Workflows
<!-- mcp-name: io.github.boyce-io/boyce -->
The semantic safety layer for agentic database workflows. Boyce connects LLMs to live database context with built-in safety rails.
Named for Raymond F. Boyce, co-inventor of SQL (1974) and co-author of Boyce-Codd Normal Form (BCNF).
AI agents querying databases without proper context generate unreliable SQL — working from incomplete schemas, inferring column names, guessing join paths. Boyce gives agents the structured database intelligence they need to generate correct, safe SQL every time — through three interconnected systems:
| Layer | What it does | |---|---| | SQL Compiler | ask_boyce — NL → StructuredFilter → deterministic SQL. Zero LLM in the SQL builder. Same inputs, same SQL, byte-for-byte, every time. | | Database Inspector | query_database / profile_data — Live Postgres/Redshift adapters let your agent see real schema and real data distributions before writing a single filter. | | Query Verification | Pre-flight EXPLAIN loops on every generated query. Bad SQL is caught at planning time, not at 2am in your on-call rotation. |
Why does this matter? → The Null Trap: Your AI Agent's SQL Is Correct. The Answer Is Still Wrong.
---
Install
Requires Python 3.10+
pip install boyce
# With live Postgres/Redshift adapter (enables EXPLAIN pre-flight + column profiling)
pip install "boyce[postgres]"
# uv (recommended)
uv pip install boyce
uv pip install "boyce[postgres]"
From source: ``bash git clone https://github.com/boyce-io/boyce uv pip install -e "boyce/" ``
---
Quickstart
After installing, run boyce init to configure your MCP host automatically:
boyce init
The wizard detects Claude Desktop, Cursor, Claude Code, and JetBrains (DataGrip, IntelliJ, etc.), and writes the correct config block for each.
Developing from source? The repo includes a setup script:
./quickstart.sh # detects uv or python, installs package, writes .env template
---
Configure Your MCP Host
The fastest path is boyce init — it detects your MCP host and writes the config automatically:
boyce init
Or configure manually. There are two setup paths depending on your host:
---
Path 1 — MCP Hosts (No LLM key required)
If you're using Claude Desktop, Cursor, Claude Code, Codex, Cline, Windsurf, JetBrains (DataGrip, IntelliJ), or any MCP-compatible host, you do not need to configure an LLM provider for Boyce. The host's own model handles reasoning — Boyce supplies the schema context and deterministic SQL compiler via get_schema and ask_boyce. Only BOYCE_DB_URL is needed (and even that is optional).
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"boyce": {
"command": "boyce",
"env": {
"BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
}
}
}
}
Cursor (.cursor/mcp.json in project root):
{
"mcpServers": {
"boyce": {
"command": "boyce",
"env": {
"BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
}
}
}
}
---
Path 2 — With Boyce's Built-in NL→SQL
If you're using the CLI (boyce ask), HTTP API, or a non-MCP client (e.g., the VS Code extension), configure Boyce's internal query planner with your LLM provider:
{
"mcpServers": {
"boyce": {
"command": "boyce",
"env": {
"BOYCE_PROVIDER": "anthropic",
"BOYCE_MODEL": "claude-sonnet-4-6",
"ANTHROPIC_API_KEY": "sk-ant-...",
"BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
}
}
}
}
Boyce supports any LLM provider available through LiteLLM: Anthropic, OpenAI, Ollama (local), vLLM (local), Azure, Bedrock, Vertex, Mistral, and more.
---
BOYCE_DB_URL is optional on both paths. Without it, Boyce runs in schema-only mode — SQL generation still works; EXPLAIN pre-flight and live query tools return "status": "unchecked".
---
Environment Variables
| Variable | When needed | Example | Purpose | |---|---|---|---| | BOYCE_PROVIDER | Path 2 only (CLI/HTTP/non-MCP) | anthropic | LiteLLM provider name | | BOYCE_MODEL | Path 2 only (CLI/HTTP/non-MCP) | claude-sonnet-4-6 | Model ID passed to LiteLLM | | ANTHROPIC_API_KEY | When using Anthropic | sk-ant-... | Anthropic credentials | | OPENAI_API_KEY | When using OpenAI | sk-... | OpenAI credentials | | BOYCE_DB_URL | Optional (either path) | postgresql://user:pass@host:5432/db | asyncpg DSN — enables EXPLAIN pre-flight + live query tools | | BOYCE_HTTP_TOKEN | Path 2 HTTP API only | my-secret-token | Bearer token for boyce serve --http | | BOYCE_STATEMENT_TIMEOUT_MS | Optional | 30000 | Per-statement timeout in ms (default: 30s) |
---
MCP Tools
| Tool | Description | |---|---| | ingest_source | Parse a SemanticSnapshot from dbt manifest, dbt project, LookML, DDL, SQLite, Django, SQLAlchemy, Prisma, CSV, or Parquet. | | ingest_definition | Store a certified business definition — injected automatically at query time. | | get_schema | Return full schema context + StructuredFilter format docs. Used by MCP hosts so the host LLM can construct queries without a Boyce API key. | | ask_boyce | Full NL → SQL pipeline: query planner (LiteLLM) → deterministic kernel → NULL trap check → EXPLAIN pre-flight. | | validate_sql | Validate hand-written SQL — EXPLAIN pre-flight, Redshift lint, NULL risk — without executing. | | query_database | Execute a read-only SELECT against the live database. Write operations rejected at two independent layers. | | profile_data | Null %, distinct count, min/max for any column — surface data quality issues before they affect query results. | | check_health | Operational health check — DB connectivity, snapshot freshness, actionable fix commands. Call when queries fail unexpectedly. |
---
Architecture
SemanticSnapshot (JSON)
│
▼ ingest_source
┌─────────────────────────────────────────────┐
│ SemanticGraph (NetworkX) │ ← in-memory, loaded per session
│ nodes = entities (tables/views/dbt models) │
│ edges = joins (weighted by confidence) │
└─────────────────────────────────────────────┘
│ │
▼ ask_boyce ▼ (internal)
QueryPlanner Dijkstra
(LiteLLM) join resolver
NL → StructuredFilter │
│ │
└──────────┬────────────────┘
▼
kernel.process_request() ← ZERO LLM HERE
SQLBuilder (dialect-aware)
│
▼
EXPLAIN pre-flight ← Query Verification
(PostgresAdapter)
│
▼
SQL + validation result
Dialect support: redshift, postgres, duckdb, bigquery
Redshift safety rails (safety.py): Automatic linting for LATERAL, JSONB, REGEXP_COUNT, lookahead regex patterns, and numeric cast rewrites for Redshift 1.0 (PG 8.0.2).
---
Scan CLI
# Scan a single file
boyce scan demo/magic_moment/manifest.json
# Scan a directory (auto-detects all parseable sources)
boyce scan ./my-project/ -v
# Save snapshots for MCP server use
boyce scan ./my-project/ --save
10 parsers: dbt manifest, dbt project, LookML, SQLite, DDL, CSV, Parquet, Django, SQLAlchemy, Prisma.
---
Verify the Install
# Unit tests — no DB required, runs in ~4 seconds
python boyce/tests/verify_eyes.py
# Expected output:
# Ran 15 tests in 3.5s
# OK
# ✅ All checks passed.
---
SemanticSnapshot Format
The ingest_source tool accepts a SemanticSnapshot JSON dict. Minimal example:
{
"snapshot_id": "<sha256>",
"source_system": "dbt",
"entities": {
"entity:orders": {
"id": "entity:orders",
"name": "orders",
"schema": "public",
"fields": ["field:orders:order_id", "field:orders:revenue"]
}
},
"fields": {
"field:orders:order_id": {
"id": "field:orders:order_id",
"entity_id": "entity:orders",
"name": "order_id",
"field_type": "ID",
"data_type": "INTEGER"
}
},
"joins": []
}
See boyce/tests/live_fire/mock_snapshot.json for a complete field/entity example.
---
Project Layout
boyce/ ← PRIMARY — headless FastMCP server + pip package
├── boyce/
│ ├── server.py ← MCP entry point (8 tools)
│ ├── kernel.py ← Deterministic SQL kernel
│ ├── graph.py ← SemanticGraph (NetworkX)
│ ├── safety.py ← Redshift compatibility rails
│ ├── types.py ← Protocol contract (Pydantic)
│ ├── scan.py ← Scan CLI (boyce scan)
│ ├── connections.py ← DSN persistence (ConnectionStore)
│ ├── doctor.py ← Environment diagnostics (boyce doctor)
│ ├── sql/ ← SQLBuilder, dialect layer, join resolver
│ ├── parsers/ ← 10 parsers (dbt, lookml, ddl, sqlite, csv, etc.)
│ ├── planner/ ← QueryPlanner (LiteLLM → StructuredFilter)
│ └── adapters/ ← PostgresAdapter (Eyes)
└── tests/
├── verify_eyes.py ← 15-test suite, no DB required
├── test_parsers.py ← Parser tests (all 10 parsers)
├── test_scan.py ← Scan CLI tests
└── live_fire/ ← Docker Compose integration tests
---
Status
| Capability | Status | |---|---| | NL → SQL (deterministic kernel) | Operational | | SemanticGraph (join resolution) | Operational | | 10 source parsers | Operational | | Scan CLI (boyce scan) | Operational | | PostgresAdapter (read-only) | Operational | | EXPLAIN pre-flight validation | Operational | | NULL Trap detection | Operational | | Redshift 1.0 safety linting | Operational | | Snapshot persistence across restarts | Operational | | Audit logging (append-only JSONL) | Operational | | Business definitions (ingest_definition) | Operational | | DSN persistence (ConnectionStore) | Operational | | Environment diagnostics (boyce doctor / check_health) | Operational | | Multi-snapshot merge | Planned |
---
Support
- Troubleshooting guide: docs/troubleshooting.md
- Local LLM setup (Ollama/vLLM): docs/local-llm-setup.md
- Bug reports: GitHub Issues
- Setup help: GitHub Issues
- Email: will@convergentmethods.com — for issues involving credentials or sensitive config
---
Copyright 2026 Convergent Methods, LLC. MIT License.






