<div align="center">
<img src="assets/wordmark.svg" alt="agentburn — where does your AI agent burn money, while you sleep?" width="420">
<br>
<a href="https://pypi.org/project/agentburn/"><img alt="PyPI" src="https://img.shields.io/pypi/v/agentburn?color=f7775a"></a> <img alt="Python" src="https://img.shields.io/badge/python-3.9%2B-5ab0f7"> <img alt="zero deps" src="https://img.shields.io/badge/dependencies-0-7df0a8"> <img alt="offline checks" src="https://img.shields.io/badge/offline_checks-104-c89bf7"> <a href="LICENSE"><img alt="MIT" src="https://img.shields.io/badge/license-MIT-8a949e"></a>
<br><br>
<img src="assets/demo.svg" alt="uvx agentburn — animated demo: TL;DR verdict, burn bars by source, the overnight bill, and what to change" width="760">
<br>
Hermes Agent · OpenClaw · Claude Code — one normalized core, local, read-only, zero dependencies
uvx agentburn
▶ Try it in your browser — no install
</div>
---
What is this, in plain words?
You run an AI assistant — OpenClaw, Hermes Agent, or Claude Code. It works around the clock: answers you in Telegram, runs scheduled jobs at night, spawns helper agents. Every word it reads and writes costs money — and the bill arrives as one number with no explanation.
agentburn is a free tool that reads your assistant's own diary (log files already sitting on your computer) and turns that number into answers:
| You ask | It answers | Command | |---|---|---| | Where does the money actually go? | scheduled jobs 79% · chats 7% · helpers 5% · you 9% | agentburn | | What happened while I slept? | the overnight bill, isolated and named | agentburn | | Why is it so expensive? | same file read 14×, a broken tool retried 6×, wake-ups that did nothing | agentburn why | | What did it do in Telegram? | every function it called there, with counts and errors | agentburn why --source telegram | | What exactly do I change? | ready config lines + expected saving in dollars | agentburn fix | | Am I paying for a dying model? | your spend vs the world's 4-week trend, with a cheaper rising alternative | agentburn drift | | Is my setup even normal? | "your overhead is worse than ~75% of N setups" | agentburn rank | | Can I just ask the assistant? | yes — install the skill/MCP and ask "where do you burn my money?" | agentburn mcp |
No accounts. No cloud. Nothing leaves your computer. One command to try: uvx agentburn.
Why this exists
Always-on agents bill you around the clock — and their built-in counters only show totals. Real threads that made this tool:
"73% of every API call is fixed overhead — ~13.9K tokens of tool definitions and system prompt, resent every time." — hermes-agent #4379
"One entrant wrote about waking up to a $47 surprise bill from an overnight run — that's not an exotic failure, it's the default behavior of an unsupervised loop." — dev.to
"I've seen runs where step 3 costs 4× step 1 — no alert, just a bill." — comment, ibid.
agentburn reads the agent's own accounting data (read-only) and answers the question the totals never do: where.
What it answers
- Where it burns — by source:
cron/subagent/gateway:telegram|discord|whatsapp/cli. Always-on ≠ free: scheduled jobs and gateways spend without you. - 🌙 While you slept — the overnight bill, isolated and named (configurable window:
--night 23-7). - Fixed overhead — average input tokens per API call per source. The "73% overhead" pattern is visible in one glance; with request dumps enabled, you get the sampled composition (system prompt vs tool definitions vs history).
- Subagent rollups — delegation cost chained back to the session that spawned it. Recursion compounds; here is the receipt.
- Top tools — which tool results weigh most in your context.
- What to do — up to 4 conservative, named recommendations with monthly estimates.
How it compares
| | agentburn | ccusage | codeburn | built-in /usage | |---|---|---|---|---| | Burn by source (cron · heartbeat · gateways · subagents) | ✅ | — | — | % only, 7 days, this machine | | 🌙 the overnight bill, isolated | ✅ | — | — | — | | Behavioral forensics (why: loops, retry storms, failed-run cost) | ✅ | — | — | — | | Ready config patches (fix, source-verified keys) | ✅ | — | — | — | | Accounting-gap detection (doctor, lower-bound honesty) | ✅ | — | — | — | | MCP server (agent answers for its own bill) | ✅ | — | — | — | | Totals / live blocks / many CLIs | basic | ✅ best-in-class | ✅ TUI, 25 providers | totals |
As of June 2026; ccusage and codeburn are excellent at what they do — agentburn deliberately starts where they stop (ccusage scoped per-tool analysis out).
Why trust these numbers
Most token trackers quietly disagree with each other (2–91× in public issue threads). agentburn takes the opposite stance:
- Numbers come from the agent's own accounting (
~/.hermes/state.db: per-session token counters and cost fields). No scraping, no proxies, no guessing. - Provider-billed costs are shown as-is; Hermes estimates are marked with
~. Mixed data is labeled mixed. - Sessions with messages but zero recorded tokens (known Hermes accounting gaps, e.g. #12023) are detected and reported: totals are then explicitly a lower bound — and fixing the accounting becomes recommendation #1.
- Input composition from request dumps is char-proportional and labeled sampled estimate, not truth.
Privacy
Everything runs locally and reads your database read-only. No network calls. No telemetry. The report is yours.
Usage
agentburn # every agent on this machine, last 30 days
agentburn --agent openclaw # just one
agentburn --days 7
agentburn --agent hermes --db /path/to/state.db
agentburn why # behavioral forensics: loops, retry storms, idle heartbeats
agentburn why --source telegram # decompose ONE source: functions called, errors, loops
agentburn --source cron # cost report for one source only
agentburn explain --model llama3.1 # LLM reads the numbers back to you (local by default)
agentburn --night 23-7 # custom overnight window (local time)
agentburn --budget-month 50 --fail-over # sentinel for cron/CI
agentburn --json # machine-readable, pipe it anywhere
agentburn --no-color
Mechanics
📤 Share your burn (--share). An anonymized card — categories, models and totals only; session titles, paths and content are excluded by construction. Safe to paste into a post; --svg card.svg renders the same card as an image:
🔥 my hermes agent · last 30d
~$45.50 → ~$430/mo pace · 1.75M tokens
where it burns: cron 79% · cli 9% · telegram 7% · subagent 5%
🌙 while I slept (00–08): ~$36.00 — 79% of everything
⚙️ telegram re-sends 20,000 tokens with EVERY call — 2.5× the community norm (≈8k)
— agentburn · local & private
--svg card.svg renders it as an image:
📏 Calibration against public benchmarks. "Is 15k input tokens per call normal?" The report compares your fixed overhead with community-measured references embedded as dated constants (e.g. the Phala always-on-agent benchmark, 2026-03: ≈8k/call baseline). No network — sources are cited inline.
📐 Optimize → prove it (--save-baseline / --compare). Snapshot your pace, change the config (cheaper cron model, trimmed toolsets), then agentburn --compare shows the delta in $/month — pace-normalized, so a 7-day baseline compares honestly with a 30-day window. Every recommendation becomes a testable promise.
🔬 agentburn why — behavioral forensics. report says where it burns; why says why, from the agent's own recorded actions and thoughts:
🔬 agentburn why — openclaw · gateway:telegram
WHAT IT ACTUALLY DID browser 34× ≈210K in results · web_search 18× · shell 7× (2 errors)
RE-READ LOOPS 5× browser(https://news.site/page) — every repeat re-paid in full
RETRY STORMS Bash: 3 errors / 6 calls — paying full price for every error
IDLE HEARTBEATS 4 of 9 heartbeat runs did NOTHING — $2.40 of pure idle burn
BURNED ON FAILURES 2 failed runs → ~$3.90 (timeout, killed)
THINKS MORE THAN IT WORKS 62% thinking · 84K tokens · "rename files task"
💡 WHAT TO CHANGE
1. `/proj/big.md` was fetched 4× in one session ≈32K tokens re-paid — cache it…
Observations with numbers, not verdicts; only tool names, truncated argument keys and counters — message content never leaves the machine (and never enters the report).
🧠 agentburn explain — LLM interpretation, local-first. The numbers, read back to you in plain language with ranked actions:
agentburn explain --model llama3.1 # local ollama — nothing leaves the machine
agentburn explain --llm https://openrouter.ai/api/v1 \
--model deepseek/deepseek-chat --yes-remote --lang ru # remote: explicit opt-in only
Privacy rules are hard-coded: the default endpoint is localhost (ollama / LM Studio); a remote endpoint requires --yes-remote and receives a redacted summary only — session titles become session-N, file paths shrink to basenames, message content is never in the payload to begin with. Works with any OpenAI-compatible API, zero new dependencies. (Yes — a cost profiler spending ~3K tokens to explain costs. The payload is compact and the answer capped; the irony is acknowledged.)
🧭 agentburn drift — your spend × the world's direction. Are you paying for a model the world is leaving?
🧭 agentburn drift
YOUR MODELS vs THE WORLD (4-week world trend)
anthropic/claude-opus-4.6 ~$341/mo world -41% ⬊
deepseek/deepseek-v3.2 ~$12/mo world +12% →
💡 DRIFT ALERTS
1. claude-opus-4.6: you spend ~$341/mo; world usage -41% in 4 weeks — the world
is leaving this model. Rising alternative step-3.5-flash (+180%) is ~98% cheaper.
Your side is computed locally from the agents' own logs; the world side is one read-only GET of token-history's public trend JSON (archived daily from OpenRouter's rankings — deep history unlocks as the archive grows). Nothing about you is sent anywhere; --trends FILE works fully offline. Nobody else joins these two halves.
🩺 agentburn why additions: CRON RUNS — the per-run receipt for every scheduled job (what openclaw #24636 keeps asking for), and CONTEXT THRASH — compactions counted per session, because every compaction silently re-sends a near-full context window.
📊 agentburn rank + --submit — the Burn Index. Anonymous community percentiles of efficiency — the benchmark volume-leaderboards can't be: nothing here rewards burning more.
📊 agentburn rank — you vs the Burn Index
input tokens per call · cli you: 15,000 median: 6,200 worse than ~75% of 41
share of spend at night you: 79.0% median: 12.0% worse than ~90% of 41
cache-read share of input volume you: 61.0% median: 44.0% better than ~75% of 41
Joining is consent-by-click: agentburn --submit prints the exact anonymized payload (ratios and a coarse spend band — never raw volumes, titles or paths), then a prefilled GitHub-issue link that you open and submit. A weekly Action aggregates submissions with plausibility bounds (junk and flexing get dropped, not ranked) into public quantiles.
🔧 agentburn fix — from findings to ready config patches (dry-run by design). Not "consider a cheaper model" but the exact file and the exact lines:
🔧 agentburn fix — hermes · DRY-RUN (nothing was changed)
1. Point Hermes cron jobs at a cheap model
file : ~/.hermes/cron/jobs.json
why : cron is 79% of spend; maintenance rarely needs a frontier model.
effect : bulk of ≈$341/mo moves to cheap-model pricing
proposed:
"nightly digest": "model": "deepseek/deepseek-chat"
ⓘ field verified in hermes-agent cron/jobs.py: per-job `model` override
Patch generators exist only for config keys verified against the agents' source code (Hermes cron/jobs.json, OpenClaw agents.defaults.heartbeat incl. activeHours — the night-burn killer — and lightContext). There is no --apply on purpose: paste it yourself, then prove the saving with --save-baseline → --compare.
🔌 agentburn mcp — your agent answers for its own bill. A zero-dependency MCP stdio server exposing burn_report / burn_why / burn_card. Register it and ask the agent "where do you burn my money?" — it calls the profiler on its own database and explains:
# Claude Code
claude mcp add agentburn -- agentburn mcp
# Hermes / OpenClaw: add an stdio MCP server with command `agentburn mcp`
Prefer skills? There's a ready SKILL.md — drop it into ~/.hermes/skills/agentburn/, ~/.openclaw/skills/agentburn/ or ~/.claude/skills/agentburn/ and just ask the agent "where do you burn my money?".
🩺 agentburn doctor. Trackers disagree because the agent's own accounting has gaps. doctor names the broken combinations (provider × model × source) for zero-usage and unpriced sessions, and generates a ready-to-paste upstream bug report — counters only, no message content.
🚨 Sentinel mode — a budget guard for server agents. Your agent runs 24/7 on a VPS; this watches it:
# alert when overnight burn exceeds $5/month pace (exit code 1 → any alerting hooks in)
agentburn --agent openclaw --budget-night 5 --fail-over --no-color \
|| notify-send "🚨 agent is burning money at night"
Drop it in cron next to the agent itself — the one-off check becomes a standing guard.
Supported agents
One normalized model, one adapter per agent. Run agentburn and every agent found on the machine gets its own report.
| Agent | Status | Data source | Notes | |---|---|---|---| | Hermes Agent | ✅ | ~/.hermes/state.db (+ optional request dumps) | costs from the agent's own accounting | | OpenClaw | ✅ | ~/.openclaw/agents//sessions/sessions.json | heartbeat is its own category — the famous one; cron / gateways / subagents split out | | Claude Code | ✅ | ~/.claude/projects/*.jsonl | tokens only, by design: CC doesn't record costs locally and subscription usage has no honest per-token price — we don't invent one |
Adapters are ~150 lines over a shared model. Codex CLI / opencode are natural next targets — PRs welcome.
<div align="center"><img src="assets/architecture.svg" alt="architecture: agent data → adapters → normalized model → report/why/fix/explain/doctor/mcp" width="780"></div>
Related
token-history — the macro view: daily archive of which agents the world uses (OpenRouter rankings). agentburn is the micro view: where yours burns.
License
MIT
<sub>mcp-name: io.github.Socialpranker/agentburn</sub>
---
<div align="center">
*the token-\ family · token-history — which agents the world runs · agentburn** — where yours burns
if this saved you a dinner's worth of tokens, a ⭐ helps the next person find it
</div>






