cc-efficient
Tiered-model workflow plugin for Claude Code. Routes work to the cheapest model that won't degrade quality, enforces cache + session hygiene, and helps Pro/Max plans last longer without losing output quality.
What it does
When activated, Claude Code:
- Ships four tier-pinned agents the main thread delegates to, instead of asking the user to switch models manually:
recall-haiku— search/lookup, returnspath:line+ ≤50w summarytransform-sonnet— applies a defined change across files, returns a diffreason-opus— design / trade-off / root-cause, returns a numbered plan (no code)triage-haiku— compresses pasted logs / stack traces / JSON dumps to the top 20 error lines + 1-line root-cause- Classifies every prompt via a
UserPromptSubmithook into Recall / Transform / Reason / Triage and suggests the matching agent — fires once per turn, silent on ambiguous prompts and slash commands - Tiered autonomy: auto-delegates Recall and Triage silently, announces Transform and proceeds unless objected, asks before Reason
- Statusline shows current model, approximate context usage (
%K used / 200K, colour-shifted), and the last agent spawned — so you can self-trigger/compactor delegation before the classifier nags - Override flags (
--escalate,--inline,--quiet) typed in the prompt control delegation behaviour for the turn / session - Flags cache and session hygiene issues — long sessions, mid-session
CLAUDE.mdedits, missing.claudeignore - Enforces custom-agent design rules when you build new agents (single responsibility, pinned
model:, restricted tools)
The skill is content-only — no opinions about what you build, just how the model resources are spent building it.
Install (per developer)
Each team member runs these in Claude Code:
/plugin marketplace add Kilowott-HQ/cc-efficient
/plugin install cc-efficient@cc-efficient
To activate during a session: type /efficient.
Activation
The skill activates when:
- The user types
/efficient - The user asks "how should I approach X" or "what's the cheapest way to do Y"
- The user says "save tokens", "be efficient", "optimize my usage"
- A long session shows signs of cache decay
To deactivate: say "stop efficient" or "normal mode".
What's inside
cc-efficient/
├── .claude-plugin/
│ ├── plugin.json # manifest (registers SessionStart + UserPromptSubmit hooks + statusline)
│ └── marketplace.json # marketplace entry for sharing
├── agents/
│ ├── recall-haiku.md # haiku-tier search/lookup agent
│ ├── transform-sonnet.md # sonnet-tier code-transformation agent
│ ├── reason-opus.md # opus-tier design/trade-off agent (no Edit/Write)
│ └── triage-haiku.md # haiku-tier paste/log triage agent
├── hooks/
│ ├── cc-efficient-prime.js # SessionStart — primes Claude (does NOT load full skill)
│ ├── cc-efficient-classify.js # UserPromptSubmit — classifies prompt → suggests agent
│ └── cc-efficient-statusline.js # statusline — shows model, context %, last agent
├── skills/
│ └── efficient/
│ └── SKILL.md # the skill — instructions Claude follows when active
└── README.md
How the pieces fit together:
1. SessionStart hook primes Claude with one line: skill is available, agent zoo exists, tiered-autonomy rules. Negligible context cost. 2. UserPromptSubmit hook classifies each prompt into Recall / Transform / Reason / Triage and suggests the matching agent. Honours --escalate, --inline, --quiet overrides typed in the prompt. 3. Statusline runs every refresh, parses the transcript for the latest usage numbers and most recent agent spawn, and renders efficient | sonnet | 47% (94K/200K) | ⟶ recall-haiku. Colour shifts green → yellow → red at 60% / 80%. 4. Agents live in agents/ with model: pinned in frontmatter, tool grants tier-locked (Haiku and Opus agents cannot Edit), and strict output contracts so the main thread absorbs short structured returns instead of agent prose. 5. Skill loads only when Claude detects the task needs the full ruleset (tierable work, cache hygiene issues, or explicit /efficient invocation).
This is the "best of both" pattern: every session starts aware, every turn gets a cheap classifier nudge, the statusline gives a constant visual cue, and the full ruleset only loads when actually relevant — keeping context lean.
Override flags
Type these anywhere in your prompt to control delegation for the turn or session:
| Flag | Effect | |---|---| | --escalate | Force reason-opus for this turn. Stop suggesting downgrades for the rest of the session. | | --inline | Do not delegate this turn — handle on the main thread. | | --quiet | Suppress Transform/Reason suggestions for the session. Recall and Triage still auto-delegate silently (that's the default). |
Customising for your team
Fork the repo, edit skills/efficient/SKILL.md, push. Areas worth team-specific tuning:
.claudeignoretemplate — add framework-specific ignores (Laravelstorage/, Railstmp/, etc.)- Custom-agent design rules — if your team has its own agent conventions, add them
- Anti-patterns — when you find a new way the team accidentally burns tokens, add it to the list
Companion file: CLAUDE-USAGE.md
The skill is what Claude follows. The repo also includes a human-readable team playbook at CLAUDE-USAGE.md — that's what people read to understand why the skill works the way it does. Keep both in sync when one changes.
Related tools
- caveman — token-compressed output mode. Pairs well:
cavemancuts output tokens,cc-efficientcuts input + cache + model-tier waste.
Author
License
MIT — fork freely.




