cc-efficient

Tiered-model workflow plugin for Claude Code. Routes work to the cheapest model that won't degrade quality, enforces cache + session hygiene, and helps Pro/Max plans last longer without losing output quality.

What it does

When activated, Claude Code:

Ships four tier-pinned agents the main thread delegates to, instead of asking the user to switch models manually:
recall-haiku — search/lookup, returns path:line + ≤50w summary
transform-sonnet — applies a defined change across files, returns a diff
reason-opus — design / trade-off / root-cause, returns a numbered plan (no code)
triage-haiku — compresses pasted logs / stack traces / JSON dumps to the top 20 error lines + 1-line root-cause
Classifies every prompt via a UserPromptSubmit hook into Recall / Transform / Reason / Triage and suggests the matching agent — fires once per turn, silent on ambiguous prompts and slash commands
Applies tiered autonomy: auto-delegates Recall and Triage silently, announces Transform and proceeds unless you object, and asks before Reason
Shows a statusline with the current model, approximate context usage (percentage and thousands of tokens used out of 200K, colour-shifted), and the last agent spawned, so you can trigger /compact or delegation yourself before the classifier nags
Honours override flags (--escalate, --inline, --quiet) typed in the prompt, which control delegation behaviour for the turn or session
Flags cache and session hygiene issues — long sessions, mid-session CLAUDE.md edits, missing .claudeignore
Enforces custom-agent design rules when you build new agents (single responsibility, pinned model:, restricted tools)
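
The classification and tiered-autonomy rules above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual classifier: the keyword patterns, the `classify` function, and the `autonomy` labels are all assumptions made for the example. Only the four categories, the agent names, and the silent / announce / ask behaviour come from this README.

```javascript
// Hypothetical sketch: the patterns and names below are illustrative assumptions,
// not the logic in hooks/cc-efficient-classify.js.
const AGENTS = {
  Recall: { agent: "recall-haiku", autonomy: "silent" },
  Triage: { agent: "triage-haiku", autonomy: "silent" },
  Transform: { agent: "transform-sonnet", autonomy: "announce" },
  Reason: { agent: "reason-opus", autonomy: "ask" },
};

// One illustrative pattern per category.
const RULES = [
  ["Triage", /\b(stack trace|traceback|exception)\b/i],
  ["Reason", /\b(design|trade-?off|root cause|architecture)\b/i],
  ["Transform", /\b(rename|refactor|replace|migrate)\b/i],
  ["Recall", /\b(where is|find|which file|look up)\b/i],
];

function classify(prompt) {
  const text = prompt.trim();
  // Slash commands are never classified.
  if (text.startsWith("/")) return null;
  const hits = RULES.filter(([, re]) => re.test(text)).map(([name]) => name);
  // Zero matches or more than one match counts as ambiguous: stay silent.
  if (hits.length !== 1) return null;
  return { category: hits[0], ...AGENTS[hits[0]] };
}
```

Returning `null` for anything that matches zero or several categories mirrors the "silent on ambiguous prompts" behaviour: a wrong suggestion costs more attention than no suggestion.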

The skill is content-only — no opinions about what you build, just how the model resources are spent building it.

Install (per developer)

Each team member runs these in Claude Code:

/plugin marketplace add Kilowott-HQ/cc-efficient
/plugin install cc-efficient@cc-efficient

To activate during a session: type /efficient.

Activation

The skill activates when:

The user types /efficient
The user asks "how should I approach X" or "what's the cheapest way to do Y"
The user says "save tokens", "be efficient", "optimize my usage"
A long session shows signs of cache decay

To deactivate: say "stop efficient" or "normal mode".

What's inside

cc-efficient/
├── .claude-plugin/
│   ├── plugin.json               # manifest (registers SessionStart + UserPromptSubmit hooks + statusline)
│   └── marketplace.json          # marketplace entry for sharing
├── agents/
│   ├── recall-haiku.md           # haiku-tier search/lookup agent
│   ├── transform-sonnet.md       # sonnet-tier code-transformation agent
│   ├── reason-opus.md            # opus-tier design/trade-off agent (no Edit/Write)
│   └── triage-haiku.md           # haiku-tier paste/log triage agent
├── hooks/
│   ├── cc-efficient-prime.js       # SessionStart — primes Claude (does NOT load full skill)
│   ├── cc-efficient-classify.js    # UserPromptSubmit — classifies prompt → suggests agent
│   └── cc-efficient-statusline.js  # statusline — shows model, context %, last agent
├── skills/
│   └── efficient/
│       └── SKILL.md              # the skill — instructions Claude follows when active
└── README.md

How the pieces fit together:

1. SessionStart hook primes Claude with one line: skill is available, agent zoo exists, tiered-autonomy rules. Negligible context cost.
2. UserPromptSubmit hook classifies each prompt into Recall / Transform / Reason / Triage and suggests the matching agent. Honours --escalate, --inline, --quiet overrides typed in the prompt.
3. Statusline runs every refresh, parses the transcript for the latest usage numbers and most recent agent spawn, and renders efficient | sonnet | 47% (94K/200K) | ⟶ recall-haiku. Colour shifts green → yellow → red at 60% / 80%.
4. Agents live in agents/ with model: pinned in frontmatter, tool grants tier-locked (Haiku and Opus agents cannot Edit), and strict output contracts so the main thread absorbs short structured returns instead of agent prose.
5. Skill loads only when Claude detects the task needs the full ruleset (tierable work, cache hygiene issues, or explicit /efficient invocation).
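
To make the statusline format concrete, here is a rough sketch of how a line like the one in step 3 could be assembled. It is not the code in cc-efficient-statusline.js: `renderStatusline`, `usageColour`, the input object shape, and the choice to treat the 60% and 80% thresholds as inclusive are assumptions for illustration. Transcript parsing is left out entirely.

```javascript
// Hypothetical sketch of the statusline format described above.
const CONTEXT_LIMIT = 200_000; // context window size quoted in this README

const ANSI = {
  green: "\x1b[32m",
  yellow: "\x1b[33m",
  red: "\x1b[31m",
  reset: "\x1b[0m",
};

// Assumption: a threshold value itself (exactly 60 or 80) takes the warmer colour.
function usageColour(percent) {
  if (percent >= 80) return "red";
  if (percent >= 60) return "yellow";
  return "green";
}

function renderStatusline({ model, usedTokens, lastAgent }, { colour = true } = {}) {
  const percent = Math.round((usedTokens / CONTEXT_LIMIT) * 100);
  const usedK = Math.round(usedTokens / 1000);
  let usage = `${percent}% (${usedK}K/${CONTEXT_LIMIT / 1000}K)`;
  if (colour) {
    usage = `${ANSI[usageColour(percent)]}${usage}${ANSI.reset}`;
  }
  const parts = ["efficient", model, usage];
  // Only show the agent segment once an agent has been spawned.
  if (lastAgent) parts.push(`⟶ ${lastAgent}`);
  return parts.join(" | ");
}

console.log(
  renderStatusline(
    { model: "sonnet", usedTokens: 94_000, lastAgent: "recall-haiku" },
    { colour: false }
  )
);
// prints: efficient | sonnet | 47% (94K/200K) | ⟶ recall-haiku
```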

This is the "best of both" pattern: every session starts aware, every turn gets a cheap classifier nudge, the statusline gives a constant visual cue, and the full ruleset only loads when actually relevant — keeping context lean.

Override flags

Type these anywhere in your prompt to control delegation for the turn or session:

| Flag | Effect |
|---|---|
| --escalate | Force reason-opus for this turn. Stop suggesting downgrades for the rest of the session. |
| --inline | Do not delegate this turn — handle on the main thread. |
| --quiet | Suppress Transform/Reason suggestions for the session. Recall and Triage still auto-delegate silently (that's the default). |
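
As an illustration of the turn versus session scoping in the table, the flags could be handled along these lines. This is a sketch under assumptions, not the plugin's implementation: `parseOverrides`, `applyOverrides`, the `session` object shape, and the rule that --inline wins when it is combined with --escalate are all hypothetical.

```javascript
// Hypothetical sketch: names and state shape are assumptions for illustration.
const FLAGS = ["--escalate", "--inline", "--quiet"];

// Finds flags anywhere in the prompt. Note: this simple version collapses
// whitespace in the returned prompt text.
function parseOverrides(prompt) {
  const found = new Set();
  const cleaned = prompt
    .split(/\s+/)
    .filter((token) => {
      if (FLAGS.includes(token)) {
        found.add(token);
        return false;
      }
      return true;
    })
    .join(" ")
    .trim();
  return {
    escalate: found.has("--escalate"),
    inline: found.has("--inline"),
    quiet: found.has("--quiet"),
    prompt: cleaned,
  };
}

// Session-scoped effects persist in `session`; turn-scoped effects go in `turn`.
function applyOverrides(session, flags) {
  const nextSession = {
    noDowngrades: Boolean(session.noDowngrades || flags.escalate),
    quiet: Boolean(session.quiet || flags.quiet),
  };
  let turn = { delegate: true };
  if (flags.inline) {
    // Assumption: --inline takes precedence if both flags appear.
    turn = { delegate: false };
  } else if (flags.escalate) {
    turn = { delegate: true, agent: "reason-opus" };
  }
  return { session: nextSession, turn };
}
```

For example, `applyOverrides({}, parseOverrides("--escalate why is this query slow"))` yields a turn routed to reason-opus and a session with `noDowngrades` set, while a later prompt without flags keeps `noDowngrades` but delegates normally.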

Customising for your team

Fork the repo, edit skills/efficient/SKILL.md, push. Areas worth team-specific tuning:

.claudeignore template — add framework-specific ignores (Laravel storage/, Rails tmp/, etc.)
Custom-agent design rules — if your team has its own agent conventions, add them
Anti-patterns — when you find a new way the team accidentally burns tokens, add it to the list

Companion file: CLAUDE-USAGE.md

The skill is what Claude follows. The repo also includes a human-readable team playbook at CLAUDE-USAGE.md — that's what people read to understand why the skill works the way it does. Keep both in sync when one changes.

Related tools

caveman — token-compressed output mode. Pairs well: caveman cuts output tokens, cc-efficient cuts input + cache + model-tier waste.

Author

AJ - Kilowott

License

MIT — fork freely.
