token-efficient
> Cut your AI token consumption 2-3x without losing output quality.
A universal agent skill that eliminates filler, calibrates depth to your expertise, and compresses responses across all task types — coding, architecture, writing, and research.
Install
Claude Code (via skills.sh)
npx skills add YOUR_USERNAME/token-efficient-skill
Claude Code (via plugin marketplace)
/plugin marketplace add YOUR_USERNAME/token-efficient-skill
Claude.ai Download token-efficient.skill from Releases and drag it into your Claude.ai project settings.
Manual (any SKILL.md-compatible agent)
cp -r token-efficient/ ~/.claude/skills/token-efficient/
Also works with: Cursor, Cline, Windsurf, Codex, Copilot, Roo, AMP, and any agent supporting the SKILL.md standard.
What It Does
Most AI responses waste 40-60% of tokens on:
- Preambles ("Great question! I'd be happy to help...")
- Echoing your question back
- Over-explaining concepts you already know
- Excessive formatting for simple answers
- Sign-offs ("Let me know if you need anything!")
- Redundant summaries
This skill eliminates all of that systematically.
Three Modes
| Mode | Savings | Best For | |------|---------|----------| | Balanced | 40-50% | Daily coding, docs, emails | | Aggressive | 60-75% | Quick fixes, known tech, yes/no answers | | Context-Aware (default) | 45-65% | Mixed conversations — auto-adjusts per message |
Switch modes mid-conversation:
"go aggressive"/"compress more"/"shorter""balanced mode"/"more detail""auto"/"context-aware"/"default"
Examples
Code Fix — Before (150 tokens) vs After (40 tokens)
Before: > This error typically occurs when you're trying to call the .map() method on a variable that is undefined. This usually happens when the data hasn't loaded yet... > Here are the common causes: 1. The data hasn't loaded yet... 2. The initial state... 3. The property path... > I hope this helps!
After: > items?.map(...) or initialize state: useState([]) > If from API: guard with if (!data) return <Loading />
Architecture Decision — Before (280 tokens) vs After (120 tokens)
Before: Three paragraphs about each option, a recommendation section, and "Let me know if you'd like me to elaborate!"
After: A comparison table with the key tradeoffs, one-line recommendation with reasoning.
How It Works
The skill defines: 1. 8 anti-patterns to eliminate (the biggest token wasters) 2. Domain-specific compression rules for code, architecture, writing, and research 3. Conversation-level efficiency — no re-introducing context, incremental updates, batch responses 4. Dynamic calibration — matches formatting complexity to content complexity
License
MIT





