feature-torture
A decision-discipline skill for Claude Code (and compatible AI coding assistants). Pressure-tests one roadmap feature per session and emits a 500–900-word report a human can act on — verdict + ADR + adjacency map + spec stub when shipping.
What this is not
feature-torture is not a stress-test, load-test, fuzzer, or chaos-engineering tool. It does not run code, exercise APIs under load, or break a deployed system to see what breaks. It runs before implementation, on a roadmap entry. The "torture" in the name is the interrogation of an idea — does the slice make sense, what's the better cut, when should it ship — not the abuse of a built artefact.
If you landed here looking for stress / load / chaos testing, you want one of: Gatling, k6, Locust, Chaos Mesh, or Toxiproxy.
Why
A roadmap entry is a headline. A sprint plan is a commitment. The gap between them is where unverified assumptions hide. feature-torture forces the assumptions out before the sprint inherits them.
Most AI feature-review sessions drift: you ask "what should we do with X?", the model lists pros and cons, and you leave with a longer paragraph but no decision. feature-torture treats the AI as a decision facilitator, not a participant — it runs a structured diamond-loop process, picks 2–4 techniques from a closed list, and forces the verdict into one of six labels:
- 👍 ship — proceed as scoped, or with named amendments
- ✂️ reshape — keep the goal, change the slice
- ⏸ park — right idea, wrong cycle
- 🧬 split — entry hides multiple features; spawn children
- 👎 kill — drop the entry; reasons are load-bearing
- 🤷 defer-decision — only when a single probe would flip everything
Each non-default verdict triggers a post-converge cross-label challenge: the report has to argue why the two nearest-neighbour verdicts don't fit. The refutations live in the Choice part of the ADR core, which makes the verdict defensible at a glance.
When the input isn't a roadmap feature
When you point the skill at something that isn't a feature row — a release-cut block, a policy / semver question, a coupled bundle of decisions — it doesn't improvise a refusal. It detects the shape and emits a structured pick:
- (a) Reframe as feature: name an F-ID this stands in for, and the skill runs as normal against that row.
- (b) Torture as a single decision: keeps the diamond loop and the 6-label verdict, drops the F-ID / spawned-children / "make me dream" artefacts, and writes policy-<slug>.md. Hard cap: one decision per session; bundles are refused.
- (c) Bail to /brainstorm: when the input is divergent ideation rather than pressure-test-to-verdict shape.
Detection is automatic; there is no flag. Standard feature-row runs are unchanged.
Install
Claude Code (plugin channel)
/plugin install feature-torture@bastien-gallay/feature-torture
Claude Desktop or claude.ai web (.skill archive)
Download the latest feature-torture-vX.Y.Z.skill from the Releases page and:
- Desktop: double-click the file; .skill is OS-registered and opens an Add to library dialog.
- Web: Settings → Capabilities → Skills → upload.
vercel-labs/skills
npx skills install bastien-gallay/feature-torture
Quickstart
1. Bootstrap a config for your project. Copy the closest example from examples/ to .feature-torture.md at your repo root, then edit the placeholders.
cp ~/.claude/skills/feature-torture/examples/lucid-lint.md .feature-torture.md
$EDITOR .feature-torture.md
2. Invoke the skill in Claude Code:
/feature-torture
The agent reads the config, picks one unstarted feature at random, runs the start-state check, and writes a report to {{REPORTS_DIR}}/<F-id>.md.
3. Read the report. If the verdict is 👍 or ✂️ with a spec stub, hand the spec to the next pair-programming session. If it's ⏸ / 🧬 / 👎 / 🤷, the report is the deliverable.
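The hand-off rule in step 3 can be sketched as a small lookup. This is an illustration only, not part of the skill: `Verdict` and `next_step` are hypothetical names, and treating a 👍 or ✂️ report without a spec stub as "report is the deliverable" is an assumption the text does not spell out.

```python
from enum import Enum


class Verdict(Enum):
    """The six closed verdict labels (member names are illustrative)."""
    SHIP = "👍"
    RESHAPE = "✂️"
    PARK = "⏸"
    SPLIT = "🧬"
    KILL = "👎"
    DEFER_DECISION = "🤷"


def next_step(verdict: Verdict, has_spec_stub: bool) -> str:
    """Say what to do with a finished report (hypothetical helper)."""
    # Only ship / reshape reports that carry a spec stub feed an implementation session.
    if verdict in (Verdict.SHIP, Verdict.RESHAPE) and has_spec_stub:
        return "hand the spec stub to the next pair-programming session"
    # Assumption: every other case ends with the report itself.
    return "the report is the deliverable"


print(next_step(Verdict.RESHAPE, has_spec_stub=True))
print(next_step(Verdict.PARK, has_spec_stub=False))
```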
Configuration
feature-torture reads project config from, in order:
1. .feature-torture.md at project root.
2. .personal/feature-torture/config.md (gitignored variant).
3. Inferred values from AGENTS.md / CLAUDE.md / README.md.
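The lookup order can be pictured with a short sketch. It assumes that the first config file found wins and that inference only happens when neither file exists; the skill itself is a prompt, so `resolve_config` and the constants below are hypothetical names used for illustration.

```python
from __future__ import annotations

from pathlib import Path

# Explicit config files, in the lookup order listed above.
CONFIG_CANDIDATES = (
    Path(".feature-torture.md"),
    Path(".personal") / "feature-torture" / "config.md",
)

# Files that values may be inferred from when no config file exists.
INFERENCE_SOURCES = ("AGENTS.md", "CLAUDE.md", "README.md")


def resolve_config(project_root: Path) -> tuple[Path | None, list[Path]]:
    """Return (explicit config path or None, inference sources that exist)."""
    for candidate in CONFIG_CANDIDATES:
        path = project_root / candidate
        if path.is_file():
            # Assumption: the first match wins and later sources are ignored.
            return path, []
    present = [
        project_root / name
        for name in INFERENCE_SOURCES
        if (project_root / name).is_file()
    ]
    return None, present
```

Calling `resolve_config(Path.cwd())` from a repo root returns the config file that would be used, or the list of files available for inference.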
The config supplies six placeholders: PROJECT_NAME, ROADMAP_PATH, PICK_SCRIPT (optional), ARCH_DOCS_PATH, REPORTS_DIR, and UNSTARTED_RULE. The examples/ directory ships five real configs covering five different roadmap shapes:
- lucid-lint.md: Rust + bilingual docs project; markdown table with `☐` / `🚧` / `✅` glyphs.
- secondary-corpus.md: Python audit toolchain; `- [ ]` task list.
- daily-ops.md: Python CLI + Claude adapters; `- [ ]` task list with `- [/]` in-progress glyph.
- jira_analytics-FR-non-checkbox.md: French-language internal tool; H3 headings with `✅` on the title for done, and an inline `Statut : Corrigé` for done-but-not-glyphed (multi-axis).
- rhetorix-suggestion-list.md: Suggestion-list document with no status concept; every numbered H3 is a candidate, status comes from grep.
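To make the idea of an unstarted rule concrete, here is a minimal sketch for the `- [ ]` task-list shape with a `- [/]` in-progress glyph. It is not how the skill stores or evaluates UNSTARTED_RULE (that format is not shown here); the regex, the function names, and the sample roadmap rows are all made up for illustration.

```python
from __future__ import annotations

import random
import re

# Assumed rule: only an empty checkbox counts as unstarted.
UNSTARTED = re.compile(r"^\s*- \[ \] (?P<title>.+)$")


def unstarted_features(roadmap_text: str) -> list[str]:
    """Collect titles of unchecked items; "- [/]" and "- [x]" rows are skipped."""
    titles = []
    for line in roadmap_text.splitlines():
        match = UNSTARTED.match(line)
        if match:
            titles.append(match.group("title").strip())
    return titles


def pick_one(roadmap_text: str, rng: random.Random) -> str:
    """Pick one unstarted feature at random."""
    candidates = unstarted_features(roadmap_text)
    if not candidates:
        raise ValueError("no unstarted features found")
    return rng.choice(candidates)


# Invented sample roadmap, not taken from any shipped example config.
roadmap = """\
- [x] F-01 Export to CSV
- [/] F-02 Dark mode
- [ ] F-03 Offline sync
- [ ] F-04 Audit log
"""

print(unstarted_features(roadmap))  # ['F-03 Offline sync', 'F-04 Audit log']
```

A table-with-glyphs or heading-based roadmap would need a different pattern, which is presumably why the rule is a per-project placeholder rather than a fixed convention.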
Session output
A typical report has 11 required sections, ~500–900 words, ≥ 1 finding per 80 words. The shape is:
1. Roman thumb verdict (one of the 6 labels)
2. TL;DR (≤ 3 sentences)
3. Make me dream (≤ 80 words; skip if 👎)
4. Job to be done
5. Adjacency map (3 named neighbours)
6. Roadmap-placement challenge (with quantitative comparison)
7. ADR core (Problem / Current state / Options / Choice / Consequences)
8. Output type (decision-only or decision+spec)
9. Spawned children (0 to 5)
10. Open questions (2 to 5, domain-tagged)
11. Techniques used (2–4 used + the rest rejected)
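The numeric limits above lend themselves to a quick sanity check. The sketch below is not shipped with the skill; `check_report_budget` is a hypothetical helper. It reads "≥ 1 finding per 80 words" as requiring at least ceil(words / 80) findings and treats the approximate 500–900 word range as a hard bound, both of which are interpretations.

```python
from __future__ import annotations

import math


def check_report_budget(
    word_count: int,
    findings: int,
    techniques_used: int,
    spawned_children: int,
    open_questions: int,
) -> list[str]:
    """Return the violated limits; an empty list means the report fits."""
    problems = []
    if not 500 <= word_count <= 900:
        problems.append("word count outside 500-900")
    # Density rule: one finding per 80 words, rounded up (assumed reading).
    required = math.ceil(word_count / 80)
    if findings < required:
        problems.append(f"needs at least {required} findings")
    if not 2 <= techniques_used <= 4:
        problems.append("techniques used outside 2-4")
    if not 0 <= spawned_children <= 5:
        problems.append("spawned children outside 0-5")
    if not 2 <= open_questions <= 5:
        problems.append("open questions outside 2-5")
    return problems


# A 720-word report needs ceil(720 / 80) = 9 findings.
print(check_report_budget(720, 9, 3, 1, 3))  # []
print(check_report_budget(720, 8, 3, 1, 3))  # ['needs at least 9 findings']
```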
Repo layout
feature-torture/
├── README.md                     # this file
├── CHANGELOG.md
├── LICENSE
├── CLAUDE.md                     # repo conventions
├── justfile                      # release automation
├── .claude-plugin/
│   ├── plugin.json
│   └── marketplace.json
├── skills/
│   └── feature-torture/
│       ├── SKILL.md              # the canonical prompt with frontmatter
│       ├── tests.md              # corpus criteria
│       └── improvements-queue.md # backlog of prompt-edit candidates
├── examples/                     # 5 real config fixtures
├── docs/
│   └── RELEASING.md
└── dist/                         # populated by `just release` / `just package`
Acknowledgements
Process design pulled from:
- ADR for the report spine.
- Kniberg Kata (Current → Target → Next Concrete Step) for the next-step closing move.
- SCAMPER and Devil's Advocate as stress techniques.
- Kano model, RICE / ICE, T-shirt sizing, Crazy-8s as the PM-flavor branch.
Distribution scaffolding mirrors bfw — same justfile, same release flow, same .skill package shape.
License
MIT — see LICENSE.




