relentless-data-skills
Agent skills for data and analytics engineers, organised around the end-to-end data lifecycle. v1.5 completes the spine: Stage 0 (Requirements & Discovery) is covered in depth, and Stage 1 (Generation), Stage 2 (Ingestion), Stage 3 (Storage), Stage 4 (Transformation), and Stage 5 (Serving) are built around the same discipline: interrogate every artefact before a pipeline is written, and refuse to produce a downstream artefact that does not chain back to an upstream one. Stage 5 closes the loop: a Serving Surface names both the model it reads and the decision it informs, so the lineage graph is now a closed cycle that starts at the founding question and returns to it. Built for Claude Code, and structured so that each skill produces a durable, version-controlled artefact that downstream stages can consume.
Why this exists
Generic "data requirements templates" produce filled-in forms that look thorough but quietly accept asks that should have been killed — nothing in them forces the question "if the answer flipped, would anything actually change?" This repo encodes that question (and a few others) as agent skills, so pipelines and dashboards stop getting built for decisions nobody can later name.
Install
Skills are self-contained — install one, or install the set (ADR-0004). Each skill folder carries its own contract (references/<artefact>.md) and linter (scripts/lint-<artefact>.sh); the shared worldview (CONTEXT.md) and lifecycle reference are linked by absolute URL as optional deeper reading, so a lone skill folder works on its own.
skills.sh (third-party installer):
npx skills add sleeplessv/relentless-data-skills --skill <name> # one skill
npx skills add sleeplessv/relentless-data-skills # the whole set
Claude Code plugin (the whole set as one selectable plugin):
/plugin marketplace add sleeplessv/relentless-data-skills
/plugin install relentless-data-skills@relentless-data-skills
Pick the single relentless-data-skills plugin and all skills install together; reload with /reload-plugins.
Local development. The plugin path works from a local clone too: /plugin marketplace add <path-to-clone> then /plugin install relentless-data-skills@relentless-data-skills, reloaded with /reload-plugins. The repo ships both .claude-plugin/plugin.json (the plugin) and .claude-plugin/marketplace.json (the single-plugin marketplace pointing at it).
v1.5 skills
Stage 0 — Requirements & Discovery:
/gather-requirements: interrogates a stakeholder ask into one or more Decision-Grade Question (DGQ) documents under docs/requirements/; enforces the action-change kill-switch (if nothing would change when the answer flips, the ask is rejected, not built) and bundles its own contract (references/dgq.md) and linter (scripts/lint-dgq.sh) inside the skill folder for schema enforcement (ADR-0004 pilot).
/write-adr: scaffolds a MADR-lite Architecture Decision Record under docs/adr/ for cross-cutting platform commitments; refuses single-option decisions and bundles its own contract (references/adr.md) and linter (scripts/lint-adr.sh) inside the skill folder for structural enforcement (ADR-0004).
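The action-change kill-switch can be sketched as a comparison of the actions a stakeholder would take under each possible answer. This is an illustration only: the `Ask` type and its field names are hypothetical, and the real skill enforces the rule through an interview and the bundled DGQ contract, not through this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ask:
    """Hypothetical shape for a stakeholder ask; this is not the DGQ schema."""
    question: str
    action_if_yes: str
    action_if_no: str


def triage_ask(ask: Ask) -> str:
    """Return 'rejected' when flipping the answer would change nothing."""
    yes = ask.action_if_yes.strip().lower()
    no = ask.action_if_no.strip().lower()
    return "rejected" if yes == no else "accepted"
```

An ask whose two branches name the same action is rejected before any pipeline work starts. Comparing normalised strings is a deliberately crude stand-in for the judgement an interview would apply.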
Stage 1 — Generation:
/profile-source: interrogates an upstream source (database table, SaaS endpoint, streaming topic, partner feed) into a Source Profile document under docs/sources-of-record/; enforces the Generation kill-switch (an accepted profile must name at least one consumer DGQ in consumed_by: no consumer, no profile) and bundles its own contract (references/source-profile.md) and linter (scripts/lint-source-profile.sh) inside the skill folder for schema enforcement (ADR-0004).
Stage 2 — Ingestion:
/plan-ingestion: interrogates a planned (or undocumented) ingestion path into an Ingestion Plan document under docs/ingestion-plans/; enforces the Ingestion kill-switch (an accepted plan must name at least one upstream Source Profile in related_sources_of_record: no profiled upstream, no plan) and bundles its own contract (references/ingestion-plan.md) and linter (scripts/lint-ingestion-plan.sh) inside the skill folder for schema enforcement (ADR-0004).
Stage 3 — Storage:
/plan-storage: interrogates a planned (or undocumented) storage destination into a Storage Plan document under docs/storage-plans/; enforces the Storage kill-switch (an accepted plan must name at least one upstream Ingestion Plan in related_ingestion_plans: no planned upstream, no plan) and bundles its own contract (references/storage-plan.md) and linter (scripts/lint-storage-plan.sh) inside the skill folder for schema enforcement (ADR-0004). Warehouse, lakehouse, object-store, operational-store, search-index, and cache are equally first-class; the interview branches on the user's answer rather than defaulting to any.
Stage 4 — Transformation:
/model-data: interrogates a planned (or undocumented) transformation into a Model Plan document under docs/model-plans/; enforces the Transformation kill-switch (an accepted plan must name at least one upstream Storage Plan in related_storage_plans; a model that does not chain back to governed storage is ungoverned business logic, not a real table) and bundles its own contract (references/model-plan.md) and linter (scripts/lint-model-plan.sh) inside the skill folder for schema enforcement (ADR-0004). Grain is asked first and must resolve to a concrete cardinality; assertions forces an honest business-invariant/shape-only/none posture; staging/intermediate/mart/metric/feature layers and streaming/batch materialisation are equally first-class.
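The kill-switches for Stages 1 to 4 share one shape: an accepted artefact must carry at least one entry in a named link field. A minimal sketch follows, using the field names given above. It assumes artefacts are plain dicts and that acceptance is recorded as `status == "accepted"`; neither is confirmed here, and the bundled shell linters remain the real enforcement.

```python
from typing import Optional

# Required link field per artefact kind, taken from the skill descriptions above.
REQUIRED_LINK_FIELD = {
    "source-profile": "consumed_by",
    "ingestion-plan": "related_sources_of_record",
    "storage-plan": "related_ingestion_plans",
    "model-plan": "related_storage_plans",
}


def chain_violation(kind: str, artefact: dict) -> Optional[str]:
    """Return an error message if an accepted artefact names no link, else None.

    Assumption: the dict layout and the "status" key are illustrative only.
    """
    if artefact.get("status") != "accepted":
        return None  # the kill-switch only constrains accepted artefacts
    field = REQUIRED_LINK_FIELD[kind]
    if not artefact.get(field):
        return f"{kind}: accepted but '{field}' is empty"
    return None
```

A draft artefact passes regardless of its links, which mirrors the wording "an accepted plan must name": the constraint bites at acceptance, not at authoring time.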
Stage 5 — Serving:
/serve-data: interrogates a planned (or undocumented) serving endpoint into a Serving Surface document under docs/serving-surfaces/; enforces the dual Serving kill-switch, the differentiator, where an accepted surface must name both an upstream Model Plan in related_model_plans (the chain half: no governed model means the surface recomputes business logic from raw/staging) and a DGQ it informs in serves_dgqs (the loop-closure half: a surface that informs no decision is the dashboard-nobody-uses, retired exactly as the ask-nobody-acts-on is killed at Stage 0), and bundles its own contract (references/serving-surface.md) and linter (scripts/lint-serving-surface.sh) inside the skill folder for schema enforcement (ADR-0004). access_model forces an honest convention-vs-role-based posture; dashboard/api/reverse-etl/feature-store/export/embedded surfaces and streaming/batch delivery are equally first-class.
This is the stage that closes the lineage loop: DGQ → Source Profile → Ingestion Plan → Storage Plan → Model Plan → Serving Surface → DGQ. With it, the Reis-5-plus-Stage-0 spine ships end to end.
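The dual Serving kill-switch differs from the earlier stages in that two link fields must both be populated. The sketch below reports which half fails. As before, the dict layout and the `status` key are assumptions made for illustration; only the field names `related_model_plans` and `serves_dgqs` come from the description above.

```python
def serving_violations(surface: dict) -> list[str]:
    """List the halves of the dual Serving kill-switch that an accepted surface fails."""
    if surface.get("status") != "accepted":
        return []  # assumption: only accepted surfaces are constrained
    problems = []
    if not surface.get("related_model_plans"):
        problems.append("chain half: no upstream Model Plan in related_model_plans")
    if not surface.get("serves_dgqs"):
        problems.append("loop-closure half: no DGQ in serves_dgqs")
    return problems
```

Returning both failures, instead of stopping at the first, keeps the two halves visible as separate obligations: one points upstream to the model, the other points back to the decision.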
Cross-cutting skills (meta) — outside the numbered spine, per ADR-0002:
/triage-lineage: reads the consumer's repo-internal artefact graph (DGQs, Source Profiles, Ingestion Plans, Storage Plans, Model Plans, Serving Surfaces, ADRs under docs/) and prints a prioritised, two-tier report: Tier 0 dangling references first (these pass every linter today), then real gaps ranked by distance to a live decision, routing each finding into the stage skill that closes it. It is an advisor (meta) skill: it operates over the artefact graph, authors no durable artefact of its own, and has no linter. It propagates the action-change kill-switch across the graph (a gap that no live decision pulls on is routed to retire / record rejected, never to build), so it cannot regress into a flat "fix everything" coverage list. Repo-internal self-audit only (no dbt-manifest, live-warehouse, or orchestrator scans).
/setup-relentless-data-skills: a one-time setup ritual (not an advisor) that scaffolds a consumer project to use the lifecycle skills. It writes a thin ## Data skills index block into the project's CLAUDE.md/AGENTS.md and creates only the docs/<stage-area>/ directories for the stages the project will actually populate: scaffold only what will hold content. It teaches each stage's kill-switch in the block it writes, is idempotent, never clobbers, leaves any ## Agent skills block untouched, and is disable-model-invocation (run it explicitly). Modelled on the setup-matt-pocock-skills scaffold pattern. It is a second kind of non-spine meta/ skill (a setup / scaffold skill) alongside the advisor skills; see skills/meta/README.md for why that broadening is recorded in the living docs rather than in its own ADR.
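The two-tier report that /triage-lineage produces can be approximated with a small graph walk. This is a sketch under stated assumptions, not the skill's actual algorithm: the graph is a dict from artefact id to the ids it references, "distance" is taken to be hop count over undirected links, and all function names and ids are hypothetical.

```python
from collections import deque


def dangling_references(refs: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Tier 0: (artefact, target) pairs whose target does not exist in the graph."""
    return sorted(
        (src, dst) for src, targets in refs.items() for dst in targets if dst not in refs
    )


def distance_to_live_decision(refs: dict[str, list[str]], live_dgqs: list[str]) -> dict[str, int]:
    """Hop count from each artefact to the nearest live DGQ; unreachable ids are omitted."""
    neighbours: dict[str, set[str]] = {node: set() for node in refs}
    for src, targets in refs.items():
        for dst in targets:
            if dst in refs:  # dangling targets are reported in Tier 0, not walked
                neighbours[src].add(dst)
                neighbours[dst].add(src)
    dist = {dgq: 0 for dgq in live_dgqs if dgq in refs}
    queue = deque(dist)
    while queue:  # breadth-first search outward from every live decision
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def route_gaps(gaps: list[str], refs: dict[str, list[str]], live_dgqs: list[str]):
    """Split gaps into (build, retire): build is ranked nearest-first, retire is unreachable."""
    dist = distance_to_live_decision(refs, live_dgqs)
    build = sorted((g for g in gaps if g in dist), key=lambda g: (dist[g], g))
    retire = sorted(g for g in gaps if g not in dist)
    return build, retire
```

Links are treated as undirected because the link fields point in mixed directions (consumed_by and serves_dgqs point at DGQs, while the related_* fields point upstream). A gap with no path to any live decision lands in the retire list and is never ranked for building, which is the propagation of the action-change kill-switch described above.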
Stage status
| Stage | Name | v1.5 status |
| --- | --- | --- |
| 0 | Requirements & Discovery | ✅ shipped |
| 1 | Generation | ✅ shipped |
| 2 | Ingestion | ✅ shipped |
| 3 | Storage | ✅ shipped |
| 4 | Transformation | ✅ shipped |
| 5 | Serving | ✅ shipped |
The spine is complete — every stage from the founding question to the consumer-facing surface now ships a skill, an artefact, and a linter. The lifecycle spine follows the Reis 5-stage taxonomy (Generation → Ingestion → Storage → Transformation → Serving), with Stage 0 prepended for upstream-of-code clarity. Streaming and batch are equally first-class; the repo refuses to bias toward either.
Repo layout
relentless-data-skills/
├─ README.md
├─ CONTEXT.md                    repo vocabulary, principles, conventions
├─ CLAUDE.md                     Claude-Code-specific pointer + guardrails
├─ LICENSE                       MIT
├─ .claude-plugin/plugin.json    plugin manifest (skills declared here)
├─ docs/
│  ├─ data-engineering-101.md    distilled lifecycle reference
│  ├─ adr/                       Architecture Decision Records (written by /write-adr)
│  ├─ agents/                    agent-skill docs (issue tracker, triage labels, domain)
│  ├─ evals.md                   skill eval-suite spec + run instructions
│  └─ sources/                   source provenance (Redpanda capture + URL index)
└─ skills/
   ├─ 00-requirements/           /gather-requirements, /write-adr
   ├─ 01-generation/             /profile-source
   ├─ 02-ingestion/              /plan-ingestion
   ├─ 03-storage/                /plan-storage
   ├─ 04-transformation/         /model-data
   ├─ 05-serving/                /serve-data
   └─ meta/                      /triage-lineage, /setup-relentless-data-skills (cross-cutting; no lifecycle position; ADR-0002)
Stage folders use zero-padded two-digit numeric prefixes so order is encoded in the listing and new stages can be inserted without renumbering. Skill folder names are verb-led and kebab-case, matching their command form (/gather-requirements, not /requirements-gathering). Cross-cutting skills with no lifecycle position — advisor skills (ADR-0002) and setup / scaffold skills — live in the non-numbered skills/meta/ folder. Each skill folder additionally carries its own references/<artefact>.md (the contract) and scripts/lint-<artefact>.sh (the linter), so it installs self-contained (ADR-0004).
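What "self-contained" means on disk can be sketched as a check for the two bundled files. The function name `missing_bundle_files` is hypothetical, and the sketch assumes those two files are the whole requirement; the repo's own tooling may check more.

```python
from pathlib import Path


def missing_bundle_files(skill_dir: Path, artefact: str) -> list[str]:
    """Return the bundled files a skill folder lacks (empty list means self-contained).

    Assumption: only references/<artefact>.md and scripts/lint-<artefact>.sh are checked.
    """
    expected = [
        Path("references") / f"{artefact}.md",
        Path("scripts") / f"lint-{artefact}.sh",
    ]
    return [p.as_posix() for p in expected if not (skill_dir / p).is_file()]
```

For example, calling it on a /gather-requirements skill folder with the artefact slug `dgq` would return an empty list when both references/dgq.md and scripts/lint-dgq.sh are present.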
Read next
CONTEXT.md: repo vocabulary, principles, conventions. The substance lives here; the README is the front door.
docs/data-engineering-101.md: distilled, opinionated lifecycle reference. Stage definitions, failure modes, undercurrents, glossary.
docs/sources/: source provenance for the lifecycle reference (Redpanda capture, annotated URL index, refresh procedure).
Acknowledgements
Structural inspiration: mattpocock/skills. This is a clean-room implementation with an independent lifecycle taxonomy and domain-specific content, but the folder shape and the "skills as a Claude Code plugin" mechanic come from there. Thanks Matt.
Licence
MIT — see LICENSE.





