relentless-data-skills

Other · Claude Code · by sleeplessv

Summary

Agent skills for data and analytics engineers, organised around the end-to-end data lifecycle — Stage 0 (Requirements & Discovery) through Stage 5 (Serving), plus meta skills (setup, triage-lineage).

Install to Claude Code

/plugin install relentless-data-skills@relentless-data-skills

Run in Claude Code. Add the marketplace first with /plugin marketplace add sleeplessv/relentless-data-skills-de if you haven't already.

README.md

relentless-data-skills

Agent skills for data and analytics engineers, organised around the end-to-end data lifecycle. v1.5 completes the spine: Stage 0 (Requirements & Discovery) in depth, plus Stage 1 (Generation), Stage 2 (Ingestion), Stage 3 (Storage), Stage 4 (Transformation), and Stage 5 (Serving). Every stage is built around the same discipline: interrogate every artefact before a pipeline is written, and refuse to produce a downstream artefact that does not chain back to the upstream one. Stage 5 closes the loop: a Serving Surface names both the model it reads and the decision it informs, so the lineage graph is now a closed cycle from the founding question back to that same question. Built for Claude Code, and structured so each skill produces a durable, version-controlled artefact that downstream stages can consume.

Why this exists

Generic "data requirements templates" produce filled-in forms that look thorough but quietly accept asks that should have been killed — nothing in them forces the question "if the answer flipped, would anything actually change?" This repo encodes that question (and a few others) as agent skills, so pipelines and dashboards stop getting built for decisions nobody can later name.

Install

Skills are self-contained — install one, or install the set (ADR-0004). Each skill folder carries its own contract (references/<artefact>.md) and linter (scripts/lint-<artefact>.sh); the shared worldview (CONTEXT.md) and lifecycle reference are linked by absolute URL as optional deeper reading, so a lone skill folder works on its own.

skills.sh (third-party installer):

npx skills add sleeplessv/relentless-data-skills --skill <name>   # one skill
npx skills add sleeplessv/relentless-data-skills                  # the whole set

Claude Code plugin (the whole set as one selectable plugin):

/plugin marketplace add sleeplessv/relentless-data-skills
/plugin install relentless-data-skills@relentless-data-skills

Pick the single relentless-data-skills plugin and all skills install together; reload with /reload-plugins.

Local development. The plugin path works from a local clone too: /plugin marketplace add <path-to-clone> then /plugin install relentless-data-skills@relentless-data-skills, reloaded with /reload-plugins. The repo ships both .claude-plugin/plugin.json (the plugin) and .claude-plugin/marketplace.json (the single-plugin marketplace pointing at it).

v1.5 skills

Stage 0 — Requirements & Discovery:

  • /gather-requirements — interrogates a stakeholder ask into one or more Decision-Grade Question (DGQ) documents under docs/requirements/; enforces the action-change kill-switch (if nothing would change when the answer flips, the ask is rejected, not built) and bundles its own contract (references/dgq.md) and linter (scripts/lint-dgq.sh) inside the skill folder for schema enforcement (ADR-0004 pilot).
  • /write-adr — scaffolds a MADR-lite Architecture Decision Record under docs/adr/ for cross-cutting platform commitments; refuses single-option decisions and bundles its own contract (references/adr.md) and linter (scripts/lint-adr.sh) inside the skill folder for structural enforcement (ADR-0004).
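The action-change kill-switch can be pictured as a small predicate. The sketch below is illustrative only: the `Ask` class and its `action_if_yes` / `action_if_no` fields are hypothetical names chosen for this example, not the DGQ schema, which is defined in references/dgq.md. It assumes an ask is worth building only when each possible answer leads to a nameable, different action.

```python
from dataclasses import dataclass


@dataclass
class Ask:
    """A stakeholder ask reduced to what the kill-switch needs.

    Field names are hypothetical; the real DGQ schema lives in references/dgq.md.
    """
    question: str
    action_if_yes: str
    action_if_no: str


def action_change_verdict(ask: Ask) -> str:
    """Return 'accepted' only if flipping the answer would change the action."""
    yes = ask.action_if_yes.strip().lower()
    no = ask.action_if_no.strip().lower()
    if not yes or not no:
        # One side has no nameable action, so nothing can be said to change.
        return "rejected"
    if yes == no:
        # Same action either way: the answer does not inform a decision.
        return "rejected"
    return "accepted"
```

A plain string comparison is a deliberately crude stand-in for the interview the skill conducts; it only shows where the rejection happens, before any pipeline work starts.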

Stage 1 — Generation:

  • /profile-source — interrogates an upstream source (database table, SaaS endpoint, streaming topic, partner feed) into a Source Profile document under docs/sources-of-record/; enforces the Generation kill-switch (an accepted profile must name at least one consumer DGQ in consumed_by — no consumer, no profile) and bundles its own contract (references/source-profile.md) and linter (scripts/lint-source-profile.sh) inside the skill folder for schema enforcement (ADR-0004).

Stage 2 — Ingestion:

  • /plan-ingestion — interrogates a planned (or undocumented) ingestion path into an Ingestion Plan document under docs/ingestion-plans/; enforces the Ingestion kill-switch (an accepted plan must name at least one upstream Source Profile in related_sources_of_record — no profiled upstream, no plan) and bundles its own contract (references/ingestion-plan.md) and linter (scripts/lint-ingestion-plan.sh) inside the skill folder for schema enforcement (ADR-0004).

Stage 3 — Storage:

  • /plan-storage — interrogates a planned (or undocumented) storage destination into a Storage Plan document under docs/storage-plans/; enforces the Storage kill-switch (an accepted plan must name at least one upstream Ingestion Plan in related_ingestion_plans — no planned upstream, no plan) and bundles its own contract (references/storage-plan.md) and linter (scripts/lint-storage-plan.sh) inside the skill folder for schema enforcement (ADR-0004). Warehouse, lakehouse, object-store, operational-store, search-index, and cache are equally first-class; the interview branches on the user's answer rather than defaulting to any.

Stage 4 — Transformation:

  • /model-data — interrogates a planned (or undocumented) transformation into a Model Plan document under docs/model-plans/; enforces the Transformation kill-switch (an accepted plan must name at least one upstream Storage Plan in related_storage_plans — a model that does not chain back to governed storage is ungoverned business logic, not a real table) and bundles its own contract (references/model-plan.md) and linter (scripts/lint-model-plan.sh) inside the skill folder for schema enforcement (ADR-0004). Grain is asked first and must resolve to a concrete cardinality; assertions forces an honest business-invariant/shape-only/none posture; staging/intermediate/mart/metric/feature layers and streaming/batch materialisation are equally first-class.

Stage 5 — Serving:

  • /serve-data — interrogates a planned (or undocumented) serving endpoint into a Serving Surface document under docs/serving-surfaces/; enforces the dual Serving kill-switch — the differentiator — where an accepted surface must name both an upstream Model Plan in related_model_plans (the chain half — no governed model means the surface recomputes business logic from raw/staging) and a DGQ it informs in serves_dgqs (the loop-closure half — a surface that informs no decision is the dashboard-nobody-uses, retired exactly as the ask-nobody-acts-on is killed at Stage 0), and bundles its own contract (references/serving-surface.md) and linter (scripts/lint-serving-surface.sh) inside the skill folder for schema enforcement (ADR-0004). access_model forces an honest convention-vs-role-based posture; dashboard/api/reverse-etl/feature-store/export/embedded surfaces and streaming/batch delivery are equally first-class.

This is the stage that closes the lineage loop: DGQ → Source Profile → Ingestion Plan → Storage Plan → Model Plan → Serving Surface → DGQ. With it, the Reis-5-plus-Stage-0 spine ships end to end.
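The per-stage kill-switches described above share one shape: an accepted artefact must carry at least one entry in a named link field. A minimal sketch of that rule follows. The link field names come from the stage descriptions; the dictionary representation, the `kind` labels, and the `status` key are assumptions made for illustration, and the bundled shell linters may work quite differently.

```python
# Link fields each artefact kind must populate, as named in the stage
# descriptions. The kind labels themselves are assumed for this sketch.
REQUIRED_LINKS = {
    "source-profile": ["consumed_by"],
    "ingestion-plan": ["related_sources_of_record"],
    "storage-plan": ["related_ingestion_plans"],
    "model-plan": ["related_storage_plans"],
    "serving-surface": ["related_model_plans", "serves_dgqs"],
}


def kill_switch_violations(artefact: dict) -> list[str]:
    """Return the required link fields that are missing or empty.

    Only artefacts whose (assumed) status is 'accepted' are checked;
    anything else is treated as not yet subject to the kill-switch.
    """
    if artefact.get("status") != "accepted":
        return []
    required = REQUIRED_LINKS.get(artefact.get("kind"), [])
    return [field for field in required if not artefact.get(field)]
```

For example, an accepted Serving Surface that names a Model Plan but no DGQ would report `serves_dgqs` as the failing half of the dual kill-switch.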

Cross-cutting skills (meta) — outside the numbered spine, per ADR-0002:

  • /triage-lineage — reads the consumer's repo-internal artefact graph (DGQs, Source Profiles, Ingestion Plans, Storage Plans, Model Plans, Serving Surfaces, ADRs under docs/) and prints a prioritised, two-tier report — Tier 0 dangling references first (these pass every linter today), then real gaps ranked by distance to a live decision — routing each finding into the stage skill that closes it. It is an advisor (meta) skill: it operates over the artefact graph, authors no durable artefact of its own, and has no linter. It propagates the action-change kill-switch across the graph — a gap that no live decision pulls on is routed to retire / record rejected, never to build — so it cannot regress into a flat "fix everything" coverage list. Repo-internal self-audit only (no dbt-manifest, live-warehouse, or orchestrator scans).
  • /setup-relentless-data-skills — a one-time setup ritual (not an advisor): scaffolds a consumer project to use the lifecycle skills. Writes a thin ## Data skills index block into the project's CLAUDE.md/AGENTS.md and creates only the docs/<stage-area>/ directories for the stages the project will actually populate — scaffold only what will hold content. It teaches each stage's kill-switch in the block it writes, is idempotent, never clobbers, leaves any ## Agent skills block untouched, and is disable-model-invocation (run it explicitly). Modelled on the setup-matt-pocock-skills scaffold pattern. It's a second kind of non-spine meta/ skill (a setup / scaffold skill) alongside the advisor skills — see skills/meta/README.md for why that broadening is recorded in the living docs rather than its own ADR.
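To make the Tier 0 idea concrete, here is a hedged sketch of dangling-reference detection over an in-memory artefact graph. The link field names are taken from the stage descriptions; the graph shape (artefact id mapped to a dictionary of fields) and the ids in the usage note are hypothetical, and the skill's real traversal of docs/ is not shown in this README. One plausible reason such references pass the per-file linters, assumed here, is that a linter looking at a single file can confirm a field is filled in but not that its target exists.

```python
LINK_FIELDS = (
    "consumed_by",
    "related_sources_of_record",
    "related_ingestion_plans",
    "related_storage_plans",
    "related_model_plans",
    "serves_dgqs",
)


def dangling_references(graph: dict[str, dict]) -> list[tuple[str, str, str]]:
    """Return (artefact_id, field, missing_target) triples, sorted for stable output.

    A reference is dangling when it names an id that is not a key in the graph.
    """
    findings = []
    for artefact_id, artefact in graph.items():
        for field in LINK_FIELDS:
            for target in artefact.get(field, []):
                if target not in graph:
                    findings.append((artefact_id, field, target))
    return sorted(findings)
```

Given a graph in which an Ingestion Plan `ip-001` lists `sp-001` and `sp-999` but only `sp-001` exists, the function reports the single triple for `sp-999`. Ranking the remaining gaps by distance to a live decision would be a separate pass and is not attempted here.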

Stage status

| Stage | Name | v1.5 status |
| --- | --- | --- |
| 0 | Requirements & Discovery | ✅ shipped |
| 1 | Generation | ✅ shipped |
| 2 | Ingestion | ✅ shipped |
| 3 | Storage | ✅ shipped |
| 4 | Transformation | ✅ shipped |
| 5 | Serving | ✅ shipped |

The spine is complete — every stage from the founding question to the consumer-facing surface now ships a skill, an artefact, and a linter. The lifecycle spine follows the Reis 5-stage taxonomy (Generation → Ingestion → Storage → Transformation → Serving), with Stage 0 prepended for upstream-of-code clarity. Streaming and batch are equally first-class; the repo refuses to bias toward either.

Repo layout

relentless-data-skills/
├─ README.md
├─ CONTEXT.md                       repo vocabulary, principles, conventions
├─ CLAUDE.md                        Claude-Code-specific pointer + guardrails
├─ LICENSE                          MIT
├─ .claude-plugin/plugin.json       plugin manifest (skills declared here)
├─ docs/
│   ├─ data-engineering-101.md      distilled lifecycle reference
│   ├─ adr/                         Architecture Decision Records (written by /write-adr)
│   ├─ agents/                      agent-skill docs (issue tracker, triage labels, domain)
│   ├─ evals.md                     skill eval-suite spec + run instructions
│   └─ sources/                     source provenance (Redpanda capture + URL index)
└─ skills/
    ├─ 00-requirements/             /gather-requirements, /write-adr
    ├─ 01-generation/               /profile-source
    ├─ 02-ingestion/                /plan-ingestion
    ├─ 03-storage/                  /plan-storage
    ├─ 04-transformation/           /model-data
    ├─ 05-serving/                  /serve-data
    └─ meta/                        /triage-lineage, /setup-relentless-data-skills  (cross-cutting; no lifecycle position — ADR-0002)

Stage folders use zero-padded two-digit numeric prefixes so order is encoded in the listing and new stages can be inserted without renumbering. Skill folder names are verb-led and kebab-case, matching their command form (/gather-requirements, not /requirements-gathering). Cross-cutting skills with no lifecycle position — advisor skills (ADR-0002) and setup / scaffold skills — live in the non-numbered skills/meta/ folder. Each skill folder additionally carries its own references/<artefact>.md (the contract) and scripts/lint-<artefact>.sh (the linter), so it installs self-contained (ADR-0004).
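The folder naming rules above are mechanical enough to check with two patterns. This is a sketch, not a script the repo ships: the function names are hypothetical, and the "verb-led" requirement is a judgement call that a regular expression cannot verify, so only the zero-padded prefix and kebab-case shape are tested.

```python
import re

# Two-digit prefix, then a kebab-case name, e.g. "00-requirements".
STAGE_DIR = re.compile(r"\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*")
# Lowercase words joined by single hyphens, e.g. "gather-requirements".
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_stage_folder(name: str) -> bool:
    """Accept numbered stage folders and the non-numbered 'meta' folder."""
    return name == "meta" or STAGE_DIR.fullmatch(name) is not None


def is_skill_folder(name: str) -> bool:
    """Accept kebab-case skill folder names; verb-led is not checked."""
    return KEBAB.fullmatch(name) is not None
```

Under these patterns `00-requirements` and `meta` pass as stage folders while `5-serving` does not, and `gather-requirements` passes as a skill folder while `Gather_Requirements` does not.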

Read next

  • CONTEXT.md — repo vocabulary, principles, conventions. The substance lives here; the README is the front door.
  • docs/data-engineering-101.md — distilled, opinionated lifecycle reference. Stage definitions, failure modes, undercurrents, glossary.
  • docs/sources/ — source provenance for the lifecycle reference (Redpanda capture, annotated URL index, refresh procedure).

Acknowledgements

Structural inspiration: mattpocock/skills. This is a clean-room implementation with an independent lifecycle taxonomy and domain-specific content, but the folder shape and the "skills as a Claude Code plugin" mechanic come from there. Thanks, Matt.

Licence

MIT — see LICENSE.
