Claude Skill

Skill Evaluator

Audit all skills in the current project for frontmatter completeness, effort level appropriateness, allowed-tools scoping, and content quality. Produces a scored report with effort-level recommendations for each skill. Use when onboarding to a new project, reviewing skill quality before shipping, or adding effort fields to an existing skill library.

Editorial Guide

What to do with this skill

Start with the workflow below, then drop into the upstream source only after the page has narrowed the job for you.

What this skill does

Audit all skills in the current project for frontmatter completeness, effort level appropriateness, allowed-tools scoping, and content quality. Produces a scored report with effort-level recommendations for each skill.

When to use it

Use it when onboarding to a new project, reviewing skill quality before shipping, or adding effort fields to an existing skill library.

Install and setup notes

  • Open the upstream source before treating this page as install-ready, because not every official record is meant to be dropped into a workflow unchanged.
  • Keep the context narrow. These skills are usually strongest when you load only the branch, reference set, or workflow step that matches the current task.
  • If you plan to standardize on this skill for team use, pin the upstream repo and check for updates periodically instead of assuming the official defaults are static.

Example workflow

  1. Start with a concrete task that clearly matches this skill's intended trigger: onboarding to a new project, reviewing skill quality before shipping, or adding effort fields to an existing skill library.
  2. Read the overview and first source section, then choose the smallest branch of guidance or references that solves the task in front of you.
  3. Run the change on a real file, command, or workflow, verify the result, and only then widen the skill into a repeatable team pattern.

Compatible agents

This skill is explicitly marked for Claude Code.

Install source

This page does not expose a single copy-paste install command in the normalized record. Use the upstream install source below to confirm the exact steps, file paths, and current setup expectations before you add it to your stack.

Page Outline

  • When to Use
  • What Gets Audited
  • Scoring Criteria (14 pts per skill)
  • Effort Level Inference Engine
  • Execution Instructions

Source Content

Normalized top-level metadata comes from the directory layer. The body below is the upstream source content for this item.

Skill Evaluator

Discover all skills in the project, score them across 6 criteria, and infer the appropriate `effort` level based on content analysis.

When to Use

  • New project: run once to establish baseline quality
  • Before committing a skill to a team repo
  • After bulk-importing skills from another project
  • When adding `effort` fields for the first time (v2.1.80+)

What Gets Audited

All `SKILL.md` files and flat `.md` files found in:

  • `.claude/skills/**`
  • `~/.claude/skills/**` (if requested)
  • Any path passed as argument: `/eval-skills ./my-skills-dir`
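The discovery scope above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the function name `discover_skill_files` and its `extra_paths` and `include_user` parameters are hypothetical, and it simply treats every `.md` file under each root as a candidate skill.

```python
from pathlib import Path


def discover_skill_files(project_root, extra_paths=(), include_user=False):
    """Return sorted, de-duplicated Markdown files under the skill roots.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    roots = [Path(project_root) / ".claude" / "skills"]
    if include_user:
        # User-level skills are scanned only when explicitly requested.
        roots.append(Path.home() / ".claude" / "skills")
    # Extra directories correspond to a path passed as an argument.
    roots.extend(Path(p) for p in extra_paths)

    found = set()
    for root in roots:
        if root.is_dir():
            # rglob("*.md") matches SKILL.md and flat .md files at any depth.
            found.update(p.resolve() for p in root.rglob("*.md"))
    return sorted(found)
```

Calling `discover_skill_files(".", extra_paths=["./my-skills-dir"])` would mirror the `/eval-skills ./my-skills-dir` invocation; missing directories are skipped rather than raising.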

---

Scoring Criteria (14 pts per skill)

| # | Criterion | Max | What is checked |
|---|-----------|-----|-----------------|
| 1 | **name** | 1 | Present, lowercase, hyphens only, matches directory name |
| 2 | **description** | 2 | Present + has "Use when" / "when to" / trigger phrasing |
| 3 | **allowed-tools** | 2 | Present + not overly broad (Bash without scoping when read-only) |
| 4 | **effort** | 3 | Present (1pt) + appropriate for content (2pt based on inference) |
| 5 | **content structure** | 4 | Has Purpose/When section (1), has examples/usage (1), has clear workflow (1), no placeholder text (1) |
| 6 | **bonus** | +2 | argument-hint present (1), version/author metadata (1) |

> **Note**: `tags` is NOT an officially supported frontmatter field in Claude Code. It is ignored by the runtime. Do not include it or score it as a quality criterion.
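As an illustration of how the first two criteria might be checked, here is a hedged Python sketch. Everything in it is an assumption rather than the skill's real logic: the helper names are hypothetical, the frontmatter parser handles only flat `key: value` lines (a real audit would likely use a YAML parser), digits are allowed in names, and the description score is split as 1 point for presence plus 1 point for trigger phrasing.

```python
import re

# Assumed trigger phrasing, taken from the "What is checked" column.
TRIGGER_RE = re.compile(r"use when|when to", re.IGNORECASE)
# Assumed name shape: lowercase words joined by single hyphens.
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def parse_frontmatter(text):
    """Parse flat `key: value` frontmatter between leading `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def score_name(fields, directory_name):
    """Criterion 1 (max 1): present, well-formed, matches the directory."""
    name = fields.get("name", "")
    ok = bool(NAME_RE.fullmatch(name)) and name == directory_name
    return 1 if ok else 0


def score_description(fields):
    """Criterion 2 (max 2): 1 pt if present, 1 more for trigger phrasing."""
    description = fields.get("description", "")
    if not description:
        return 0
    return 2 if TRIGGER_RE.search(description) else 1
```

A skill whose description reads "Audit skills. Use when onboarding." would earn both description points under this sketch, while one that only says "Audit skills." would earn one.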

**Thresholds:**

  • ✅ Good: ≥11/14 (about 79% or higher)
  • ⚠️ Needs work: 8–10/14 (about 57–71%)
  • ❌ Fix: <8/14 (below about 57%)
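The point cutoffs above map onto a small classifier. The sketch below is illustrative only: the function name `rate_skill` and the returned label strings are hypothetical, and it works from the point thresholds rather than the percentages.

```python
def rate_skill(score):
    """Map a 0-14 score to one of the three rating bands.

    Hypothetical helper; the label strings are illustrative.
    """
    if not 0 <= score <= 14:
        raise ValueError("score must be between 0 and 14")
    if score >= 11:
        return "good"
    if score >= 8:
        return "needs work"
    return "fix"
```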

---

Effort Level Inference Engine

For each skill, analyze description + content and classify using these signals:

`low` — Mechanical execution, no design decisions

Signals:

  • Verbs: commit, push, sync, scaffold, generate (template-based), format, rename, bump, wrap, convert
  • No reasoning required: sequential steps, template instantiation, data fetching
  • allowed-tools: Bash only, or Read-only
  • No sub-agents spawned
  • Short workflow (<5 steps)

Examples: `/commit`, `/release-notes`, `/scaffold`, `/sync`, `/format`

`medium` — Analysis with bounded scope, categorization

Signals:

  • Verbs: review, triage, analyze, categorize, suggest, evaluate (single file or bounded scope)
  • Requires pattern recognition but not architectural reasoning
  • allowed-tools: Read + Grep + Bash combination
  • May spawn 1-2 sub-agents but with predefined scope
  • Produces structured output (tables, categorized lists)

Examples: `/code-review` (single PR), `/issue-triage`, `/dependency-audit`, `/test-coverage`

`high` — Design decisions, adversarial reasoning, cross-system analysis

Signals:

  • Verbs: architect, redesign, threat-model, audit (security), orchestrate (multi-agent), score, assess trade-offs
  • Requires reasoning about edge cases, attack vectors, or system-wide implications
  • allowed-tools: broad access (Read + Write + Bash + external tools)
  • Spawns multiple sub-agents or uses parallel execution
  • Produces analysis with explicit uncertainty or trade-off sections
  • Keywords in content: "security", "architecture", "adversarial", "pipeline", "threat", "design decision"

Examples: `/security-audit`, `/architecture-review`, `/cyber-defense`, `/eval-agents`
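One way to approximate the verb and keyword signals above is a simple word count per level. The following Python sketch is a deliberately rough assumption, not the inference engine itself: it ignores the parenthetical qualifiers on verbs (for example, it counts any "audit" as a `high` signal), skips multi-word phrases, does not look at `allowed-tools` or sub-agent counts, and breaks ties toward the higher level. The name `infer_effort` is hypothetical.

```python
import re

# Signal verbs copied from the lists above; qualifiers are ignored here.
SIGNAL_VERBS = {
    "low": {"commit", "push", "sync", "scaffold", "generate", "format",
            "rename", "bump", "wrap", "convert"},
    "medium": {"review", "triage", "analyze", "categorize", "suggest",
               "evaluate"},
    "high": {"architect", "redesign", "threat-model", "audit",
             "orchestrate", "score"},
}
# Single-word content keywords listed for the `high` level.
HIGH_KEYWORDS = {"security", "architecture", "adversarial", "pipeline", "threat"}
LEVELS = ("low", "medium", "high")


def infer_effort(text):
    """Return "low", "medium", or "high", or None when no signal matches."""
    # Keep hyphenated words such as "threat-model" as a single token.
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = {level: 0 for level in LEVELS}
    for word in words:
        for level in LEVELS:
            if word in SIGNAL_VERBS[level]:
                counts[level] += 1
        if word in HIGH_KEYWORDS:
            counts["high"] += 1
    if not any(counts.values()):
        return None
    # Ties resolve toward the higher level (an assumption of this sketch).
    return max(LEVELS, key=lambda level: (counts[level], LEVELS.index(level)))
```

A production version would presumably weigh the tool-scope and sub-agent signals as well; counting words alone is only a first approximation.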

Mismatch flag

If a skill has `effort:` already set but the inferred level differs, flag it:

> ⚠️ Effort mismatch: declared `low`, inferred `high` (skill spawns 4 sub-agents and performs security analysis)
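The mismatch check itself is a plain comparison. In this hedged Python sketch the function name `mismatch_flag` and the `reason` argument are hypothetical, and the exact wording of the message is illustrative rather than a fixed format.

```python
def mismatch_flag(declared, inferred, reason):
    """Return a warning line when the declared effort differs, else None.

    Hypothetical helper; `declared` is None when no effort field is set.
    """
    if declared is None or declared == inferred:
        return None
    return (
        f"⚠️ Effort mismatch: declared `{declared}`, "
        f"inferred `{inferred}` ({reason})"
    )
```

Skills with no `effort` field return `None` here because a missing field is already penalized under the effort criterion rather than flagged as a mismatch.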

---

Execution Instructions

Step 1 — Discovery

# Find all SKILL.md files
find .claude/skills -name "SKIL

<!-- truncated -->
