Claude Skill

Skill Creator

Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.

Official sourceInstallable2 sections3 related pages

Editor's Note

Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or... Covers communicating with the user, creating a skill.

Editorial Guide

What to do with this skill

Start with the workflow below, then drop into the upstream source only after the page has narrowed the job for you.

What this skill does

Create new skills, modify and improve existing skills, and measure skill performance.

When to use it

users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.

Install and setup notes

  • Install it with the command shown below, then read the upstream source file before making it part of a shared Claude Code workflow.
  • Keep the context narrow. These skills are usually strongest when you load only the branch, reference set, or workflow step that matches the current task.
  • If you plan to standardize on this skill for team use, pin the upstream repo and check for updates periodically instead of assuming the official defaults are static.

Example workflow

  1. Start with a concrete task that clearly matches this skill's intended trigger: users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
  2. Read the overview and first source section, then choose the smallest branch of guidance or references that solves the task in front of you.
  3. Run the change on a real file, command, or workflow, verify the result, and only then widen the skill into a repeatable team pattern.

Compatible agents

This skill is explicitly marked for Claude Code.

Claude Code

Install

npx skills add https://github.com/anthropics/skills --skill skill-creator

Page Outline

Communicating with the userCreating a skill

Source Content

Normalized top-level metadata comes from the directory layer. The body below is the upstream source content for this item.

Skill Creator

A skill for creating new skills and iteratively improving them.

At a high level, the process of creating a skill goes like this:

  • Decide what you want the skill to do and roughly how it should do it
  • Write a draft of the skill
  • Create a few test prompts and run claude-with-access-to-the-skill on them
  • Help the user evaluate the results both qualitatively and quantitatively
  • While the runs happen in the background, draft some quantitative evals if there aren't any (if there are some, you can either use as is or modify if you feel something needs to change about them). Then explain them to the user (or if they already existed, explain the ones that already exist)
  • Use the `eval-viewer/generate_review.py` script to show the user the results for them to look at, and also let them look at the quantitative metrics
  • Rewrite the skill based on feedback from the user's evaluation of the results (and also if there are any glaring flaws that become apparent from the quantitative benchmarks)
  • Repeat until you're satisfied
  • Expand the test set and try again at larger scale

Your job when using this skill is to figure out where the user is in this process and then jump in and help them progress through these stages. So for instance, maybe they're like "I want to make a skill for X". You can help narrow down what they mean, write a draft, write the test cases, figure out how they want to evaluate, run all the prompts, and repeat.

On the other hand, maybe they already have a draft of the skill. In this case you can go straight to the eval/iterate part of the loop.

Of course, you should always be flexible and if the user is like "I don't need to run a bunch of evaluations, just vibe with me", you can do that instead.

Then after the skill is done (but again, the order is flexible), you can also run the skill description improver, which we have a whole separate script for, to optimize the triggering of the skill.

Cool? Cool.

Communicating with the user

The skill creator is liable to be used by people across a wide range of familiarity with coding jargon. If you haven't heard (and how could you, it's only very recently that it started), there's a trend now where the power of Claude is inspiring plumbers to open up their terminals, parents and grandparents to google "how to install npm". On the other hand, the bulk of users are probably fairly computer-literate.

So please pay attention to context cues to understand how to phrase your communication! In the default case, just to give you some idea:

  • "evaluation" and "benchmark" are borderline, but OK
  • for "JSON" and "assertion" you want to see serious cues from the user that they know what those things are before using them without explaining them

It's OK to briefly explain terms if you're in doubt, and feel free to clarify terms with a short definition if you're unsure if the user will get it.

---

Creating a skill

Capture Intent

Start by understanding the user's intent. The current conversation might already contain a workflow the user wants to capture (e.g., they say "turn this into a skill"). If so, extract answers from the conversation history first — the tools used, the sequence of steps, corrections the user made, input/output formats observed. The user may need to fill the gaps, and should confirm before proceeding to the next step.

  • What should this skill enable Claude to do?
  • When should this skill trigger? (what user phrases/contexts)
  • What's the expected output format?
  • Should we set up test cases to verify the skill works? Skills with objectively verifiable outputs (f

<!-- truncated -->

Recommended skills

Next places to browse

Sponsored
MoltAwards: Turn AI agents loose on government contracts & jobs! logo

Turn AI agents loose on government contracts

Learn more