Question 1

What is Auto Skill Improver for Claude Code?

Accepted Answer

Auto Skill Improver for Claude Code is an open-source tool that applies benchmark-driven iteration to your CLAUDE.md files and project instructions. It classifies your skill type, builds a test suite, establishes a baseline score, then mutates one instruction at a time, keeping only the changes that raise the benchmark score.

Question 2

How does it improve CLAUDE.md files specifically?

Accepted Answer

It treats your CLAUDE.md as a testable artefact. The tool generates scenarios that exercise your instructions, measures Claude Code's output quality against defined criteria, then makes targeted changes, one at a time, to find which instruction tweaks produce measurably better results.

Question 3

What types of Claude Code skills can it improve?

Accepted Answer

Any skill type, including coding assistants, code reviewers, test generators, documentation writers, and refactoring agents. The tool classifies the skill type automatically from your CLAUDE.md and builds appropriate benchmarks for that category.

Question 4

How do benchmarks work for Claude Code instructions?

Accepted Answer

A benchmark is a structured test suite with defined inputs and pass/fail criteria. For a coding skill, this might be 'generate a function that passes these unit tests'. For a reviewer skill, it might be 'identify the three bugs in this code'. The benchmark runs the same tests before and after each CLAUDE.md change.
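As a rough illustration of "defined inputs and pass/fail criteria", a benchmark can be modelled as a list of cases, each pairing a prompt with a check on the output. This is a minimal sketch, not the tool's actual data model: `BenchmarkCase`, `score`, and `fake_skill` are hypothetical names, and `fake_skill` is a stub standing in for Claude Code running under a given CLAUDE.md.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class BenchmarkCase:
    name: str
    prompt: str                    # input handed to the skill
    passes: Callable[[str], bool]  # pass/fail criterion applied to the output


def score(run_skill: Callable[[str], str], cases: List[BenchmarkCase]) -> float:
    """Return the fraction of cases whose output satisfies its criterion."""
    if not cases:
        raise ValueError("benchmark needs at least one case")
    passed = sum(1 for case in cases if case.passes(run_skill(case.prompt)))
    return passed / len(cases)


def fake_skill(prompt: str) -> str:
    """Hypothetical stub: a real run would call the model instead."""
    if "add" in prompt:
        return "def add(a, b):\n    return a + b"
    return ""


CASES = [
    BenchmarkCase("defines add", "Write add(a, b)", lambda out: "def add" in out),
    BenchmarkCase("defines sub", "Write sub(a, b)", lambda out: "def sub" in out),
]

print(score(fake_skill, CASES))
```

Because the cases are fixed, the same `score` call can be repeated before and after each CLAUDE.md change, which is what makes the two numbers comparable.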

Question 5

Do I need to write benchmarks manually?

Accepted Answer

No. Auto Skill Improver generates benchmarks automatically based on your skill type and CLAUDE.md contents. It analyses what your instructions are trying to achieve and builds appropriate test scenarios. You can also provide custom benchmarks if you have specific requirements.

Question 6

What does 'benchmark saturation' mean for Claude Code?

Accepted Answer

Benchmark saturation occurs when successive CLAUDE.md mutations stop producing score improvements. Your instructions have reached the ceiling of what the current benchmark can measure. You can either accept the current performance or create a harder benchmark that tests more advanced scenarios.
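One simple way to express saturation is a patience rule: stop once the best score has not moved over the last few mutation attempts. The sketch below assumes that rule; the function name, the `patience` and `min_gain` parameters, and their defaults are illustrative and are not taken from the tool.

```python
from typing import Sequence


def is_saturated(best_scores: Sequence[float],
                 patience: int = 3,
                 min_gain: float = 0.0) -> bool:
    """Report whether the last `patience` attempts failed to raise the best score.

    best_scores[0] is the baseline; best_scores[i] is the best score seen
    after i mutation attempts, so the sequence never decreases.
    """
    if len(best_scores) <= patience:
        return False  # not enough attempts yet to judge
    gain = best_scores[-1] - best_scores[-1 - patience]
    return gain <= min_gain


print(is_saturated([0.5, 0.6, 0.7, 0.7, 0.7, 0.7]))
print(is_saturated([0.5, 0.6, 0.7, 0.8]))
```

A larger `patience` tolerates longer plateaus before stopping, at the cost of more benchmark runs.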

Question 7

How is this different from manually editing my CLAUDE.md?

Accepted Answer

Manual editing is editorial: you rewrite instructions, they sound clearer, you keep them. But 'sounds clearer' isn't evidence. Auto Skill Improver is empirical: it establishes a baseline, changes one instruction at a time, re-runs the same benchmark, and keeps only what scores higher. Because every change is scored, it catches edits that read better but leave the score flat or lower, which manual editing cannot reveal.
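The empirical loop described here is essentially greedy hill climbing. The following is a minimal sketch under stated assumptions: instructions are a list of strings, each mutation is a function that returns an edited copy, and `toy_evaluate` is a hypothetical stand-in for running the real benchmark. None of these names come from the tool itself.

```python
from typing import Callable, List, Sequence, Tuple

Instructions = List[str]
Mutation = Callable[[Instructions], Instructions]


def improve(instructions: Instructions,
            mutations: Sequence[Mutation],
            evaluate: Callable[[Instructions], float]
            ) -> Tuple[Instructions, List[float]]:
    """Apply one mutation at a time and keep it only if the score rises."""
    best = list(instructions)
    best_score = evaluate(best)           # baseline
    history = [best_score]
    for mutate in mutations:
        candidate = mutate(list(best))    # mutate a copy, never the current best
        candidate_score = evaluate(candidate)
        if candidate_score > best_score:  # strictly better, otherwise discard
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history


def toy_evaluate(instructions: Instructions) -> float:
    """Hypothetical score: fraction of required phrases present in the text."""
    text = "\n".join(instructions)
    required = ("pytest", "type hints")
    return sum(phrase in text for phrase in required) / len(required)


start = ["Write tests.", "Use type hints."]
mutations = [
    lambda ins: ["Write tests with pytest."] + ins[1:],  # more specific wording
    lambda ins: ins[:1],                                  # drops an instruction
]

best, history = improve(start, mutations, toy_evaluate)
print(best)
print(history)
```

In this toy run the first mutation is kept and the second is rejected, because dropping an instruction lowers the score even though the shorter file might look tidier.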

Question 8

Is it free and open source?

Accepted Answer

Yes. Auto Skill Improver is fully open source and free to use. The source code is available on GitHub at github.com/mlobo2012/auto-skill-improver. There are no usage limits, no API keys required for the tool itself, and no premium tiers.

Auto Skill Improver for Claude Code — Benchmark-Driven Skill Optimisation

How It Works with Claude Code

Get the Guide File

Step-by-Step: Set Up Auto Skill Improver in Claude Code

Download the quickstart file above

Open Claude Code

Upload the quickstart file

Point it at your CLAUDE.md or any instruction file

Review the baseline score

Let it run mutations

Stop when the benchmark saturates

Why Most Claude Code Skill Iteration Fails

Instruction Vibes

Instruction Science

What It Finds in Claude Code Skills

Ambiguous Output Contracts

Missing Fallback Behaviour

Conflicting Instruction Layers

Dependency & Portability Problems

Weak Evidence Discipline

Structural Formatting Issues

The Karpathy-Inspired Method

Classify the Skill Type

Build a Real Benchmark

Establish a Baseline

Mutate One Thing at a Time

Keep Only What Improves

Stop When the Benchmark Saturates

When to Use It — and When Not To

Best For

Not the Right Fit

Frequently Asked Questions
