How to score your CLAUDE.md from the terminal (one command, no install)
You can score your CLAUDE.md in a browser with the web analyzer. But if you live in the terminal (which, if you use Claude Code, you probably do), opening a browser to paste a config file feels like going backwards.
ContextKit now has a CLI. One command, zero install, instant results.
The command
npx contextkit score
That is it. Run it in any project directory and it auto-detects your config file: CLAUDE.md, .cursorrules, AGENTS.md, GEMINI.md, or any of their common locations like .claude/CLAUDE.md.
The output looks like this:
─────────────────────────────────────────
CLAUDE.md Score 7/10 (Good)
./CLAUDE.md
─────────────────────────────────────────
Solid config with room for improvement.
Categories
Structure ████████████████ 2/2
Architecture ████████████████ 2/2
Conventions ████████████ 1.5/2
Testing ████████████ 1.5/2
Guardrails ████████ 1/2
Improvements
! Add security rules (XSS, injection, etc.)
! Add scope constraints like "Keep changes minimal"
✓ Good structure with clear sections
✓ Tech stack and file structure documented
✓ Testing framework and commands documented
─────────────────────────────────────────
Five categories, each scored out of 2. The total score out of 10 tells you at a glance how much guidance your AI assistant actually has. Below 5, Claude is mostly guessing. Above 7, it has enough context to be genuinely useful.
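As a rough illustration of what auto-detection involves, the lookup can be pictured as trying a list of candidate paths in order and stopping at the first one that exists. The function name `find_config` and the order shown are assumptions for this sketch, not ContextKit's actual implementation:

```shell
# Hypothetical lookup order; the real CLI may check other paths or use a different order.
find_config() {
  local dir="${1:-.}" candidate
  for candidate in CLAUDE.md .claude/CLAUDE.md .cursorrules AGENTS.md GEMINI.md; do
    if [ -f "$dir/$candidate" ]; then
      # Print the first match and stop looking.
      printf '%s\n' "$dir/$candidate"
      return 0
    fi
  done
  return 1   # no known config file in this directory
}
```

Calling `find_config .` in a project root prints the path of the first file it finds, which is handy for checking what a default run is likely to pick up.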
Scoring a specific file
If your config file is not in the default location, pass the path directly:
npx contextkit score path/to/CLAUDE.md
npx contextkit score .cursorrules
npx contextkit score configs/team-rules.md
Or pipe content from stdin:
cat CLAUDE.md | npx contextkit score --stdin
The stdin mode is useful if you want to score a file that is not on disk yet, or if you are generating configs dynamically and want to check the output quality.
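For the "generating configs dynamically" case, a minimal sketch might look like the following. The `render_config` function and its template are hypothetical, made up for illustration. Only the `--stdin` flag comes from the CLI itself:

```shell
# render_config NAME TEST_CMD: print a small, hypothetical config to stdout.
render_config() {
  local name="$1" test_cmd="$2"
  cat <<EOF
# $name

## Testing
Run tests with: $test_cmd

## Rules
- Never commit secrets.
- Keep changes minimal.
EOF
}

# Usage (npx needs network access on first run):
# render_config my-app "npm test" | npx contextkit score --stdin
```

Nothing is written to disk here, so you can check a template's output before deciding whether to commit it.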
What the five categories measure
The scoring engine checks five areas that directly affect how well your AI assistant understands your project:
Structure (0-2). Does the file have markdown headings? Are there enough sections? Is it long enough to be useful? A flat file with no headings means the AI has to guess which section applies to what.
Architecture (0-2). Does the file mention the tech stack? Is there a file structure section? Does it describe what the project is? Without this, Claude writes Python when your project is TypeScript.
Conventions (0-2). Are there coding rules? Do-nots? Naming conventions? Import patterns? This is where you tell Claude "never add docstrings to obvious functions" and "use snake_case for Python variables." Without explicit rules, Claude defaults to its own style.
Testing (0-2). Does the file mention a test framework? Is there a run command? Test strategy? File locations? If Claude does not know you use Vitest, it will write Jest tests. If it does not know the test command, it cannot verify its own work.
Guardrails (0-2). Are there security rules? Scope constraints? A "read existing code first" policy? These prevent Claude from making destructive changes, adding unnecessary features, or ignoring your existing patterns.
Using it in CI
The CLI returns exit code 0 for scores of 5 or higher, and exit code 1 for anything below. This means you can add it to your CI pipeline to enforce config quality:
# .github/workflows/lint-config.yml
name: Lint AI Config
on: [pull_request]
jobs:
  score:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx contextkit score
If someone removes the testing section from your CLAUDE.md or strips out the guardrails, CI fails. This is especially useful for teams where multiple people edit the config file and you want to maintain a minimum quality bar.
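The same exit-code behavior can be used outside CI, for example in a pre-commit hook. The sketch below is one possible wrapper, not part of ContextKit. The `score_gate` function and the `SCORE_CMD` variable are names invented here. It relies only on the documented behavior that the command exits non-zero for scores below 5:

```shell
# SCORE_CMD is the command run for each file; override it to dry-run without network access.
SCORE_CMD="${SCORE_CMD:-npx contextkit score}"

# score_gate FILE...: score each file and return non-zero if any of them fails.
score_gate() {
  local file failed=0
  for file in "$@"; do
    # Word-splitting of $SCORE_CMD is intentional so it can hold a command plus arguments.
    if $SCORE_CMD "$file"; then
      echo "ok:   $file"
    else
      echo "fail: $file"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
# score_gate CLAUDE.md || echo "config quality below the bar, commit blocked"
```

Because the function loops over its arguments, it can take several paths in one call and still report a single pass or fail result.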
You can also score multiple files in a monorepo:
npx contextkit score CLAUDE.md
npx contextkit score packages/frontend/.cursorrules
npx contextkit score packages/api/CLAUDE.md
How the scoring works
The scoring engine uses the same heuristics as the web analyzer. It scans the content for specific signals: are there markdown headings? Does the text mention a framework name? Does it contain "do not" or "never" rules? Does it reference a test runner?
It is not perfect. A config file that says "do not use TypeScript" would score points for both the tech stack mention and the do-not rule, even though it names TypeScript only to rule it out. But for the vast majority of real config files, the heuristics catch the difference between a thoughtful config and a placeholder.
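To make the keyword-matching idea and its blind spot concrete, here is a toy version of signal detection built on `grep`. The patterns and the `detect_signals` name are illustrative assumptions, not the actual rules ContextKit uses:

```shell
# Hypothetical signal checks; the patterns are examples, not ContextKit's real heuristics.
# Reads config text on stdin and prints one line per signal that matched.
detect_signals() {
  local text
  text=$(cat)
  if printf '%s\n' "$text" | grep -qE '^#{1,6} '; then echo "headings"; fi
  if printf '%s\n' "$text" | grep -qiwE 'typescript|python|react'; then echo "tech-stack"; fi
  if printf '%s\n' "$text" | grep -qiwE 'do not|never'; then echo "do-not-rule"; fi
  if printf '%s\n' "$text" | grep -qiwE 'vitest|jest|pytest'; then echo "test-runner"; fi
}

# Usage:
# echo "do not use TypeScript" | detect_signals
```

Feeding it the single line "do not use TypeScript" reports both `tech-stack` and `do-not-rule`, which is the false positive described above: pattern matching sees the words, not the intent.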
Everything runs locally. The CLI reads the file from disk, runs the scoring engine in Node.js, and prints the result. Nothing is uploaded, no network requests are made, no telemetry is collected.
From score to fix
If your score is below 7, the improvements section tells you exactly what to add. Each suggestion maps to a specific category:
- "Add markdown headings" = Structure
- "Mention your tech stack" = Architecture
- 'Add "do not" rules' = Conventions
- "Specify your test framework" = Testing
- "Add security rules" = Guardrails
If you do not want to write the missing sections by hand, the ContextKit generator creates a complete config in 30 seconds. Generate one, diff it against your existing file, and merge the sections you are missing.
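One way to approach the "diff and merge" step is to list the headings that appear in the generated file but not in yours. The sketch below assumes both files use markdown headings. The `missing_sections` name and the `generated.md` filename are placeholders chosen for this example:

```shell
# missing_sections EXISTING GENERATED: print headings that appear only in GENERATED.
missing_sections() {
  local existing="$1" generated="$2" heading
  grep -E '^#{1,6} ' "$generated" | while IFS= read -r heading; do
    # -x matches the whole line, -F treats the heading as a literal string.
    grep -qxF -- "$heading" "$existing" || printf '%s\n' "$heading"
  done
}

# Usage:
# missing_sections CLAUDE.md generated.md
# diff -u CLAUDE.md generated.md   # then review the full differences
```

This only compares heading text exactly, so a section you named differently will show up as missing. Treat the output as a starting checklist, not a verdict.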
Get started
npx contextkit score
No install, no account, no setup. If you get a score you are proud of, grab a README badge for your repo. It links back to the analyzer so your contributors can score their own configs.
Source code is on GitHub (MIT licensed). Issues and PRs welcome.