
How to count tokens in Claude Code (and why you should)

April 7, 2026 · 8 min read

Claude Code does not have a token counter. There is no sidebar widget, no usage bar, no "tokens remaining" indicator. You work blind until you hit a rate limit or get a bill that looks wrong.

This is the number one frustration for developers and solopreneurs who use Claude Code daily. You are spending $100-200 per month on a tool and you cannot see what you are paying for in real time.

The good news: the data exists. Claude Code logs every session in detail. The challenge is getting that data into a form you can actually read.

Where Claude Code stores token data

Every Claude Code session generates a JSONL log file. These files live in your home directory under ~/.claude/projects/, organized by project path. Each line in the file is a JSON object containing the message, role, model used, and token counts.
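Because each line is a standalone JSON object, you can parse any line on its own. Here is a minimal sketch of what one record might look like, using the usage fields this article describes; real records carry more metadata, and the exact nesting varies between Claude Code versions:

```python
import json

# Hypothetical log line shaped like the records described in this article.
# Real entries include additional fields (timestamps, message content, etc.).
sample_line = (
    '{"role": "assistant", "model": "claude-opus-4", '
    '"usage": {"input_tokens": 42000, "output_tokens": 850, '
    '"cache_creation_input_tokens": 3000, "cache_read_input_tokens": 38000}}'
)

record = json.loads(sample_line)
usage = record["usage"]
print(record["model"], usage["input_tokens"], usage["output_tokens"])
```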

The relevant fields are:

  • input_tokens: How many tokens were sent to the model (your message + full conversation context + file contents + tool definitions)
  • output_tokens: How many tokens Claude generated in response
  • cache_creation_input_tokens: Tokens written to the prompt cache (charged at 1.25x the normal input rate)
  • cache_read_input_tokens: Tokens served from cache (charged at 0.1x the normal input rate)

These four numbers tell you everything about what a session costs. The problem is that a single day of active use can generate hundreds of megabytes of JSONL across dozens of session files.
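To see how the four fields combine into a dollar figure, here is a sketch of the arithmetic using Opus-tier list prices ($15/$75 per million input/output tokens, cited later in this article) and the cache multipliers above. The numbers in the example call are made up for illustration; this is not official billing logic:

```python
# Opus-tier list prices, dollars per million tokens (from this article).
INPUT_PER_M = 15.00
OUTPUT_PER_M = 75.00

def session_cost(input_tokens, output_tokens, cache_creation, cache_read):
    """Estimate session cost: cache writes at 1.25x input rate, reads at 0.1x."""
    cost = (
        input_tokens * INPUT_PER_M
        + cache_creation * INPUT_PER_M * 1.25
        + cache_read * INPUT_PER_M * 0.10
        + output_tokens * OUTPUT_PER_M
    ) / 1_000_000
    return round(cost, 4)

# A hypothetical heavy session: most input served from cache.
print(session_cost(42_000, 850, 3_000, 380_000))  # → 1.32 (dollars)
```

Note how 380,000 cache-read tokens cost less than 42,000 uncached input tokens: the 0.1x multiplier is doing most of the work.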

The manual approach (and why it breaks down)

You can write a quick script to parse JSONL and sum up tokens:

import json
import glob
import os

total_input = 0
total_output = 0

# glob does not expand "~", and "**" only recurses with recursive=True.
# Session filenames vary, so match any .jsonl file under the projects dir.
pattern = os.path.expanduser("~/.claude/projects/**/*.jsonl")
for path in glob.glob(pattern, recursive=True):
    with open(path) as fh:
        for line in fh:
            if not line.strip():
                continue
            msg = json.loads(line)
            usage = msg.get("usage")
            if usage:
                total_input += usage.get("input_tokens", 0)
                total_output += usage.get("output_tokens", 0)

print(f"Input: {total_input:,} tokens")
print(f"Output: {total_output:,} tokens")

This gets you a raw total, but it misses the important details. Which sessions were expensive? Which projects consume the most? Where are tokens being wasted on cache misses vs. used efficiently? Is one model driving most of the cost?

Once you start asking these questions, a simple script turns into a project. You need to track sessions over time, split by model (Opus vs. Sonnet vs. Haiku), calculate actual dollar costs per tier, and identify patterns.
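As a taste of where that project heads, here is a sketch of per-model cost aggregation over parsed log lines. The price table uses the per-million rates quoted later in this article; the tier-matching helper is a hypothetical heuristic, since real model ids (e.g. something like "claude-sonnet-4") embed the tier name:

```python
import json
from collections import defaultdict

# Dollars per million input/output tokens, per tier (from this article).
PRICES = {"opus": (15.00, 75.00), "sonnet": (3.00, 15.00), "haiku": (0.25, 1.25)}

def tier(model_name):
    # Heuristic: map a full model id onto a price tier by substring match.
    for key in PRICES:
        if key in model_name:
            return key
    return "opus"  # assume the priciest tier when unsure

def cost_by_model(lines):
    totals = defaultdict(float)
    for line in lines:
        msg = json.loads(line)
        usage = msg.get("usage")
        if not usage:
            continue
        t = tier(msg.get("model", ""))
        in_rate, out_rate = PRICES[t]
        totals[t] += (
            usage.get("input_tokens", 0) * in_rate
            + usage.get("output_tokens", 0) * out_rate
        ) / 1_000_000
    return dict(totals)
```

Feed it every line from every session file and you get a dollar total per tier, which is the first question a raw token sum cannot answer.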

What token counts actually reveal

After analyzing over 30 days of production Claude Code usage, here are the patterns that matter:

Input tokens dominate your bill

In a typical session, 80-90% of tokens are input. Claude reads your files, loads context, processes your conversation history. All of that is input. The code it writes back is output, and it is the minority of your total usage. Most people assume output is the expensive part. It is not.

Long sessions are exponentially expensive

Message 1 in a session might use 5,000 input tokens. Message 10 uses 50,000. Message 20 uses 150,000+. Every message includes the full conversation history, so token cost per message is not flat; it compounds as the session grows.
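The compounding is easy to see with a toy model. Assume each turn adds a fixed amount of new content but resends everything before it; per-message input then grows linearly, and the session's cumulative input grows quadratically. Illustrative numbers only:

```python
# Toy model: each turn adds ~5,000 tokens of new content, and every turn
# resends the full history so far as input.
NEW_TOKENS_PER_TURN = 5_000

def cumulative_input(n_messages):
    history = 0
    total = 0
    for _ in range(n_messages):
        history += NEW_TOKENS_PER_TURN  # this turn's new content
        total += history                # full history resent as input
    return total

print(cumulative_input(1))   # → 5000
print(cumulative_input(10))  # → 275000
print(cumulative_input(20))  # → 1050000
```

Doubling the session length from 10 to 20 messages roughly quadruples cumulative input, which is why splitting long sessions saves so much.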

Cache hits save 90% on input costs

Anthropic's prompt cache means that repeated context (CLAUDE.md, previously read files) is cached and charged at 10% of the normal rate. If your cache hit rate is low, you are paying 10x more than necessary for the same context. A healthy session has 60-80% cache hit rates. Below 40% means something is wrong with your session structure.
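Cache hit rate, as used here, is the share of input-side tokens served from cache. A minimal sketch using the usage fields described above:

```python
def cache_hit_rate(usage):
    """Fraction of input-side tokens served from the prompt cache."""
    cached = usage.get("cache_read_input_tokens", 0)
    total = (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + cached
    )
    return cached / total if total else 0.0

# Hypothetical message: 32k of 40k input-side tokens came from cache.
print(cache_hit_rate({
    "input_tokens": 5_000,
    "cache_creation_input_tokens": 3_000,
    "cache_read_input_tokens": 32_000,
}))  # → 0.8
```

Sum this per message or per session and compare against the 60-80% healthy range the article cites.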

Model choice changes everything

Opus costs $15/$75 per million input/output tokens. Sonnet costs $3/$15. Haiku costs $0.25/$1.25. A task done on Opus that could have been done on Sonnet costs 5x more. A task done on Opus that could have been done on Haiku costs 60x more. Without tracking which model handles which tasks, you cannot optimize this.
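The 5x and 60x spreads fall straight out of the rates. Here is the same hypothetical workload priced on each tier, using the per-million figures quoted above (cache multipliers omitted for simplicity):

```python
# Dollars per million input/output tokens, per tier (from this article).
RATES = {"opus": (15.00, 75.00), "sonnet": (3.00, 15.00), "haiku": (0.25, 1.25)}

def workload_cost(model, input_tokens=1_000_000, output_tokens=100_000):
    """Price a fixed workload on one tier; defaults are illustrative."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

for model in RATES:
    print(f"{model}: ${workload_cost(model):.3f}")
# opus costs 5x sonnet and 60x haiku for the identical workload
```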

A faster way to count your tokens

We built a free token analyzer for Claude Code that handles all of this. Drop your JSONL session files in, and it gives you:

  • Total tokens by type (input, output, cache creation, cache read)
  • Dollar cost breakdown by model tier
  • Per-session analysis showing which sessions were expensive and why
  • Cache efficiency percentage so you know if caching is working
  • Per-file breakdown showing which files consume the most tokens
  • Historical tracking so you can see cost trends over days and weeks
  • Model cost optimizer that shows savings if you routed tasks to cheaper models

It runs entirely in your browser. No data leaves your machine. No account needed. Your session logs contain sensitive code, so privacy is not optional here.

What to do once you have the numbers

Counting tokens is not the goal. Spending less while getting the same output is the goal. Once you can see your token breakdown, the fixes are usually obvious:

  • Cache hit rate below 50%? You are starting too many fresh sessions or your context files change too often. Restructure to keep stable context in separate files.
  • One project uses 3x the tokens of another? Check file sizes and session length in that project. Large files loaded repeatedly are the usual culprit. See our guide on tracking costs per project for a full breakdown.
  • Opus handling simple tasks? Route formatting, lookups, and simple edits to Sonnet or Haiku. Save Opus for complex reasoning.
  • Sessions averaging 20+ messages? Break them into shorter sessions. The compounding context cost makes long sessions disproportionately expensive.

Most users find 30-50% of their token usage is avoidable once they can see where it goes. That is $30-100 per month on a Max plan, or hundreds of dollars on API billing.

The bottom line

You cannot optimize what you cannot measure. Claude Code gives you no built-in way to count tokens, but the data is sitting in your session logs. Parse it manually, build a script, or use our free analyzer to get the breakdown instantly. Either way, start counting. The numbers will surprise you.

If you want to go deeper, read about the five hidden cost drivers that make most Claude Code bills higher than expected. And if you need ongoing cost tracking with budget alerts and historical trends, CostPilot is built for exactly that. Join the waitlist for early access.
