Claude Code invisible token burn (April 2026): what's happening and how to detect it
- Multiple users on r/ClaudeAI and the Anthropic community forum describe roughly 20,000 extra tokens per request appearing in their billing logs that they never sent.
- This inflates session costs without any visible change in the Claude Code interface: your /context readout and your actual bill tell different stories.
- Four runnable detection patterns are below. If your API-side token counts are diverging from local ccusage estimates, you are likely affected.
1. What's happening
Starting in early April 2026, a cluster of developers began noticing that their Anthropic API spend was climbing faster than their actual work could explain. The divergence wasn't random — it was consistent, roughly proportional to request volume, and invisible in the Claude Code UI.
Coverage from Efficienist (April 2026) described the pattern as server-side token inflation: requests were arriving at Anthropic's API carrying a significantly larger token footprint than the client-side session data accounted for. The publication noted that affected users could reproduce the gap reliably by comparing their local ccusage totals against their Anthropic dashboard in the same 15-minute window.
DevClass (April 1, 2026) reported that developers were observing "consistent billing anomalies tied to request counts rather than context length," suggesting the inflation was applied per-call rather than scaling with actual prompt size. This per-call characteristic is what makes it particularly expensive for agentic workflows with high tool-call frequency.
The sharpest user signal came from r/ClaudeAI, where one thread accumulated significant upvotes around a single observation:
"Every request appears to carry around 20,000 extra tokens that the user never sent."
— r/ClaudeAI, April 2026
Separately, threads on the Anthropic community forum documented users correlating timestamp-specific rate changes: costs that spiked not when context grew, but after a specific date in early April, holding roughly flat at the elevated rate regardless of prompt complexity. Multiple users in these threads noted the issue appeared after an Anthropic server-side update and was not reproducible on API clients that bypassed the Claude Code CLI layer.
To be direct about what we do and don't know: Anthropic has not issued a public statement confirming this as an infrastructure bug. What we have is a consistent pattern across independent user reports. The mechanism (whether it's system prompt inflation, caching metadata, or something in the tool-use scaffolding) is not yet publicly documented. We're reporting what's observable, not what's confirmed.
2. How to detect it in your own logs
Four runnable patterns. Each takes under five minutes. Together they tell you whether you're affected and, if so, the approximate magnitude.
Pattern 1: ccusage diff against the Anthropic dashboard
Install ccusage if you haven't already (npm i -g ccusage). Run a short Claude Code session (a single file edit, a question, anything with at least five tool calls). Then immediately compare:
# Local estimate from session files (today only; --since takes YYYYMMDD)
ccusage daily --since "$(date +%Y%m%d)"
# Then open: console.anthropic.com → Usage → filter to the last 30 minutes
A healthy session shows ccusage within 5–10% of the dashboard (the gap is normal caching overhead). If the dashboard is showing 1.5x or more what ccusage estimates, you have a divergence worth investigating. Users in the r/ClaudeAI thread were describing multipliers of 1.3x to 2.1x on affected accounts.
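The comparison above comes down to one ratio. A minimal sketch, assuming you copy both token totals by hand from ccusage and the dashboard: the function name is illustrative, the 1.10 and 1.5 cut-offs come from the figures above, and the middle "watch" band is our own assumption for values the text does not classify.

```python
def classify_divergence(dashboard_tokens: int, ccusage_tokens: int) -> str:
    """Classify the dashboard-to-ccusage token ratio.

    Thresholds: <= 1.10 is the normal caching gap described above,
    >= 1.5 is worth investigating. The band in between ("watch") is an
    assumption added for illustration.
    """
    if ccusage_tokens <= 0:
        raise ValueError("ccusage total must be positive")
    ratio = dashboard_tokens / ccusage_tokens
    if ratio <= 1.10:
        return "healthy"
    if ratio < 1.5:
        return "watch"
    return "investigate"


if __name__ == "__main__":
    # Hypothetical totals for the same window
    print(classify_divergence(dashboard_tokens=210_000, ccusage_tokens=100_000))
```

Compare totals for the same time window on both sides; a mismatch in the window will produce a false divergence.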
Pattern 2: /context inspection per request
During an active Claude Code session, type /context to see the current window usage. Note the token count. Then run a single simple tool call (e.g., read one small file). Run /context again immediately after. The delta should approximate: file tokens + response tokens + modest overhead.
# Before tool call
/context
# → note "X tokens used"
# Run: read one 50-line file
# After tool call
/context
# → new count should be X + ~300–500 for a 50-line file
If the delta is 20,000+ tokens on a trivial file read, you're looking at the same inflation pattern described in community reports. The /context view reflects the client-side state, so if it is also inflated, the overhead is already present when the context window is assembled on your machine rather than being added only on the server side.
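The before/after check can be written down as a small helper. This is a sketch under stated assumptions: the two counts are typed in by hand from /context, the 500-token ceiling is the upper end of the 50-line-file estimate above, and the function and status names are hypothetical.

```python
def context_delta_report(before_tokens: int, after_tokens: int,
                         expected_max: int = 500,
                         inflation_floor: int = 20_000) -> dict:
    """Compare two /context readings taken around one trivial tool call.

    expected_max: largest delta a 50-line file read should add (from the text).
    inflation_floor: delta at which the reading matches the reported pattern.
    """
    delta = after_tokens - before_tokens
    if delta >= inflation_floor:
        status = "inflated"
    elif delta <= expected_max:
        status = "expected"
    else:
        status = "elevated"  # above the estimate, below the reported pattern
    return {"delta": delta, "status": status}


if __name__ == "__main__":
    # Hypothetical readings before and after reading one small file
    print(context_delta_report(before_tokens=41_200, after_tokens=62_050))
```

An "elevated" result is ambiguous on its own, since a long model response also adds tokens; repeat the check with a different small file before drawing a conclusion.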
Pattern 3: API-side token count via direct comparison
Run the identical prompt twice — once via the Claude Code CLI, once directly via the Anthropic API using the same model. Compare usage.input_tokens in the raw API response.
# Direct API call (Python, uses anthropic SDK)
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=100,
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.usage)
# → input_tokens: ~12 (the actual prompt)
# Now run the equivalent in Claude Code and check ccusage:
# ccusage session --last 1 | grep input_tokens
A significant gap between direct API input_tokens and Claude Code session input_tokens for an equivalent prompt points to overhead being added by the CLI layer's scaffolding — system prompt, tool definitions, or session context injected server-side.
Pattern 4: Rate-change timestamp correlation
Pull your usage data from the Anthropic billing API or export your usage CSV from the console. Plot daily token spend per session (not per message) against calendar date. If you see a step-change in cost-per-session that is not correlated with a change in your own prompting behavior, and if that step-change occurred in the first week of April 2026, it's consistent with the server-side update window described in community reports.
# From Anthropic console: Usage → Export CSV
# In your terminal (requires csvkit); the column name must match your export's header:
csvstat -c input_tokens usage_export.csv
# Or in Python:
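import matplotlib.pyplot as plt
import pandas as pd

# Column names ("date", "input_tokens") must match the headers in your export
df = pd.read_csv("usage_export.csv", parse_dates=["date"])
df.groupby("date")["input_tokens"].sum().plot()
plt.show()  # needed when run as a script rather than in a notebook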
A clean visual step-change on or around April 1–3, 2026 with no corresponding change in your codebase or prompting patterns is the clearest signal that the inflation is external to your workflow.
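If you would rather not eyeball the plot, a step-change can be located numerically. The following is a rough sketch, not a statistical test: it assumes you already have one token total per day in date order, the function name is hypothetical, and the 1.3 minimum ratio is borrowed from the low end of the multipliers users reported.

```python
from statistics import mean


def find_step_change(daily_totals, min_ratio=1.3):
    """Find the split point with the largest mean-after / mean-before ratio.

    Returns (index, ratio), where index is the first day of the elevated
    period, or None if no split reaches min_ratio.
    """
    best = None
    for i in range(1, len(daily_totals)):
        before = mean(daily_totals[:i])
        after = mean(daily_totals[i:])
        if before <= 0:
            continue  # avoid dividing by zero on empty early days
        ratio = after / before
        if ratio >= min_ratio and (best is None or ratio > best[1]):
            best = (i, ratio)
    return best


if __name__ == "__main__":
    # Hypothetical daily input-token totals (thousands), eight consecutive days
    totals = [100, 110, 95, 105, 200, 210, 190, 205]
    print(find_step_change(totals))
```

Map the returned index back to a calendar date in your export, then check whether anything changed in your own prompting or codebase on that day.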
Running agents overnight? The token burn compounds fast.
Septim Subagent Cost Guard is a PreToolUse hook that hard-halts agents mid-flight when spend crosses your threshold. $29 founding price, 50 seats, ships May 2026. Zero-dollar reservation now.
3. What to do about it right now
Three actions you can take today. None require waiting for Anthropic to issue a fix.
Action 1: Lower concurrency on multi-agent sessions
If the overhead is per-request, the most direct cost lever is request volume. Reduce the number of parallel subagents in your current workflows. A session running 4 agents in parallel instead of 12 doesn't cut output by 66% (most of that parallelism is latency optimization, not throughput), but it does cut the rate of inflated requests by roughly two-thirds.
In your CLAUDE.md, add a temporary directive:
# Temporary cost constraint (April 2026)
# Due to active token inflation reports, limit parallel subagent
# spawns to a maximum of 3 concurrent at any time.
# Prefer sequential tool calls over parallel where latency allows.
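To see why request volume is the lever, it helps to put a number on the per-request overhead. A back-of-the-envelope sketch: the 20,000-token figure comes from the community reports above, while the price per million input tokens and the requests-per-agent count are placeholder assumptions you should replace with your own values.

```python
def overhead_cost_usd(requests: int,
                      extra_tokens_per_request: int = 20_000,
                      usd_per_million_input_tokens: float = 3.0) -> float:
    """Cost of the reported per-request overhead alone.

    usd_per_million_input_tokens is a placeholder, not a quoted price.
    """
    extra_tokens = requests * extra_tokens_per_request
    return extra_tokens * usd_per_million_input_tokens / 1_000_000


if __name__ == "__main__":
    requests_per_agent = 100  # hypothetical requests per agent in one session
    for agents in (12, 4):
        cost = overhead_cost_usd(agents * requests_per_agent)
        print(f"{agents} agents: ${cost:.2f} of overhead")
```

Because the overhead is modelled as a flat amount per request, the cost scales linearly with request count and is independent of how large each prompt is.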
Action 2: Disable autocompact during the investigation window
Claude Code's autocompact feature summarizes long conversations to keep context manageable. When autocompact fires, it generates a new summary request — which, under current conditions, carries the same per-request overhead. Disabling it prevents those compaction requests from adding to the inflated bill while the situation is active.
In your Claude Code settings (~/.claude/settings.json):
{
"autoCompact": false
}
Note: disabling autocompact means long sessions may run into context limits. Break long sessions manually rather than relying on autocompact to rescue them.
Action 3: Set an hourly spend alert in the Anthropic console
Anthropic's billing alerts default to monthly thresholds. That's too slow a feedback loop for an active per-request inflation issue. Set a tight hourly alert to catch runaway spend before it compounds:
- Go to console.anthropic.com → Settings → Billing → Usage alerts.
- Create a new alert at a threshold well above one hour of your normal usage. If you typically spend $5/day (about $0.21/hour averaged over 24 hours), an alert at $1 fires when a single hour costs roughly 5x your normal hourly rate.
- Set the notification to email AND, if available, SMS or webhook. Email alone can arrive 5–20 minutes after the threshold fires.
This doesn't stop the burn, but it cuts the discovery window from "morning coffee" to "within the hour." That difference is several hundred dollars on an active agentic session.
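The $1 example above follows from simple arithmetic that you can adapt to your own spend. A minimal sketch, assuming spend is spread evenly over the hours you are active; the function name, the 5x multiplier default, and the active_hours parameter are illustrative choices, not console settings.

```python
def hourly_alert_threshold(daily_spend_usd: float,
                           multiplier: float = 5.0,
                           active_hours: float = 24.0) -> float:
    """Alert threshold in USD for a single hour.

    daily_spend_usd / active_hours gives the normal hourly rate;
    multiplier sets how far above normal an hour must be to alert.
    """
    if active_hours <= 0:
        raise ValueError("active_hours must be positive")
    return daily_spend_usd / active_hours * multiplier


if __name__ == "__main__":
    # $5/day spread over 24 hours, alert at 5x the normal hourly rate
    print(f"${hourly_alert_threshold(5.0):.2f}")
```

If your usage is concentrated in a working day, pass a smaller active_hours (for example 8) so the threshold reflects your real hourly rate instead of a 24-hour average.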
4. Septim's perspective
We can't fix Anthropic's infrastructure, and we're not going to pretend otherwise. If the token inflation is server-side, the only people who can resolve it are Anthropic's engineering team. What we can do, and what we're actively building, is close the gap on the agent-side multiplier: the part of this problem that lives on your machine, in your session files, and in the number of requests your agents are firing per hour.
Septim Subagent Cost Guard puts a hard ceiling on that number. It runs as a PreToolUse hook, reads your local session state, and halts the agent before the next request goes out if your cumulative spend has crossed your threshold. It won't un-inflate a token that's already been billed, but it will stop the 11th request from compounding on the first 10. The founding price is $29 at septimlabs.com/subagent-cost-guard. If what you need right now is a human to diagnose your specific billing logs and triage your agent configuration, that's the Septim Rescue engagement at septimlabs.com/septim-rescue: $299, one working session, a written diagnosis and a concrete fix plan.
Bill already spiked? Septim Rescue.
One focused session: we audit your Claude Code billing logs, identify the inflation source, and give you a written fix plan. $299. Booked and completed within one business day.
Further reading
- The Tokenocalypse: why your Claude subagents burned $47K (and how to stop it). Covers the prior incident (April 1–3, 2026) and what the existing monitoring tools missed.
- Septim Subagent Cost Guard: the PreToolUse hook that hard-halts agents on budget breach. $29 founding price.
- Septim Rescue — one-session billing audit and fix plan. $299.