Is Claude getting worse? What we learned when Anthropic admitted it.
For six weeks, developers said Claude got dumber. Anthropic said it didn't. On April 23, Anthropic published an internal post-mortem confirming three engineering changes degraded coding output between March 4 and April 16, then quietly pushed fixes and reset everyone's usage limits as a concession. Here is what changed, who got hit, and what to do when your business runs on one AI vendor.
What developers were saying
Starting in early March, threads on Hacker News and r/ClaudeAI began compiling the same observation: Claude Code was producing shorter answers, missing obvious steps, and abandoning context mid-task. Some users blamed their own prompt drift. Others insisted the model itself had quietly changed.
Anthropic's public stance for the next six weeks was that no model swap had occurred. The status page showed green. Internal benchmarks reportedly came back flat. The community kept producing receipts: side-by-side outputs, identical prompts, demonstrably worse responses. Anthropic kept saying nothing was wrong.
The status page is a function of what you measure. If your benchmark is API uptime, and the regression is in the shape of the output, you can be honestly green and quietly broken at the same time.
— Septim Labs · April 25, 2026
What actually changed
The April 23 post-mortem identified three independent changes that compounded over six weeks. None was a model retrain. All three were operational adjustments that lived above the model layer.
Default reasoning budget for Claude Code dropped from high to medium as a cost-control measure. On simple tasks, indistinguishable. On multi-step coding work, the model started skipping intermediate planning steps and going straight to output.
A bug in the session-state handler cleared mid-session thinking history at certain context-window thresholds. The model would lose its own work-in-progress reasoning halfway through a task and have to reconstruct it from output history alone, visibly worse on long PRs.
A latency-improvement feature added an aggressive truncation rule: a cap of roughly 25 words on a category of intermediate responses Claude uses internally to plan. Output quality on synthesis tasks collapsed because the model ran out of room to think before answering.
Each one was a reasonable engineering tradeoff. Together they produced a six-week regression that Anthropic could not detect on its own benchmarks because each change individually fell within the noise floor.
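If you call the model through the API rather than through Claude Code, you can at least refuse to inherit that first failure mode by pinning the reasoning budget yourself instead of taking whatever default the vendor ships that week. A minimal sketch, assuming the anthropic Python SDK's extended-thinking parameter; the model snapshot and token numbers here are illustrative, not recommendations.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Pin the reasoning budget explicitly so a vendor-side default change
# (high -> medium, as in the post-mortem) cannot silently alter behavior.
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative: pin a dated snapshot, not a floating alias
    max_tokens=16000,                  # must exceed the thinking budget below
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Plan, then refactor this handler into two functions."}],
)

# Thinking and the final answer arrive as separate content blocks.
answer = "".join(block.text for block in response.content if block.type == "text")
print(answer)
```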
What Anthropic did next
The fix landed in three rolling deploys between April 16 and April 23. Reasoning effort returned to high by default. The session-state bug was patched. The 25-word cap was removed. As a goodwill move, Anthropic reset all subscriber usage limits for the billing month: every Pro and Max user got their May allotments back, plus April's. No one asked for this; it was a concession.
"Three changes shipped between March 4 और March 22 collectively degraded Claude Code output quality on multi-step tasks. We did नहीं catch it क्योंकि हर change passed our individual rollout gates. Our aggregate quality benchmark did नहीं flag the trend until a community-published reproducer surfaced on April 16."
The exact text is mirrored in the April 24 Fortune piece and the April 22 Register write-up, both linked in sources.
Why this matters if Claude is your only AI vendor
If your stack runs entirely through Anthropic (Claude Code in your editor, the Claude API in your product, the Anthropic dashboard for billing), six weeks of degradation is six weeks of degradation. Every PR your reviewer agent looked at. Every customer-facing reply your assistant generated. Every failing test your debug agent diagnosed.
The community caught it. The reproducer that broke the silence was a community artifact, not Anthropic's internal QA. That is the structural lesson here, and it generalizes to every vendor, not just Anthropic. When your business depends on one model behaving consistently, you need the same posture you would take toward any infrastructure dependency: independent verification, version-pinned reproducibility, an exit ramp.
Three concrete things we changed at Septim Labs after April 23
- Pinned a regression suite of our own tasks. Twenty repeatable Claude Code prompts whose outputs we sample weekly and diff against last quarter's baseline. The reproducer doesn't need to be impressive; it needs to be ours, and we need to look at it. A sketch of the harness follows this list.
- Routed cost-sensitive paths through smaller models. Sub-tasks that don't need full Claude reasoning (formatting, classification, schema validation) now run on cheaper Anthropic tiers or alternate vendors. When the flagship model regresses, only the workload that actually needs it is affected.
- Captured every output we ship to a customer. Versioned, dated, replayable. If a customer reports the system "got worse" three weeks later, we have receipts.
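Here is roughly what that regression harness looks like as code. This is a sketch, not our production harness: the directory layout, model snapshot, and similarity scoring are our own conventions, and you should substitute yours.

```python
import difflib
import json
import pathlib
from datetime import date

import anthropic

PROMPTS = pathlib.Path("regression/prompts")      # one .txt file per pinned task (our layout)
BASELINES = pathlib.Path("regression/baselines")  # last quarter's saved outputs
MODEL = "claude-sonnet-4-20250514"                # illustrative dated snapshot

client = anthropic.Anthropic()

def run_prompt(text: str) -> str:
    resp = client.messages.create(
        model=MODEL,
        max_tokens=4096,
        messages=[{"role": "user", "content": text}],
    )
    return "".join(b.text for b in resp.content if b.type == "text")

report = {}
for prompt_file in sorted(PROMPTS.glob("*.txt")):
    current = run_prompt(prompt_file.read_text())
    baseline = (BASELINES / prompt_file.name).read_text()
    # Crude word-level similarity against the pinned baseline. The score
    # doesn't judge quality; it flags which diffs a human should read first.
    ratio = difflib.SequenceMatcher(None, baseline.split(), current.split()).ratio()
    report[prompt_file.stem] = {"similarity": round(ratio, 3), "length": len(current)}

out = pathlib.Path(f"regression/reports/{date.today()}.json")
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(json.dumps(report, indent=2, sort_keys=True))
```

The point is the weekly diff against a baseline you control, not the scoring function; swap in whatever comparison your outputs deserve.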
The bigger pattern
The tools you ship to customers should not depend on a single vendor's promise that their model is unchanged. The model will change. The vendor will ship operational tweaks above the model layer that they consider safe and that you experience as quality drift. The status page will stay green.
This is not a takedown of Anthropic specifically. They published a post-mortem, fixed the issue, refunded usage. That is the responsible-vendor pattern. The lesson is that even responsible vendors produce drift, and the only durable defense is your own measurement.
If your customer can tell, your provider's benchmark didn't.
— internal note · written during the regression, April 14, 2026
What to do this week
- Run a regression diff. Take ten of last quarter's Claude Code outputs. Re-run the same prompts on the current model. If today's outputs are visibly thinner, you have your own evidence.
- Cap usage on the cost-sensitive paths. Even a soft per-call budget catches runaway reasoning loops before billing day. A sketch follows this list.
- Save outputs. The cheapest Anthropic tier still costs more than disk. Log everything you ship.
- Have one parallel vendor. Not for failover, but for sanity checks. When Claude shifts, you want to know whether GPT or Gemini noticed too. If your tasks regress on Claude but not on the others, that is signal.
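A soft per-call budget and the logging habit fit in one wrapper. Again a sketch assuming the anthropic Python SDK; the thresholds, the audit path, and the `budgeted_call` helper are illustrative, not anything the SDK provides.

```python
import json
import pathlib
import time

import anthropic

MAX_OUTPUT_TOKENS = 4096               # hard ceiling passed to the API
SOFT_BUDGET_TOKENS = 2500              # our alarm threshold, illustrative
AUDIT_DIR = pathlib.Path("llm-audit")  # hypothetical log location

client = anthropic.Anthropic()

def budgeted_call(prompt: str) -> str:
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative dated snapshot
        max_tokens=MAX_OUTPUT_TOKENS,
        messages=[{"role": "user", "content": prompt}],
    )
    text = "".join(b.text for b in resp.content if b.type == "text")

    # Receipts: every call is logged, dated by timestamp, replayable later.
    AUDIT_DIR.mkdir(exist_ok=True)
    record = {
        "ts": time.time(),
        "model": resp.model,
        "input_tokens": resp.usage.input_tokens,
        "output_tokens": resp.usage.output_tokens,
        "prompt": prompt,
        "output": text,
    }
    (AUDIT_DIR / f"{time.time_ns()}.json").write_text(json.dumps(record))

    # Soft budget: don't fail the request, just surface the anomaly.
    if resp.usage.output_tokens > SOFT_BUDGET_TOKENS:
        print(f"[budget] {resp.usage.output_tokens} output tokens exceeds soft cap")
    return text
```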
Cap your Claude bill before the next time something quietly breaks.
Septim Rescue is a $299 emergency intervention for Claude bills that have already spiked. Septim Vault is the $89 lifetime kit we use to harden our own MCP and agent setups against runaway costs. Both are pay-once. Both are yours forever.