Count tokens for GPT-4, Claude, and Gemini — in the browser.
client-side tokenization · no telemetry · per-model accuracy notes · no login
| Model | Tokens | Rate / 1k input | Estimated cost |
|---|---|---|---|
Not all token counts are equally exact.
Each model uses a different tokenizer. The GPT counts are exact; the Claude and Gemini counts are approximations because those tokenizers are not publicly available. The error margins are still tight enough for context-window planning.
| Model | Tokenizer | Accuracy | Notes |
|---|---|---|---|
| GPT-4o | o200k_base | exact | Uses gpt-tokenizer — same BPE vocab as tiktoken. Matches the API usage.input_tokens field exactly. |
| GPT-4 | cl100k_base | exact | Same library, cl100k_base vocab. Matches tiktoken output for gpt-4 and gpt-3.5-turbo. |
| Claude 3.5 Sonnet / Claude Opus | cl100k_base (approx) | ±1–3% | Anthropic's tokenizer is not public. Claude uses a BPE vocabulary related to cl100k_base. Sufficient for context-window planning; use the API usage response for billing precision. |
| Gemini 1.5 Pro | whitespace heuristic | ±5% | Google's SentencePiece tokenizer is not available in pure JS without a WASM bundle. Count uses a word + punctuation split with Unicode-aware handling. Use the countTokens API endpoint for precision. |
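The Gemini row's "word + punctuation split with Unicode-aware handling" can be sketched in a few lines of pure JS. This is a minimal illustration, not the page's actual implementation: the function name and the exact regex are assumptions.

```javascript
// Hypothetical sketch of a Unicode-aware word + punctuation heuristic.
// Each run of letters, each run of digits, and each individual
// punctuation mark counts as one token; whitespace is skipped.
function approxGeminiTokens(text) {
  const pieces = text.match(/\p{L}+|\p{N}+|[^\s\p{L}\p{N}]/gu);
  return pieces ? pieces.length : 0;
}

approxGeminiTokens("Hello, world!"); // → 4 ("Hello", ",", "world", "!")
```

The `u` flag enables the `\p{…}` Unicode property escapes, so accented and non-Latin letters group into words instead of splitting character by character. A heuristic like this is why the estimate carries a ±5% margin rather than an exact match.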
The tokenizer loads once. After that, it never phones home.
The gpt-tokenizer module loads from jsDelivr CDN once on page open, then stays resident in memory. After that, every count runs entirely in the browser with no outbound requests. To verify: open DevTools (F12), click Network, reload, and wait for the module to load. Paste text and count — zero new requests fire. The only domains you should see are fonts.googleapis.com, fonts.gstatic.com, and cdn.jsdelivr.net. Anything else is a bug — report it to [email protected].
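The load-once pattern described above can be sketched like this. The CDN URL shape and the `encode` export follow the page's description of gpt-tokenizer on jsDelivr, but treat both as assumptions; `once` and `countTokensLocally` are hypothetical names for illustration.

```javascript
// Memoize a function so its underlying work runs at most once
// and every later call returns the cached result.
function once(fn) {
  let called = false;
  let result;
  return (...args) => {
    if (!called) {
      called = true;
      result = fn(...args);
    }
    return result;
  };
}

// One CDN fetch on first use; the module then stays resident in memory.
const loadTokenizer = once(() =>
  import('https://cdn.jsdelivr.net/npm/gpt-tokenizer/+esm')
);

// Every count after the first load runs entirely in the browser,
// with no outbound requests.
async function countTokensLocally(text) {
  const { encode } = await loadTokenizer();
  return encode(text).length;
}
```

Because the dynamic `import()` is wrapped in `once`, repeated counts reuse the cached module promise, which is why the Network tab stays empty after the initial page load.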
3 pre-commit hooks that hard-cap your per-session Claude Code token burn before you hit the limit. Self-hosted. Your Anthropic key. No subscription.
Septim Tether — $19 once →