freetokencounter.app is a free, browser-based token counter that supports every major large language model: OpenAI's GPT-5.4, GPT-5.4 mini, GPT-5.3, GPT-5.2, GPT-5.1, GPT-5, GPT-5 Pro, GPT-4.1, o4-mini, o3, o3-pro and GPT-4o; Anthropic's Claude Opus 4.7, Sonnet 4.6, Sonnet 4.5 and Haiku 4.5; Google's Gemini 3.1 Pro Preview, 3 Flash Preview, 2.5 Pro and 2.5 Flash; Meta's Llama 4 Maverick, Scout and Behemoth; xAI's Grok 4, Grok 4 Heavy and Grok 4 Fast; Mistral's Medium 3, Magistral and Pixtral Large; DeepSeek V3.1 and R1; Cohere's Command A and Command R+; Alibaba's Qwen3-Max, Qwen3-Coder and QwQ-32B; Moonshot's Kimi K2 Thinking; Perplexity's Sonar Reasoning Pro and Deep Research; and MiniMax M1. Paste any text — a prompt, a code block, JSON, or non-Latin script — and freetokencounter.app shows the token count, characters, words, context-window usage, and cost estimate live, with a colored visualization of how the model splits your text. Nothing uploads to a server: the entire tool runs locally in your browser.
Every model uses a slightly different tokenizer, and most providers don't publish their exact tokenizer for in-browser use. freetokencounter.app combines the publicly documented GPT-2 / cl100k splitting regex with per-model calibration constants tuned against each provider's reference outputs. For English prose the result is accurate to within roughly five percent — close enough to plan prompts, fit context windows, and budget API spend. The "Estimate" pill on every count keeps the methodology transparent: freetokencounter.app never presents an approximation as an exact tokenizer for any model.
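A minimal sketch of the estimation approach described above. The splitting pattern is the published GPT-2 pre-tokenization regex; the calibration table, model keys, and `estimateTokens` helper are illustrative placeholders, not the tool's actual code or constants.

```javascript
// Published GPT-2 pre-tokenization pattern: contractions, words,
// numbers, punctuation runs, and whitespace.
const GPT2_SPLIT =
  /'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+/gu;

// Hypothetical per-model multipliers absorbing vocabulary differences
// between tokenizers (illustrative values only).
const CALIBRATION = {
  "gpt-4o": 1.0,
  "claude-sonnet": 1.05,
};

function estimateTokens(text, model) {
  // Each regex piece corresponds to roughly one token-merge group.
  const pieces = text.match(GPT2_SPLIT) ?? [];
  return Math.round(pieces.length * (CALIBRATION[model] ?? 1.0));
}
```

A real tokenizer then applies byte-pair merges within each piece, which is why the per-model constant is needed at all: the split alone undercounts long or rare words.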
Tokens are the unit AI models bill, the unit context windows are measured in, and the unit that determines whether a prompt will fit at all. Counting tokens before sending lets you avoid context-overflow errors, predict cost down to the cent, compare model efficiency for the same task, and right-size your prompts. freetokencounter.app shows tokens for every supported model side by side, so you can pick the cheapest model that handles your prompt without truncation.
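The "cheapest model that fits" idea above can be sketched in a few lines. The model list, context sizes, and rates below are illustrative placeholders, not live pricing data:

```javascript
// Illustrative model metadata (context in tokens, input price per 1M).
const MODELS = [
  { name: "gpt-5-nano", context: 400_000, inputPerM: 0.05 },
  { name: "gpt-5-mini", context: 400_000, inputPerM: 0.25 },
  { name: "gpt-4o",     context: 128_000, inputPerM: 2.5 },
];

// Keep only models whose window fits prompt + expected output,
// then take the cheapest by input rate.
function cheapestFit(promptTokens, expectedOutputTokens) {
  return MODELS
    .filter(m => promptTokens + expectedOutputTokens <= m.context)
    .sort((a, b) => a.inputPerM - b.inputPerM)[0] ?? null;
}
```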
Yes — completely free, no sign-up, no rate limits, no API key required. Every keystroke is processed locally in your browser using JavaScript that ships with the page. Nothing about your prompt — not the text, not the count, not the model selection — is sent to any server. You can verify this in your browser's Network tab while typing: the only request is the initial page load. freetokencounter.app exists because we believe a token counter for sensitive prompts should never see your data.
For OpenAI models (GPT-5, GPT-4.1, GPT-4o, o-series, GPT-4, GPT-3.5) and Meta Llama models, the count is calibrated against the published reference tokenizers and is typically within ±3% on English text. For Anthropic Claude (Opus 4.7, Sonnet 4.6 and earlier), Google Gemini 2.5, xAI Grok, Cohere Command, Alibaba Qwen, DeepSeek, Mistral, Moonshot Kimi, and MiniMax — providers that don't release a fully open tokenizer — counts are within roughly ±5–8%. Code, JSON, and non-Latin scripts can drift more, generally toward higher token counts than the estimate. For high-stakes production budgeting, always validate against the provider's own count_tokens API; for prompt drafting and quick comparisons, estimates at this accuracy are exactly what freetokencounter.app is built for.
The table below compares context windows and per-million-token pricing across major model families. Pricing reflects each provider's official rate card.
| Model | Provider | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|
| GPT-5.4 | OpenAI | 1.1M | $2.50 | $15.00 |
| GPT-5.4 mini | OpenAI | 1.1M | $0.75 | $4.50 |
| GPT-5.4 nano | OpenAI | 1.1M | $0.20 | $1.25 |
| GPT-5.4 Pro | OpenAI | 1.1M | $30.00 | $180.00 |
| GPT-5.3 Codex | OpenAI | 400K | $1.75 | $14.00 |
| GPT-5.2 | OpenAI | 400K | $0.875 | $7.00 |
| GPT-5.2 Pro | OpenAI | 400K | $10.50 | $84.00 |
| GPT-5.1 | OpenAI | 400K | $0.625 | $5.00 |
| GPT-5 | OpenAI | 400K | $1.25 | $10.00 |
| GPT-5 Pro | OpenAI | 400K | $15.00 | $120.00 |
| GPT-5 mini | OpenAI | 400K | $0.25 | $2.00 |
| GPT-5 nano | OpenAI | 400K | $0.05 | $0.40 |
| GPT-4.1 | OpenAI | 1M | $2.00 | $8.00 |
| o4-mini | OpenAI | 200K | $1.10 | $4.40 |
| o3 | OpenAI | 200K | $2.00 | $8.00 |
| o3-pro | OpenAI | 200K | $20.00 | $80.00 |
| GPT-4o | OpenAI | 128K | $2.50 | $10.00 |
| GPT-4o-mini | OpenAI | 128K | $0.15 | $0.60 |
| o1 | OpenAI | 200K | $15.00 | $60.00 |
| Claude Opus 4.7 | Anthropic | 1M | $15.00 | $75.00 |
| Claude Sonnet 4.6 | Anthropic | 1M | $3.00 | $15.00 |
| Claude Sonnet 4.5 | Anthropic | 1M | $3.00 | $15.00 |
| Claude Haiku 4.5 | Anthropic | 200K | $1.00 | $5.00 |
| Claude Opus 4.1 | Anthropic | 200K | $15.00 | $75.00 |
| Claude Opus 4 | Anthropic | 200K | $15.00 | $75.00 |
| Claude Sonnet 4 | Anthropic | 200K | $3.00 | $15.00 |
| Claude 3.5 Sonnet | Anthropic | 200K | $3.00 | $15.00 |
| Gemini 3.1 Pro Preview | Google | 1M | $2.00 | $12.00 |
| Gemini 3.1 Flash-Lite | Google | 1M | $0.25 | $1.50 |
| Gemini 3 Flash Preview | Google | 1M | $0.50 | $3.00 |
| Gemini 2.5 Pro | Google | 1M | $1.25 | $10.00 |
| Gemini 2.5 Flash | Google | 1M | $0.30 | $2.50 |
| Gemini 2.0 Flash | Google | 1M | $0.10 | $0.40 |
| Gemini 1.5 Pro | Google | 2M | $1.25 | $5.00 |
| Llama 4 Maverick | Meta | 1M | — | — |
| Llama 4 Scout | Meta | 10M | — | — |
| Llama 4 Behemoth | Meta | 1M | — | — |
| Llama 3.3 70B | Meta | 128K | — | — |
| Llama 3.1 405B | Meta | 128K | — | — |
| Grok 4 | xAI | 256K | $3.00 | $15.00 |
| Grok 4 Heavy | xAI | 256K | $30.00 | $90.00 |
| Grok 4 Fast | xAI | 2M | $0.20 | $0.50 |
| Grok Code Fast 1 | xAI | 256K | $0.20 | $1.50 |
| Grok 3 | xAI | 1M | $2.00 | $10.00 |
| Mistral Medium 3 | Mistral | 128K | $0.40 | $2.00 |
| Mistral Large 2 | Mistral | 128K | $2.00 | $6.00 |
| Magistral Medium | Mistral | 40K | $2.00 | $5.00 |
| Codestral 25.01 | Mistral | 256K | $0.30 | $0.90 |
| DeepSeek V3.1 | DeepSeek | 128K | $0.27 | $1.10 |
| DeepSeek R1 | DeepSeek | 128K | $0.55 | $2.19 |
| Command A | Cohere | 256K | $2.50 | $10.00 |
| Command R+ | Cohere | 128K | $2.50 | $10.00 |
| Command R7B | Cohere | 128K | $0.0375 | $0.15 |
| Qwen3-Max | Alibaba | 256K | $1.60 | $6.40 |
| Qwen3-Coder | Alibaba | 256K | — | — |
| QwQ-32B | Alibaba | 131K | — | — |
| Kimi K2 | Moonshot | 128K | $0.60 | $2.50 |
| Kimi K2 Thinking | Moonshot | 256K | $0.60 | $2.50 |
| Sonar Pro | Perplexity | 200K | $3.00 | $15.00 |
| Sonar Reasoning Pro | Perplexity | 200K | $2.00 | $8.00 |
| MiniMax-M1 | MiniMax | 1M | $0.40 | $2.20 |
Open-weight Llama models have no first-party API; pricing varies by host (Together, Groq, Fireworks).
A token is the smallest unit of text an AI model processes. Tokens can be whole words, sub-words, or single characters depending on the tokenizer. As a rough rule, 1 token ≈ 4 characters or ¾ of an English word, but the exact count depends on the specific model. freetokencounter.app shows you the count for any major model in real time.
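The rule of thumb above can serve as a back-of-envelope estimate when no tokenizer is at hand. This `roughTokenEstimate` helper is an illustrative sketch, not the counting method the tool uses:

```javascript
// Average two heuristics: 1 token ≈ 4 characters and
// 1 token ≈ 3/4 of an English word. Real counts vary by model.
function roughTokenEstimate(text) {
  const byChars = text.length / 4;
  const byWords = text.trim().split(/\s+/).filter(Boolean).length / 0.75;
  return Math.round((byChars + byWords) / 2);
}
```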
Counts are estimates within roughly ±5% for English text on most models. The tool uses the public GPT-2 / cl100k splitting pattern combined with per-model calibration constants tuned against each provider's published tokenizer. Some providers (Anthropic, Google, MiniMax) do not publish a fully open tokenizer, so counts for those models are clearly labeled as approximations.
freetokencounter.app supports OpenAI (GPT-5.4, 5.4 mini, 5.4 nano, 5.4 Pro, 5.3, 5.2, 5.2 Pro, 5.1, GPT-5, GPT-5 Pro, GPT-4.1, o4-mini, o3, o3-pro, GPT-4o, ChatGPT-4o), Anthropic (Claude Opus 4.7, Sonnet 4.6, Sonnet 4.5, Haiku 4.5, Opus 4.1), Google (Gemini 3.1 Pro Preview, 3.1 Flash-Lite, 3 Flash Preview, 2.5 Pro, 2.5 Flash), Meta (Llama 4 Maverick/Scout/Behemoth, Llama 3.x), xAI (Grok 4, 4 Heavy, 4 Fast, Code Fast 1, Grok 3), Mistral (Medium 3, Small 3.1, Magistral, Pixtral Large, Codestral 25.01), DeepSeek (V3.1, R1), Cohere (Command A, R+, R, R7B), Alibaba (Qwen3-Max, Qwen3-Coder, QwQ-32B), Moonshot (Kimi K2, K2 Thinking), Perplexity (Sonar, Sonar Pro, Sonar Reasoning Pro, Sonar Deep Research), MiniMax (M1, Text-01) — over 100 models across 12 providers.
No. freetokencounter.app runs entirely in your browser. Your prompt text never leaves your device — there are no servers, no analytics on input, no logging. You can verify this by checking the Network tab in your browser's developer tools while typing.
Each model family uses a different tokenizer trained on different data with a different vocabulary. GPT-5 and GPT-4o use o200k_base (~200K vocab), Claude Opus 4.7 uses Anthropic's proprietary tokenizer, Gemini 2.5 Pro uses SentencePiece, Llama 4 uses its own BPE, and Grok 4 uses xAI's tokenizer. The same word may be one token in one model and three tokens in another. freetokencounter.app shows the difference side by side.
The cost estimate multiplies your input token count by each model's published per-million input rate, plus an estimated output token count by the per-million output rate. Pricing data is sourced from each provider's official pricing page and updated periodically. Always check live pricing for production budgeting.
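As a sketch, the cost formula described above is just two multiplications. Rates are per million tokens; the figures in the usage note mirror the table's GPT-5 row:

```javascript
// Cost in dollars = input tokens × input rate + output tokens × output rate,
// with rates quoted per 1M tokens.
function estimateCost(inputTokens, outputTokens, inputPerM, outputPerM) {
  return (inputTokens / 1e6) * inputPerM + (outputTokens / 1e6) * outputPerM;
}
```

For example, a 10,000-token prompt with a 2,000-token response on GPT-5 ($1.25 in / $10.00 out per 1M) comes to about $0.0325.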
The context window is the maximum number of tokens (input + output combined) a model can process in a single request. GPT-5 handles 400,000 tokens, Claude Opus 4.7 up to 1 million, Gemini 2.5 Pro 1 million, Llama 4 Scout up to 10 million. If your prompt plus expected output exceeds the context window, the model will reject the request or truncate. freetokencounter.app shows a live context-window meter for the selected model.
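The fit check behind that meter reduces to a single comparison, sketched here:

```javascript
// A request fits only if prompt tokens plus the output budget
// stay within the model's combined context window.
function fitsContext(promptTokens, maxOutputTokens, contextWindow) {
  return promptTokens + maxOutputTokens <= contextWindow;
}
```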
Yes. freetokencounter.app handles code, JSON, markdown, and any Unicode text including non-Latin scripts. Note that code and non-English text typically use 30–80% more tokens than equivalent English prose, because their characters fall outside the most common subword merges. The tool shows accurate counts for any text you paste.