AI Token Counter

Estimate token counts and API costs for GPT-4o, Claude, Gemini and more. Paste text, see results instantly. 100% client-side.

Pricing Reference (per 1M tokens)

Model              Input     Output
GPT-4o             $2.50     $10.00
GPT-4o Mini        $0.15     $0.60
Claude Sonnet 4    $3.00     $15.00
Claude Opus 4      $15.00    $75.00
Gemini 1.5 Pro     $1.25     $5.00
Gemini 1.5 Flash   $0.075    $0.30

How it works: Token counts are estimated using a hybrid of word-based (~1.3 tokens/word) and character-based (~4 chars/token) heuristics, averaged for accuracy. Actual token counts vary by model tokenizer. For exact counts, use each provider's official tokenizer API. All processing runs in your browser — no data is sent anywhere.
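The hybrid heuristic described above fits in a few lines. A minimal sketch using the stated ratios (~1.3 tokens/word, ~4 chars/token) — real tokenizers will produce somewhat different counts:

```python
def estimate_tokens(text: str) -> int:
    """Hybrid token estimate: average of a word-based heuristic
    (~1.3 tokens/word) and a character-based one (~4 chars/token)."""
    word_based = len(text.split()) * 1.3
    char_based = len(text) / 4
    return round((word_based + char_based) / 2)

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # → 11
```

Averaging the two heuristics helps because word-based estimates overshoot on short-word prose while character-based estimates overshoot on long technical terms.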


How to Count Tokens for LLM API Calls

Every LLM API call is billed by tokens — sub-word units that determine both your cost and whether your prompt fits within the model's context window. This AI token counter estimates token counts for GPT-4o, Claude, Gemini, and other popular models, and shows the estimated API cost in real time.

Token counts vary by model because each uses a different tokenizer. GPT-4o uses o200k_base (earlier GPT-4 and GPT-3.5 models used cl100k_base), Claude uses Anthropic's own BPE tokenizer, and Gemini uses SentencePiece. This tool uses calibrated heuristics (word-based and character-based) to estimate tokens for each model family without requiring their actual tokenizer libraries — giving you a fast, privacy-preserving estimate that runs entirely in your browser.

Paste your prompt, system message, or expected output to see how many tokens it consumes and what it'll cost. This is especially useful for budgeting API calls, staying within context window limits, and comparing cost across providers before committing to a model.

Tips

  • Token counts are estimates with ~10% variance. For exact counts, use the model provider's official tokenizer — but this tool is faster for quick budgeting.
  • Input tokens are cheaper than output tokens for every model listed here — typically by 4-5x. If your use case generates long responses, focus on constraining output length to control costs.
  • Context window limits are in total tokens (input + output combined). Leave headroom for the model's response when designing prompts.
  • Gemini 1.5 Flash is roughly 30x cheaper than GPT-4o and around 200x cheaper than Claude Opus at the rates in the pricing table — use the cost comparison to pick the right model for each job.
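The context-window tip above can be made mechanical: compute how many tokens remain for the response before sending. A sketch assuming commonly published window sizes (128K total tokens for GPT-4o, 200K for Claude Sonnet, 1M+ for Gemini 1.5 Pro) — verify current limits in each provider's documentation:

```python
# Total context window in tokens, input + output combined (assumed values).
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "claude-sonnet-4": 200_000,
    "gemini-1.5-pro": 1_000_000,
}

def max_output_budget(model: str, input_tokens: int,
                      safety_margin: int = 500) -> int:
    """Tokens left for the model's response after the prompt,
    keeping a safety margin for estimation error."""
    remaining = CONTEXT_WINDOWS[model] - input_tokens - safety_margin
    return max(remaining, 0)

print(max_output_budget("gpt-4o", 120_000))  # → 7500
```

If the budget comes back at or near zero, either trim the prompt or move to a model with a larger window.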