Tokens: The Currency of AI
Every interaction you have with an AI model — every question you ask, every document you paste, every answer you receive — is measured in tokens. If you want to understand AI costs, speed, and limitations, you need to understand tokens. They are the fundamental unit of account for all three.
What Exactly Is a Token?
A token is a chunk of text that the AI model processes as a single unit. Tokens are not the same as words. They are not the same as characters. They are sub-word units created by a process called tokenization, which breaks text into pieces that balance vocabulary size with efficiency.
Here is how tokenization plays out with typical examples:
- "cat" = 1 token (common short words are single tokens)
- "hello" = 1 token
- "unconstitutional" = ~4 tokens (e.g., "un" + "const" + "itution" + "al"; exact splits vary by tokenizer)
- "AI" = 1 token
- "artificial intelligence" = 2 tokens
- "New York City" = 3 tokens
- "123456789" = 3-4 tokens (numbers are tokenized in chunks)
- Emojis like "😀" = 1-3 tokens depending on the emoji
- Code like `function calculateTotal(items) {` = roughly 7-8 tokens
The key insight: common words and word fragments get their own token, while rare or long words get split into multiple tokens. This is why "the" is always 1 token but "pneumonoultramicroscopicsilicovolcanoconiosis" might be 12+ tokens.
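The longest-match splitting described above can be sketched with a toy tokenizer. The vocabulary below is a hypothetical hand-picked set, not a real model's vocabulary, and real tokenizers use byte-pair encoding with vocabularies learned from data — this is only meant to show why rare words break into several pieces:

```python
# Toy greedy longest-match tokenizer over a hypothetical vocabulary.
# Real tokenizers (e.g. BPE) learn tens of thousands of subwords from
# data; this sketch only illustrates why rare words split into pieces.
TOY_VOCAB = {"the", "cat", "hello", "un", "const", "itution", "al"}

def toy_tokenize(word: str) -> list[str]:
    """Split a word by repeatedly taking the longest vocabulary prefix."""
    tokens, rest = [], word
    while rest:
        for end in range(len(rest), 0, -1):  # try longest prefix first
            if rest[:end] in TOY_VOCAB:
                tokens.append(rest[:end])
                rest = rest[end:]
                break
        else:
            # Unknown character: emit it alone (real tokenizers fall
            # back to bytes instead of raising an error).
            tokens.append(rest[0])
            rest = rest[1:]
    return tokens

print(toy_tokenize("cat"))               # ['cat'] -> 1 token
print(toy_tokenize("unconstitutional"))  # ['un', 'const', 'itution', 'al'] -> 4 tokens
```

A common word like "cat" survives intact, while "unconstitutional" falls apart into whatever fragments the vocabulary happens to contain.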
Token Counting Rules of Thumb
You do not need a calculator. Use this table for quick estimates:
| Content Type | Approximate Token Count |
|---|---|
| 1 English word | ~1.3 tokens (on average) |
| 1 page of text (~500 words) | ~650-700 tokens |
| A short email (100 words) | ~130 tokens |
| A one-page memo (400 words) | ~520 tokens |
| A 10-page report | ~6,500-7,000 tokens |
| A full-length novel (80,000 words) | ~100,000 tokens |
| 1 line of Python code | ~10-15 tokens |
| 100 lines of code | ~1,000-1,500 tokens |
| 1,000 tokens | ~750 words |
| 1 token | ~4 characters (in English) |
Important caveats:
- Non-English languages typically use more tokens per word. Chinese, Japanese, and Korean can use 2-3x more tokens for the same meaning.
- Code tends to use more tokens per "concept" than prose because of syntax characters.
- Structured data (JSON, XML) is token-heavy due to brackets, keys, and formatting characters.
- Whitespace and punctuation consume tokens too.
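The two conversion rules at the bottom of the table (~4 characters per token, ~1.3 tokens per English word) are enough for a quick estimator. This is a rough sketch for English prose; per the caveats above, code, JSON, and non-English text usually run higher:

```python
# Rough token estimators based on the rules of thumb in the table:
# ~4 characters per token, or ~1.3 tokens per English word.
# Estimates only — code, structured data, and non-English text
# typically consume more tokens than these formulas suggest.

def estimate_tokens_by_chars(text: str) -> int:
    """Estimate using the ~4 characters-per-token rule."""
    return max(1, round(len(text) / 4))

def estimate_tokens_by_words(text: str) -> int:
    """Estimate using the ~1.3 tokens-per-word rule."""
    return max(1, round(len(text.split()) * 1.3))

memo = "word " * 400  # a stand-in for a 400-word memo
print(estimate_tokens_by_words(memo))  # ~520, matching the table row
```

For quick budgeting, either rule lands close enough; when the two estimates disagree badly, the text is probably code or structured data.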
Why Tokens Matter: The Three Costs
Cost #1: Money. You pay per token — both input tokens (what you send) and output tokens (what the model generates). Output tokens are typically 3-5x more expensive than input tokens. A typical GPT-4 API call processing a 2,000-word document and generating a 500-word response costs roughly $0.05-0.15 depending on the model. That sounds small, but at scale — say 10,000 calls per day — it adds up to $500-1,500 daily.
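The billing arithmetic is simple enough to sketch. The per-token prices below are illustrative placeholders, not current rates for any specific model — only the structure (separate input and output rates, with output priced higher) reflects how API providers actually bill:

```python
# Hypothetical prices for illustration only -- check your provider's
# current pricing page before budgeting.
INPUT_PRICE_PER_1K = 0.01    # $ per 1,000 input tokens (assumed)
OUTPUT_PRICE_PER_1K = 0.03   # $ per 1,000 output tokens (assumed, 3x input)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call: input and output billed at separate rates."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# A 2,000-word document (~2,600 tokens in) plus a 500-word reply (~650 out):
cost = call_cost(2600, 650)
print(f"${cost:.4f} per call, ${cost * 10_000:.2f} per day at 10,000 calls")
```

Even at these placeholder rates, a few cents per call compounds into hundreds of dollars a day at volume, which is the point of the paragraph above.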
Cost #2: Speed. More tokens = slower responses. The model generates tokens one at a time (even if it displays them in chunks). A 100-token response takes about 1-2 seconds. A 2,000-token response takes 15-30 seconds. If you ask for a 5,000-word essay, you will wait.
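Because generation is one token at a time, latency grows roughly linearly with output length. The 100 tokens-per-second rate below is an illustrative assumption consistent with the figures above, not a measured number for any particular model:

```python
# Back-of-envelope generation latency: roughly linear in output tokens.
# The default rate is an assumed illustrative figure, not a benchmark.

def estimated_seconds(output_tokens: int, tokens_per_sec: float = 100.0) -> float:
    """Estimate wall-clock generation time for a response."""
    return output_tokens / tokens_per_sec

print(estimated_seconds(100))   # 1.0 s  -- a short answer
print(estimated_seconds(2000))  # 20.0 s -- a long response
```

The takeaway: asking for less output is the single easiest latency optimization.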
Cost #3: Limits. Every model has a maximum context window (more on this below). Tokens consumed by your input reduce the space available for the model's output. If you paste a 50-page document into a model with a 4K token limit, it literally cannot process it.
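The limit check itself is a single inequality: input tokens plus the output you want must fit inside the context window. Using the table's estimate of ~700 tokens per page, a 50-page document is roughly 35,000 tokens; the window sizes below are illustrative:

```python
# Context-window budgeting: input plus requested output must fit.
# Window sizes here are illustrative -- check your model's documentation.

def fits_in_context(input_tokens: int, max_output_tokens: int,
                    context_window: int) -> bool:
    """True if the request fits within the model's context window."""
    return input_tokens + max_output_tokens <= context_window

# A 50-page document (~35,000 tokens) against a 4K-token model:
print(fits_in_context(35_000, 500, 4_096))    # False -- it cannot fit
# The same document against a 128K-token model:
print(fits_in_context(35_000, 500, 128_000))  # True
```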