What is a token in AI? Prompt examples
What is a token in AI? – A piece of text, usually a word (or part of a word), that we send to the LLM. The same goes for the response (roughly, number of words = number of tokens). How to use them? Since we pay for them, rather cautiously: we want to send the minimum and get the maximum out of every request. Pretty much the basics of economics.
Below are a couple of prompts I used to analyze my token usage. Some of the answers are obviously hallucinations, but on the other hand we get a pretty decent breakdown of all the data I sent during that coding session.
Tokens in AI and how to optimize them? TL;DR
- Context files (52k tokens) = biggest impact on quality (95%)
- Cost is always linear, quality is never linear
- First 10k tokens provide 80% of value
- Specificity > Length for prompts
- MCP tools add minimal cost but huge value
- Reset sessions every 20-30 exchanges for efficiency (the "lost in the middle" problem)
- 3-5 key files > entire project dump
- Output costs 5× more than input
- Pareto principle: 20% of best context provides 80% of quality
Average cost per prompt
Remember that a more complex prompt gets more work done with higher precision and should save you time in the long run.
┌──────────────────────────┬────────┬─────────────────────┐
│ Prompt Type              │ Tokens │ Cost                │
├──────────────────────────┼────────┼─────────────────────┤
│ Simple („fix this bug”)  │ ~1,500 │ ~$0.016 (~0.06 PLN) │
│ Medium (code + analysis) │ ~5,000 │ ~$0.039 (~0.16 PLN) │
│ Complex (your avg)       │ ~6,500 │ ~$0.050 (~0.20 PLN) │
│ Very complex (detailed)  │ ~7,000 │ ~$0.081 (~0.33 PLN) │
└──────────────────────────┴────────┴─────────────────────┘
Remember not to burn tokens 🙂 Most LLM providers are starting to charge per request. It will push all of us to be very precise and to put more effort into the quality of our „discussions” with LLMs.

Token Usage & Breakdown
Full Prompt:
“Can you show me current token usage and how many tokens are used by the context window, instructions, my prompts and MCP servers”
Summary:
- Context files consume 52% of tokens
- Output costs 5× more than input
- Average cost per exchange: $0.020
Tip: Attach only 3–5 key files, not the entire project.
Token Pricing & Average Cost
Full Prompt:
“What is the price of the tokens, what would be the average price for a prompt?”
┌────────────────────┬──────────────┬────────────────┐
│ Model              │ Input ($/1M) │ Output ($/1M)  │
├────────────────────┼──────────────┼────────────────┤
│ Claude 3.5 Sonnet  │ $3.00        │ $15.00 ← Used  │
│ Claude 3 Opus      │ $15.00       │ $75.00         │
│ GPT-4 Turbo        │ $10.00       │ $30.00         │
│ GPT-3.5 Turbo      │ $0.50       │ $1.50          │
└────────────────────┴──────────────┴────────────────┘
Input tokens: ~70,000 × $0.000003 = $0.21
Output tokens: ~24,000 × $0.000015 = $0.36
Summary:
- Claude 3.5 Sonnet: $3 input / $15 output per 1M tokens
- Your session: $0.57 (~2.30 PLN)
- ROI: 7,000–21,000% compared to developer time saved.
LOL LOL – Claude went overboard on this one 😀
Rule: Starbucks coffee costs 9× more than your AI session.
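The session arithmetic above can be reproduced in a few lines of Python. The rates come from the pricing table; the ~70,000 input and ~24,000 output token counts are the estimates Claude reported for my session:

```python
# Session-cost check using Claude 3.5 Sonnet rates ($ per 1M tokens).
INPUT_RATE = 3.00 / 1_000_000    # $3 per 1M input tokens
OUTPUT_RATE = 15.00 / 1_000_000  # $15 per 1M output tokens

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of a session in dollars."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

in_cost = 70_000 * INPUT_RATE    # input side:  $0.21
out_cost = 24_000 * OUTPUT_RATE  # output side: $0.36
total = session_cost(70_000, 24_000)
print(f"input: ${in_cost:.2f}, output: ${out_cost:.2f}, total: ${total:.2f}")
# input: $0.21, output: $0.36, total: $0.57
```

Note how the output side dominates even though it is a third of the token count – that is the 5× output premium at work.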
Tokenization Example: “Hello W”
Full Prompt:
“Can you show me how you would tokenize the prompt 'Hello W’”
Summary:
- “Hello W” → [9906, 468] (2 tokens) – Cost: $0.000006
Insight: Tokenizers combine frequent patterns like “ W”.
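Here is a toy sketch of the idea. Real tokenizers (BPE) learn their vocabulary from data; this hypothetical vocabulary just illustrates why “Hello W” collapses to 2 tokens – frequent strings, including the leading space in “ W”, get their own IDs. The two multi-character IDs match the ones reported above; the single-character IDs are made up for the example:

```python
# Toy longest-match tokenizer over a hypothetical vocabulary.
# Frequent patterns ("Hello", " W") have their own IDs, so greedy
# longest-match never falls back to the single-character entries here.
VOCAB = {"Hello": 9906, " W": 468,
         "H": 1, "e": 2, "l": 3, "o": 4, " ": 5, "W": 6}  # filler IDs

def tokenize(text: str) -> list[int]:
    ids = []
    i = 0
    while i < len(text):
        # Greedily take the longest vocabulary entry matching at position i.
        match = max((s for s in VOCAB if text.startswith(s, i)), key=len)
        ids.append(VOCAB[match])
        i += len(match)
    return ids

print(tokenize("Hello W"))  # [9906, 468]
```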

Top 5 Predictions: “Hello …”
Full Prompt:
“Show me the most probable returns (top 5) for the prompt 'Hello …’. Tell me what would be instead of '…’, if you can estimate the probability for each response”
I will add that any diacritic signs cost an additional token, but they do help anchor the answer in the language we are using. Remember, it is all based on probability.
Most likely completions:
- Hello World (45%)
- Hello there (25%)
- Hello! (15%)
- Hello, how are you? (10%)
- Hello [Name] (5%)
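The "it is all based on probability" point can be sketched directly: greedy decoding always takes the most probable completion, while sampling picks proportionally to probability. A minimal sketch using the distribution Claude estimated above (the percentages themselves are the model's guess, not measured data):

```python
import random

# Claude's estimated next-completion distribution for "Hello ..."
COMPLETIONS = {
    "Hello World": 0.45,
    "Hello there": 0.25,
    "Hello!": 0.15,
    "Hello, how are you?": 0.10,
    "Hello [Name]": 0.05,
}

def top_k(dist: dict[str, float], k: int) -> list[str]:
    """Greedy view: the k most probable completions."""
    return sorted(dist, key=dist.get, reverse=True)[:k]

def sample(dist: dict[str, float]) -> str:
    """Sampling view: pick a completion proportionally to its probability."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

print(top_k(COMPLETIONS, 1))  # ['Hello World'] – greedy always takes the mode
print(sample(COMPLETIONS))    # varies from run to run
```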

Context Impact Analysis
Full Prompt:
“Can you show me a table what makes the biggest impact on the prompt for my current setup. Is it linearly dependent of the amount of input from prompt / instructions / MCP servers or tools?”
This should show us where we 'leak' tokens and which part of our prompting / vibing we should focus on.
Formula: quality = 100 × (1 – e^(-context/k))
Impact
   ^
100% | ████████████████████
 90% | ██████
 80% | █████
 70% | ████
 60% | ███
 50% | ██
 40% | █
 30% | █
 20% |
 10% |
  0% +────────────────────────────────────> Files
       0  1  2  3  4  5  6  7  8  9  10 11 12
- Cost = linear, Quality = non-linear
- First 10k tokens → 80% value
- Next 36k → +15% only
Pareto: 20% of best context gives 80% quality.
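The saturation curve can be evaluated directly. The scale constant k in the formula is a free parameter; below it is chosen so that 10k tokens lands at the 80% mark (an assumption to match the numbers above, not a measured constant):

```python
import math

# Assumed scale: k chosen so that quality(10_000) == 80%.
K = 10_000 / math.log(5)

def quality(context_tokens: float) -> float:
    """Saturating quality curve: 100 * (1 - e^(-context/k))."""
    return 100 * (1 - math.exp(-context_tokens / K))

def cost(context_tokens: float, rate_per_token: float = 3e-6) -> float:
    """Cost, by contrast, is simply linear in tokens."""
    return context_tokens * rate_per_token

for c in (5_000, 10_000, 20_000, 46_000):
    print(f"{c:>6} tokens -> quality {quality(c):5.1f}%, cost ${cost(c):.3f}")
```

Doubling context from 10k to 20k tokens adds far less quality than the first 10k did, while the cost column keeps climbing at the same linear rate – which is exactly the Pareto argument above.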
Summary
Context files are essential for quality, more than anything else. In this example, around 52,000 tokens of context delivered 95% of the solution. While cost scales linearly, quality does not: the first 10,000 tokens give you 80% of the value, and after that returns diminish quickly. Remember that when you are crafting prompts, specificity beats length every time. MCP tools are a secret weapon (Context7). Reset your sessions every 20 to 30 exchanges, or whenever a new topic comes up. Focus on attaching three to five key files instead of dumping the whole project. And the most universal rule of them all: the Pareto principle.
Remember that no matter what, common sense is the best approach 🙂


