Every token in your prompt costs money and consumes context window space. These 10 techniques can cut your token usage by 30–60% without losing any meaning or degrading output quality.
1. Remove Filler Phrases
Conversational padding adds tokens without adding information.
Before (~18 tokens): "I would like you to please help me write a summary of the following text"
After (~7 tokens): "Summarize this text:"
2. Use Abbreviations in System Prompts
Models understand shorthand in instructions. You don't need full sentences for every rule.
Before: "You should always respond in JSON format and make sure to include all required fields"
After: "Respond in JSON. Include all required fields."
3. Replace Examples with Patterns
Instead of showing 5 examples of the same pattern, show 1 example and describe the pattern.
Before (5 examples): ~150 tokens showing name/email/phone formatting
After (1 example + rule): ~50 tokens with "Format: {field}: {value}, one per line. Example: Name: John Doe"
4. Compress Repeated Structures
If you're passing multiple items with the same structure, define the schema once and pass data compactly.
Before: "The first product is called Widget, it costs $10, and it's in the Tools category. The second product is called Gadget, it costs $25..."
After: "Products (name|price|category): Widget|$10|Tools, Gadget|$25|Electronics"
5. Use Structured Formats for Data
CSV or pipe-delimited data uses far fewer tokens than prose or even JSON.
JSON (~45 tokens):
{"users": [{"name": "Alice", "role": "admin"}, {"name": "Bob", "role": "user"}]}
CSV (~20 tokens):
name,role
Alice,admin
Bob,user
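If the data already lives in Python objects, the standard library handles the conversion. A sketch, assuming the same users as above:

import csv, io, json

users = [{"name": "Alice", "role": "admin"}, {"name": "Bob", "role": "user"}]

as_json = json.dumps({"users": users})  # the verbose form

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "role"])
writer.writeheader()
writer.writerows(users)
as_csv = buf.getvalue()  # the compact form, roughly half the tokens here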
6. Trim Whitespace and Formatting
Extra blank lines, indentation, and decorative formatting all consume tokens. A line of dashes "----------" costs tokens for zero information.
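A small preprocessing pass handles this automatically. A minimal sketch in Python (the decorative-rule pattern is one heuristic, not a complete cleaner):

import re

def trim(text):
    # Drop decorative rules like "----------" that sit on their own line.
    text = re.sub(r"^[-=_*]{4,}$", "", text, flags=re.M)
    # Strip trailing spaces and tabs at line ends.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse runs of blank lines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()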
7. Reference Instead of Repeat
If the same instruction appears in multiple places, define it once and reference it.
Before: Repeating "respond in JSON with fields: id, name, status" in 4 different places
After: Define once as "Format A" at the top, then say "Use Format A" elsewhere
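A sketch of how that system prompt might be assembled (the "Format A" name follows the example above; the individual rule strings are illustrative):

FORMAT_A = "Format A = JSON with fields: id, name, status"
system_prompt = "\n".join([
    FORMAT_A,
    "For search results, use Format A.",
    "For user lookups, use Format A.",
    "For audit entries, use Format A.",
])
# One definition plus three short references, instead of four full copies.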
8. Use max_tokens to Limit Output
Setting max_tokens in your API call doesn't reduce input tokens, but it prevents the model from generating unnecessarily long responses. Output tokens are 2–4x more expensive than input tokens at most providers.
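A sketch with the OpenAI Python SDK (the v1-style client; assumes OPENAI_API_KEY is set in the environment):

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this text: ..."}],
    max_tokens=150,  # hard cap on output tokens, the expensive ones
)
print(response.choices[0].message.content)

Note that a hard cap can cut the model off mid-sentence, so pair it with an instruction like "Answer in under 100 words" when you need clean endings.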
9. Strip Irrelevant Context
When passing documents for analysis, remove boilerplate that doesn't affect the answer: headers, footers, navigation text, legal disclaimers, and repeated metadata.
A web page scraped with all its HTML might use 5,000 tokens. The actual content might be 800 tokens. Extract the relevant text before including it in your prompt.
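One way to do the extraction, sketched with BeautifulSoup (assumes beautifulsoup4 is installed; the tag list is a starting point, not exhaustive):

from bs4 import BeautifulSoup

def extract_text(html):
    soup = BeautifulSoup(html, "html.parser")
    # Drop boilerplate elements wholesale before extracting text.
    for tag in soup(["script", "style", "nav", "header", "footer", "aside"]):
        tag.decompose()
    return soup.get_text(" ", strip=True)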
10. Use Shorter Variable Names in Templates
If you're using prompt templates with placeholders, shorter keys and labels save tokens on every request: the literal text around each placeholder is sent verbatim, so a verbose label costs tokens every single time.
Before: "The customer_full_name is {customer_full_name} and their email_address is {email_address}"
After: "Name: {name}, Email: {email}"
Measuring Your Savings
After applying these techniques, measure the difference. A prompt that goes from 2,000 tokens to 800 tokens saves 1,200 tokens, or 60%, on every request. At 100,000 requests per day, that's 120 million input tokens saved daily; at GPT-4o input pricing ($2.50/M tokens), the savings come to $300 per day, or roughly $9,000 per month, from a single prompt. Multiply across all your prompts and the savings compound quickly.
The best optimization is the one you measure. Count your tokens before and after every change.
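A sketch of the counting step with tiktoken (assumes the package is installed; recent versions map gpt-4o to its encoding):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

def n_tokens(text):
    return len(enc.encode(text))

before = "I would like you to please help me write a summary of the following text"
after = "Summarize this text:"
print(n_tokens(before), "->", n_tokens(after))  # exact counts depend on the tokenizer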