60 ways to spend fewer tokens
The 22 Beginner tips are free to read. The 38 advanced tactics unlock with Pro — plus a fresh tip in your inbox every morning.
Stop Paying Frontier Prices for Boilerplate Work
Most of your token spend is on tasks a small model handles perfectly. Match the model to the job instead of defaulting to your most expensive option for everything.
Cascade: Try the Cheap Model First, Escalate Only When It Fails
Send every request to a small model first, programmatically check the answer, and only escalate to a frontier model when the cheap one falls short.
Don't Burn Reasoning Tokens on Tasks That Don't Reason
Model selection isn't just which model — it's which reasoning mode. Turn thinking down or off for straightforward work and reserve deep reasoning for genuinely hard problems.
Use a Big Model as the Planner, Small Models as the Workers
In agentic and multi-step pipelines, reserve the frontier model for orchestration and hard reasoning, and delegate bulk subtasks (search, read, extract) to a cheaper model.
Like what you see?
Get a fresh one in your inbox — weekly free, daily on Pro.