The library

60 ways to spend fewer tokens

The 22 Beginner tips are free to read. The 38 advanced tactics unlock with Pro — plus a fresh tip in your inbox every morning.

All ⚙️ Batching & Automation (7) 💻 Coding Assistants (7) 🧠 Context Management (7) 📊 Measurement & Budgeting (7) 🎚️ Model Selection (4) 📐 Output Control (7) ♻️ Prompt Caching & Reuse (7) ✍️ Prompt Engineering (7) 🔎 Retrieval & RAG (7)

🎚️Model Selection 60-80% on routed traffic

Stop Paying Frontier Prices for Boilerplate Work

Most of your token spend is on tasks a small model handles perfectly. Match the model to the job instead of defaulting to your most expensive option for everything.

Beginner 1 min Read →

🎚️Model Selection 🔒 Pro

Cascade: Try the Cheap Model First, Escalate Only When It Fails

Send every request to a small model first, programmatically check the answer, and only escalate to a frontier model when the cheap one falls short.

Intermediate 2 min Unlock →

🎚️Model Selection 🔒 Pro

Don't Burn Reasoning Tokens on Tasks That Don't Reason

Model selection isn't just which model — it's which reasoning mode. Turn thinking down or off for straightforward work and reserve deep reasoning for genuinely hard problems.

Intermediate 2 min Unlock →

🎚️Model Selection 🔒 Pro

Use a Big Model as the Planner, Small Models as the Workers

In agentic and multi-step pipelines, reserve the frontier model for orchestration and hard reasoning, and delegate bulk subtasks (search, read, extract) to a cheaper model.

Advanced 2 min Unlock →

Like what you see?

Get a fresh one in your inbox — weekly free, daily on Pro.

Subscribe free Go Pro — $9/mo