Deduplicate and Cache Identical Requests Before They Ever Hit the API

Typical savings: varies, commonly 20-60% on duplicate-heavy workloads · Beginner · 1 min read

Batch inputs are rarely unique. Support tickets repeat boilerplate, product feeds re-list the same SKU, scraped pages share templates. Every duplicate you send is a token bill you didn't need to pay — the model recomputes an answer you already have.

Before (wasteful):

results = []
for text in items:  # 10,000 items, but only 6,200 are distinct
    results.append(classify(text))  # pays for all 10,000

After (lean):

import hashlib

def key(t): return hashlib.sha256(t.encode()).hexdigest()

unique = {key(t): t for t in items}          # collapse to 6,200 distinct
answers = {k: classify(v) for k, v in unique.items()}  # pay once each
results = [answers[key(t)] for t in items]    # fan back out, free

For work that recurs across runs (a nightly job over a slowly-changing catalog), persist answers to disk or Redis keyed by the hash, and skip anything you've already resolved. A normalization step before hashing — lowercasing, trimming whitespace, stripping volatile fields like timestamps — turns near-duplicates into exact ones and raises your hit rate.
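A minimal sketch of the cross-run version, assuming SQLite as the on-disk store (Redis or a plain file would work the same way). The names `normalize`, `cache_key`, `open_cache`, and `cached_classify` are hypothetical, as is the timestamp pattern; `classify` stands in for whatever function makes your API call and is assumed to return a string.

```python
import hashlib
import re
import sqlite3

# Hypothetical volatile field: ISO-style timestamps such as 2024-01-02T10:30:00.
# Swap in whatever actually changes between otherwise identical inputs.
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?")

def normalize(text: str) -> str:
    """Strip timestamps, lowercase, and collapse runs of whitespace."""
    text = _TIMESTAMP.sub("", text).lower()
    return " ".join(text.split())

def cache_key(text: str) -> str:
    """Hash the normalized text so near-duplicates share one key."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()

def open_cache(path: str = "answers.db") -> sqlite3.Connection:
    """Open (or create) the on-disk answer store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers (key TEXT PRIMARY KEY, answer TEXT)"
    )
    return conn

def cached_classify(conn: sqlite3.Connection, text: str, classify) -> str:
    """Return a stored answer if one exists; otherwise call the API once and store it."""
    k = cache_key(text)
    row = conn.execute("SELECT answer FROM answers WHERE key = ?", (k,)).fetchone()
    if row is not None:
        return row[0]  # cache hit: no request sent
    answer = classify(text)  # cache miss: the only place tokens are billed
    conn.execute("INSERT OR REPLACE INTO answers VALUES (?, ?)", (k, answer))
    conn.commit()
    return answer
```

Aggressive normalization trades accuracy for hit rate: every input that maps to the same key receives the answer computed for the first variant seen, so only strip fields that cannot change the correct output.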

Why it saves tokens: the API is stateless and prices every request independently; it has no idea two prompts are identical. A hash lookup is effectively free and short-circuits the entire round trip, so no input or output tokens are billed for the duplicate. Savings scale directly with your duplicate rate: a feed in which 40% of items are repeats of something already seen needs 40% fewer calls, and costs roughly 40% less when requests are similar in size.
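To check whether a batch is worth deduplicating before wiring anything up, you can measure its duplicate rate offline. The sketch below is an illustration, not a billing calculator: `estimate_savings` is a hypothetical helper, and character count is used only as a crude stand-in for input tokens.

```python
import hashlib

def estimate_savings(items):
    """Return (fraction of calls avoided, fraction of characters avoided)."""
    seen = set()
    kept_chars = 0
    total_chars = sum(len(t) for t in items)
    for t in items:
        k = hashlib.sha256(t.encode("utf-8")).hexdigest()
        if k not in seen:  # first occurrence: this one would still be sent
            seen.add(k)
            kept_chars += len(t)
    calls_avoided = 1 - len(seen) / len(items) if items else 0.0
    # Characters are an assumed proxy for tokens; real counts depend on the tokenizer.
    chars_avoided = 1 - kept_chars / total_chars if total_chars else 0.0
    return calls_avoided, chars_avoided
```

For the batch in the example above (10,000 items, 6,200 distinct), the first figure is 1 - 6,200/10,000 = 0.38. The two figures diverge when duplicates are unusually long or short, which is why the second is the better guide to the bill.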

Note this is application-level dedup and is distinct from prompt caching (which discounts a shared prefix across differing requests). The two compose: dedup eliminates identical calls; caching discounts the common context in the calls that remain. Always dedup first — it's cheaper to not send a request than to send a discounted one.

Applies to: Claude API, OpenAI API, Gemini API, any LLM API