Context pruning strategies — CCA-F Exam Prep
L2.27|Context pruning strategies
Two customer support agents. Same product. Same context window. Wildly different results.
Agent A keeps every turn verbatim. By turn 30, context is 90K tokens. The model is slow, expensive, and starts contradicting itself: it can't find the relevant information buried in 30 turns of small talk.
Agent B summarizes old turns and keeps the last 5 verbatim. At turn 30, context is 15K tokens. Fast, cheap, and the model stays focused because the signal-to-noise ratio is high.
Same information. One-sixth the tokens. Better results. The difference is pruning.
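Agent B's strategy can be sketched in a few lines. This is a minimal illustration, not a production implementation: the turn format (`role`/`content` dicts) and the `summarize` callable are assumptions; in a real agent, `summarize` would be a call to a model, and here it is only a placeholder.

```python
# Sketch of Agent B's pruning: collapse all but the last few turns into
# one summary turn. `summarize` is a stand-in for a real LLM call.

def prune_context(turns, keep_verbatim=5, summarize=None):
    """Return a pruned history: one summary turn plus the last N verbatim turns."""
    if len(turns) <= keep_verbatim:
        return list(turns)  # nothing old enough to prune
    older, recent = turns[:-keep_verbatim], turns[-keep_verbatim:]
    if summarize is None:
        # Placeholder summarizer; a real agent would call a model here.
        summarize = lambda ts: f"[Summary of {len(ts)} earlier turns]"
    return [{"role": "system", "content": summarize(older)}] + recent

# 30-turn history, as in the Agent B scenario above.
history = [{"role": "user", "content": f"turn {i}"} for i in range(30)]
pruned = prune_context(history)
print(len(pruned))  # 6: one summary turn + 5 verbatim turns
```

The key design choice is the split point: the last `keep_verbatim` turns stay word-for-word (they carry the immediate task state), while everything older is compressed into a single turn, which is what drops 90K tokens to 15K without losing the thread of the conversation.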
