Context pruning strategies — CCA-F Exam Prep
L2.27|Context pruning strategies
Two customer support agents. Same product. Same context window. Wildly different results.
Agent A keeps every turn verbatim. By turn 30, context is 90K tokens. The model is slow, expensive, and starts contradicting itself: it can't find the relevant information buried in 30 turns of small talk.
Agent B summarizes old turns and keeps the last 5 verbatim. At turn 30, context is 15K tokens. Fast, cheap, and the model stays focused because the signal-to-noise ratio is high.
Same information. One-sixth the tokens. Better results. The difference is pruning.
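Agent B's strategy can be sketched in a few lines. This is a minimal illustration, not a production implementation: the turn format (`role`/`content` dicts) and the `summarize` callable are assumptions; in a real agent, `summarize` would be a call to a model, and here it is only a placeholder.

```python
# Sketch of Agent B's pruning: collapse all but the last few turns into
# one summary turn. `summarize` is a stand-in for a real LLM call.

def prune_context(turns, keep_verbatim=5, summarize=None):
    """Return a pruned history: one summary turn plus the last N verbatim turns."""
    if len(turns) <= keep_verbatim:
        return list(turns)  # nothing old enough to prune
    older, recent = turns[:-keep_verbatim], turns[-keep_verbatim:]
    if summarize is None:
        # Placeholder summarizer; a real agent would call a model here.
        summarize = lambda ts: f"[Summary of {len(ts)} earlier turns]"
    return [{"role": "system", "content": summarize(older)}] + recent

# 30-turn history, as in the Agent B scenario above.
history = [{"role": "user", "content": f"turn {i}"} for i in range(30)]
pruned = prune_context(history)
print(len(pruned))  # 6: one summary turn + 5 verbatim turns
```

The key design choice is the split point: the last `keep_verbatim` turns stay word-for-word (they carry the immediate task state), while everything older is compressed into a single turn, which is what drops 90K tokens to 15K without losing the thread of the conversation.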
