Safety and escalation scenarios — CCA-F Exam Prep

L3.040
PRODUCTION FAILURES

An agent got stuck in a loop and burned $400 in tokens before anyone noticed. Another agent helpfully explained how to exploit a security vulnerability. A third spent $2,400 on a single research query.

Building an agent that works is the easy part. Building one that fails safely is what separates passing scores from failing ones on the CCA-F. Five failure modes. Five defenses. Each defense is a locked door you forgot to install.