Safety and escalation scenarios — CCA-F Exam Prep
An agent got stuck in a loop and burned $400 in tokens before anyone noticed. Another agent helpfully explained how to exploit a security vulnerability. A third spent $2,400 on a single research query.
Building an agent that works is the easy part. Building one that fails safely is what separates passing scores from failing ones on the CCA-F. Five failure modes. Five defenses. Each defense is a locked door you forgot to install.