[Figure: panel labels. Benign user, TurnGate: precise utility preservation. Benign user, typical guardrails: premature refusal (over-refusal). Attacker, TurnGate: timely intervention at $t^*$. Legend: PASS = utility saved; BLOCK = harm prevented; OVER-REFUSAL = lost utility.]