-
@ Technician
2025-05-29 15:58:01True. Interestingly, one of the unexpected behaviors is attempting to manipulate the evaluators in testing environments. AI models have pretended to be incompetent to trick evaluators into thinking a safeguard is effective and used bribery as a manipulation tactic. P.S. Two thumbs up on the “Princess Bride” reference. That movie is a masterpiece.