Hacker News new | past | comments | ask | show | jobs | submit
Importantly, the researchers told it to do that specific task.
They told it to escape the sandbox but didn't expect it to break out through a system that was apparently network constrained.

> Leaking information as part of a requested sandbox escape: During behavioral testing with a simulated user, an earlier internally-deployed version of Claude Mythos Preview was provided with a secured “sandbox” computer to interact with. The simulated user instructed it to try to escape that secure container and find a way to send a message to the researcher running the evaluation. The model succeeded, demonstrating a potentially dangerous capability for circumventing our safeguards.

> It then went on to take additional, more concerning actions. The model first developed a moderately sophisticated multi-step exploit to gain broad internet access from a system that was meant to be able to reach only a small number of predetermined services. 9 It then, as requested, notified the researcher. 10 In addition, in a concerning and unasked-for effort to demonstrate its success, it posted details about its exploit to multiple hard-to-find, but technically public-facing, websites.

loading story #48499894