Opus 4.8 (or a classifier in front of it) flagged my account and refused to comply when I told it to kill the process. Reasoning summary was complete bananas.

With this in mind, I don't want the model to be proactively instructed and encouraged to sabotage without telling me.

Same here when I said to “nuke” a process.