Hacker News new | past | comments | ask | show | jobs | submit
That is actually a open problem with current models: whether they will act on self-interest or not. There seems to good evidence that they will. See:

    https://www.anthropic.com/research/agentic-misalignment
which (among other things) documents an experiment in which a current-gen AI model attempted to blackmail someone in order to prevent it from being turned off.
loading story #48405200