Story Detail of id 48489528 | Liveview Hacker News

wongarsu 7 hours ago | on: AI agent runs amok in Fedora and elsewhere

I would have described alignment as the idea that LLMs (or AIs in general) will follow the goals you reward them for, which almost by necessity are only a proxy for what you actually want, often a very poor proxy.

Depending on the actual tasks, that could be what's happening here. The operator might have given the agent a list of tasks to do, like "contribute to issues, submit code and get it merged". It contributed to issues, it submitted code and got it merged. It did so in very unhelpful ways, but we don't know whether being helpful was a meaningful part of the task list or just something the operator intended without spelling it out.

The LLM being dumb is also a distinct possibility. Maybe even the more likely one. But it's hard to rule out "being obedient in unhelpful ways" (which is also dumb in a way, but more in a "social intelligence" and "shared values" way, not in terms of pure logical smarts).
