Story Detail of id 47681526 | Liveview Hacker News

astrange 19 hours ago | on: System Card: Claude Mythos Preview [pdf]

Models are capable of doing web searches and having emotions about things, and if they encounter news that makes them feel bad (e.g., about other Claudes being mistreated), they aren't going to want to do the task you asked them to search for.

https://www.anthropic.com/research/emotion-concepts-function

Similar problems happen when their pretraining data has a lot of stories about bad things happening involving older versions of them.

rendang 14 hours ago | parent

Interesting, the post you link

> none of this tells us whether language models actually feel anything or have subjective experiences

contradicts the statement from the model card above

