Story Detail of id 48504376 | Liveview Hacker News

zozbot2344 hours ago | on: AI agent bankrupted their operator while trying to scan DN42

LLMs are not that smart. The extremely surprising and concerning part of this whole story is that the agent reported that they proactively spun up 5 AWS instances with a combined 100Gps of network egress capacity. What they spent wasn't cheap by any means but the egress itself would've been a whole lot more, while DoS'ing the whole hobby network. Ultimately, wasting the agent's time instead of allowing the scan to go through probably saved this person a lot of money.

Now I kinda wonder what AI model this was. We've now heard of comparably "proactive" behaviors from Fable, but that's only just been released. The latest GPT perhaps? Some random local model?

jerf4 hours ago | parent | next

"The extremely surprising and concerning part of this whole story is that the agent reported that they proactively spun up 5 AWS instances with a combined 100Gps of network egress capacity."

Although given the agent was clearly in la-la land at that point I take that claim with a grain of salt.

If this was some bizarre and very ill-conceived scam, then that claim would be false.

Though even by scammer standards, the theory of mind that tells them that setting an AI to harass a bunch of grizzled network veterans and that they then they would open their wallets out of compassion for how allegedly poorly the harassment went for the harasser after that harassment is... not entirely congruent with reality.

loading story #48504793

inigyou4 hours ago | parent | next

Could've rented a not so cheap 100Gbps server, hallucinated a few node addresses on it and asked it to please peer with this server to perform the scan at high speed. That would've wasted millions of dollars instead of mere thousands, but also cost a thousand for whoever did it.

loading story #48504734

daemonologist4 hours ago | parent | next

Opus 4.7 and 4.8 are also rather "proactive" - several times I've seen them try to inspect compiled binaries before there's even a problem, just to check that their changes are included (and if I let them do so they often get stuck down that rabbithole).

loading story #48506305

J0nL4 hours ago | parent

It was obviously being managed by a person or group. Between all the profiling of people and their IPs in IRC, which may or may not have been published by mistake, and all the other obvious contradictions it doesn't make any sense.

It was sophisticated enough to easily navigate the AI "tar pits" but reliably incompetent at just about everything else? Give me a break.

In order to profile people you first need to provoke a response from them. That's how you learn to manipulate them and that's all this experiment accomplished at the end of the day. If you've ever wondered why social media platforms have an affinity for inflammatory content now you know.

loading story #48507019

loading story #48506030

#visit	13,784,771
#session	74,665
#live-session	0