Story Detail of id 48500168 | Liveview Hacker News

simonw 14 hours ago | on: Claude Fable is relentlessly proactive

Yeah, you've exactly captured one of the main problems with the model being relentlessly proactive: it will happily burn like $5 of tokens to avoid asking the human to take a screenshot or click a button for it.

wild_egg 14 hours ago | parent | next

I'm actually very happy about this. Babysitting the agent just in case it needs me to do something is a terrible use of my time. I've always had to be very explicit about the various ways that it can get an automated feedback loop going to check its work, and now Fable doesn't even need that hand holding. Really great improvement all around.

OJFord 5 hours ago | parent | next

Have you tried instructing it not to do that? Something like "do not branch into side projects or hacky solutions to obtain information you could ask me for. For example: if you need a screenshot of the issue, just ask me to take a screenshot rather than find a way to reproduce and screenshot it."

zith 12 hours ago | parent | next

I used to complain about all the levels of indirection of modern software, running in a javascript jit, in a browser container, in a vm, on an os, etc.

I eventually just accepted it, but this new agent layer really takes things to a new level.

illiac786 12 hours ago | parent | next

Ha, you just gave me an idea. Add to the prompt “do not do things that will burn over X tokens if the human operator can do them in less than Y minutes; ask for it instead”.

I wonder if LLMs can estimate effort in tokens?
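
The rule itself is easy to write down; the hard part is the estimates. Here is a minimal sketch in Python, assuming the caller already has a token estimate and a human-time estimate from somewhere. The function name, parameters, and default thresholds are all hypothetical and only illustrate the comparison, not anything Claude actually does.

```python
def should_ask_human(
    estimated_tokens: int,
    human_minutes: float,
    token_threshold: int = 50_000,   # hypothetical "X tokens" cutoff
    minute_threshold: float = 2.0,   # hypothetical "Y minutes" cutoff
) -> bool:
    """Return True when the agent should ask the operator instead of acting.

    Both estimates are inputs supplied by the caller; whether a model can
    produce a reliable token estimate is exactly the open question.
    """
    too_expensive = estimated_tokens > token_threshold
    quick_for_human = human_minutes < minute_threshold
    return too_expensive and quick_for_human


if __name__ == "__main__":
    # Expensive for the agent, quick for the human: ask.
    print(should_ask_human(estimated_tokens=200_000, human_minutes=0.5))
    # Cheap for the agent: just do it.
    print(should_ask_human(estimated_tokens=10_000, human_minutes=0.5))
```

Both conditions have to hold before the agent interrupts, so a cheap task or a slow human step never triggers a question.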

0x6c6f6c 14 hours ago | parent

Honestly, Claude straight up ignores my input sometimes, preferring instead to run commands, process their output, and burn through a bunch of tokens thinking hard about whether to ignore me.

Like today, I told Claude the exact name of the folder it had gotten wrong (it was supposed to be prod, not production), and it disregarded my input and went off to examine the directory itself. It's a small example of the kind of thing it's been doing lately, but it's the one that's top of mind.
