Story Detail of id 48501745 | Liveview Hacker News

wraptile 9 hours ago | on: Claude Fable is relentlessly proactive

It feels like Fable is slightly smarter but overall a worse tool, exactly because of this.

It's constantly turning what should be a 50 LOC patch from a single prompt into a 30 minute exploration that is totally not worth it. Often it's even wrong.

I trialed it on some rather simple stuff: backfilling a redis dedupe cache after the hash function changed. Instead of running the new hash func on every db value to expand the cache, it implemented an overly complex cache update that tried to guess the hash func version of each cached value and recalculate only the old hashes. I can imagine this making sense in some context, maybe? But not 30 minutes of token burn that I ended up replacing with a 10-line for loop.
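For illustration, the simple backfill described above might look roughly like the sketch below. It is not the actual code from this comment: `new_hash`, `backfill_dedupe_cache`, and the choice of SHA-256 are all hypothetical, and a plain Python set stands in for the Redis cache so the snippet runs on its own.

```python
import hashlib


def new_hash(value: str) -> str:
    """Hypothetical replacement hash function (SHA-256 is only an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def backfill_dedupe_cache(db_values, cache: set) -> int:
    """Add the new-style hash of every stored value to the dedupe cache.

    Old-style hashes are left untouched; nothing tries to guess which
    hash function version produced an existing entry.
    Returns the number of hashes that were newly added.
    """
    added = 0
    for value in db_values:
        h = new_hash(value)
        if h not in cache:
            cache.add(h)
            added += 1
    return added


# Stand-in data: the set plays the role of the Redis dedupe cache,
# and the list plays the role of the values read from the db.
cache = {"old-hash-1", "old-hash-2"}
added = backfill_dedupe_cache(["a", "b", "a"], cache)
```

Against a real Redis set, the `cache.add(h)` call would presumably become an `SADD` on the dedupe key, which is already idempotent, so the membership check is only there to count new entries.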

I fear that this is generally bad news for programming. LLM tech is clearly running into a diminishing-returns wall on intelligence, but a response to that is to just make them more relentless, which is a pretty poor solution for everyone involved, except I guess the people who sell the tokens and the people who can afford those tokens to scan for 0-days.

bwfan123 3 hours ago | parent | next

> but a response to that is to just make them more relentless which is a pretty poor solution for everyone involved

I see two problems with LLMs & agents which won't be fixed, possibly ever.

1) They don't have causal models. All they can do is trial-and-error exploration, which works quite well for many problems. But many other problems require a causal model.

2) Prompts lack precision, and programming languages and machine models were invented to solve this problem. English is great, but it is not a programming language.

eijew 7 hours ago | parent | next

I actually think internally they knew they hit diminishing returns a while ago.

They’ve been doing a lot of strategic introduction and manipulation in the run-up to the IPO, and it’s worked in that regard.

mexicocitinluez 6 hours ago | parent

The other day I was doing something that required CC to update like 15-20 files in exactly the same way (hoist a specific function out of the component body). Instead of just updating the files, it spun up multiple agents, one of which wrote a Perl script to hunt down all the files, do some regex, and replace all occurrences. And then, instead of just running tsc to check for errors, it wrote a script to run tsc in each of the subagents and combine the results.

It was actually pretty maddening as what should have taken a minute or two tops took like 10 because it went down this route.

I'm gonna try something much more complex later, but for simple things, it felt like driving a Corvette to the mailbox.
