Story Detail of id 47349897 | Liveview Hacker News

sunaurus 10 hours ago | on: Are LLM merge rates not getting better?

I am pretty convinced that for most types of day-to-day work, any perceived improvements from the latest Claude models, for example, were total placebo. In blind tests with normal tasks, people would probably have no idea whether they're using Opus 4.5 or 4.6.

BoumTAC 8 hours ago

It's because they are getting so good that it's impossible to tell them apart.

Haiku 4.5 is already so good it's ok for 80% (95%?) of dev tasks.

AussieWog93 10 hours ago

I'd agree with you on 4.5 to 4.6, but going from gpt-5 or 4.0 to 4.5 was night and day.

