I am pretty convinced that for most types of day-to-day work, any perceived improvements from the latest Claude models, for example, were total placebo. In blind tests with normal tasks, people would probably have no idea whether they're using Opus 4.5 or 4.6.
It's because they are getting so good that it's impossible to tell them apart.

Haiku 4.5 is already so good it's ok for 80% (95%?) of dev tasks.

I'd agree with you on 4.5 to 4.6, but going from gpt-5 or 4.0 to 4.5 was night and day.