Story Detail of id 47349650 | Liveview Hacker News

mike_hearn 11 hours ago | on: Are LLM merge rates not getting better?

That's an interesting claim, but I don't see it in my own work. They have got better but it's very hard to quantify. I just find myself editing their work much less these days (currently using GPT 5.4).

dwedge 11 hours ago

Without meaning to sound dismissive, because I'm really not intending to, there's also the possibility that you've gotten worse after enough time using them. You're treating yourself as a constant in this, but man cannot walk in the same river twice.

nkozyra 11 hours ago

The problem with evals is the underlying rubric will always be either subjective, or a quantitative score based on something that is likely now baked into the training set directly.

You kind of have to go on "feels" for a lot of this.
