Story Detail of id 47349709 | Liveview Hacker News

nkozyra 12 hours ago | on: Are LLM merge rates not getting better?

The problem with evals is that the underlying rubric will always be either subjective or a quantitative score based on something that is likely now baked directly into the training set.

You kind of have to go on "feels" for a lot of this.
