Story Detail of id 47350122 | Liveview Hacker News

yorwba 11 hours ago | on: Are LLM merge rates not getting better?

Yes, I think this is basically an instance of the "emergent abilities mirage." https://arxiv.org/abs/2304.15004

If you measure completion rate on a task where a single mistake can cause a failure, you won't see noticeable improvements on that metric until all potential sources of error are close to being eliminated, and then if they do get eliminated it causes a sudden large jump in performance.

That's fine if you just want to know whether the current state is good enough on your task of choice, but if you also want to predict future performance, you need to break it down into smaller components and track each of them individually.
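To make the arithmetic concrete, here is a minimal sketch. It assumes a task made of independent steps that all have the same success probability, which is a simplification; the step count of 50 and the probabilities are made-up illustrative values, not measurements.

```python
def completion_rate(step_success: float, n_steps: int) -> float:
    """Probability that every one of n_steps independent steps succeeds."""
    return step_success ** n_steps


# Hypothetical task length, chosen only for illustration.
N_STEPS = 50

# Steady per-step improvement barely moves the end-to-end metric
# until the per-step error rate gets close to zero.
for p in (0.90, 0.95, 0.98, 0.99, 0.999):
    rate = completion_rate(p, N_STEPS)
    print(f"per-step {p:.3f} -> end-to-end {rate:.3f}")
```

Under these assumptions, the per-step number improves smoothly the whole way, while the end-to-end number stays near zero for most of the range and then climbs quickly. Tracking the per-step figure is what makes the trend visible early.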

Bombthecat 3 hours ago | parent

That's how the public perceives it, though.

It's useless and never gets better, until it suddenly and unexpectedly gets good enough.
