My problem with this is that people making this statement are unlikely to be objective. Major players are in fundraising mode, and safety folks also have incentives that colour their evaluations.
Yesterday I repeatedly used OpenAI’s API to summarise the same document. The first result looked impressive, but comparing repeated runs revealed that each one missed major points, in a way a human certainly would not. On the surface each summary looked good; careful evaluation indicated a lack of understanding or reasoning.
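For what it’s worth, the check was roughly this, as a minimal sketch assuming the openai Python client; the model name and input file are placeholders, not necessarily what I used:

```python
# Minimal sketch of the repeat-and-compare check described above.
# Assumes the openai Python client; model name and file are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("document.txt") as f:  # stand-in for the document I summarised
    document = f.read()

summaries = []
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; substitute whichever model you test
        messages=[
            {"role": "system", "content": "Summarise the key points of this document."},
            {"role": "user", "content": document},
        ],
    )
    summaries.append(response.choices[0].message.content)

# Reading the runs side by side shows which major points appear in some
# summaries but are silently dropped from others.
for i, summary in enumerate(summaries, 1):
    print(f"--- Run {i} ---\n{summary}\n")
```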
Don’t get me wrong, I think AI is already transformative, but I am not sure we are close to AGI. I hear a lot of talk about imminent AGI, but it doesn’t reflect my experience at a company using and building AI.
To your point, yeah, the models still suck in some surprising ways, but again, they're the worst they're ever going to be. On the reasoning issue in particular, a lot of people are quite excited that RL over CoT (reinforcement learning over chain-of-thought) is looking really promising.
I agree with your broader point, though: I'm not sure how close we are, and there's an awful lot of noise right now.
The “worst they’re ever going to be” line is a bit odd. I hear it a lot, but surely it’s true of all tech? So why are we hearing it more now? Perhaps that is a sign of hype.
Secondly, I think there's a tendency in AI for some people to look at failures of models and attribute them to a fundamental limitation of the approach, rather than to something future models will solve. So I think the line also gets used as shorthand for "don't assume this limitation is inherent to the approach". In other areas of tech there's less of a tendency to write off entire areas because of present-day limitations, hence the line coming up more often.
So you're right that the line is kind of universally applicable in tech; I guess I just think the kinds of bad arguments that warrant it as a rejoinder are more common around AI?
If anyone has experience getting this right, I would like to know how you do it.