> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI?

My sense anecdotally from within the space is yes, people are feeling like we most likely have a "straight shot" to AGI now. Progress has been insane over the last few years, but there's been this lurking worry over signs that the pre-training scaling paradigm is hitting diminishing returns.

What recent releases like o1, o3, and DeepSeek-R1 are showing is that that's fine: we now have a new paradigm around test-time compute. For various reasons people think this is going to be more scalable and won't run into the kind of data issues you hit with the pre-training paradigm.
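
To make "test-time compute" concrete: the cheapest flavour is just spending more inference on the same question, e.g. sampling several reasoning chains and majority-voting the answer (o1-style models spend the compute inside one long chain of thought instead, but the idea of trading inference compute for quality is the same). A rough sketch, with the model name as a stand-in:

    import collections
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def answer_with_voting(question: str, n: int = 8) -> str:
        """Sample n independent reasoning chains and majority-vote on the final line."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in; use whatever model you like
            messages=[{"role": "user",
                       "content": question + "\nThink step by step, then put the final "
                                             "answer alone on the last line."}],
            temperature=0.8,  # diversity across samples is the point
            n=n,
        )
        finals = [(c.message.content or "").strip().splitlines()[-1]
                  for c in resp.choices if (c.message.content or "").strip()]
        return collections.Counter(finals).most_common(1)[0][0]

    print(answer_with_voting("A train travels 120 km in 1.5 hours. What is its average speed in km/h?"))

More samples (or longer chains) means more compute per question, which is the scaling knob people are excited about.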

You can definitely debate whether that's true or not, but this is the first time I've really seen people think we've cracked "it", and the rest is scaling, better training, etc.

> My sense anecdotally from within the space is yes, people are feeling like we most likely have a "straight shot" to AGI now

My problem with this is that people making this statement are unlikely to be objective. Major players are in fundraising mode, and safety folks are also incentivised to be subjective in their evaluation.

Yesterday I repeatedly used OpenAI’s API to summarise a document. The first result looked impressive. However, comparing repeated results revealed that it was missing major points each time, in a way a human certainly would not. On the surface the summary looked good, but careful evaluation indicated a lack of understanding or reasoning.
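
The check itself was nothing clever, roughly a repeat-and-compare loop like the sketch below (the model name and the crude overlap score are stand-ins, not exactly what I ran):

    from itertools import combinations
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    def summarise(document: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # stand-in model
            messages=[{"role": "user",
                       "content": "Summarise the key points of this document:\n\n" + document}],
        )
        return resp.choices[0].message.content or ""

    def word_overlap(a: str, b: str) -> float:
        """Crude Jaccard overlap on word sets, just to flag how much runs diverge."""
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

    document = open("report.txt").read()  # placeholder input
    summaries = [summarise(document) for _ in range(5)]
    for i, j in combinations(range(len(summaries)), 2):
        print(f"run {i} vs run {j}: overlap {word_overlap(summaries[i], summaries[j]):.2f}")

When the overlap between runs is low, each run is probably dropping a different set of major points, which is what I was seeing.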

Don’t get me wrong, I think AI is already transformative, but I am not sure we are close to AGI. I hear a lot about it, but it doesn’t reflect my experience in a company using and building AI.

Yeah, obviously motivations are murky and all over the place; no one's free of bias. I'm not taking a strong stance on whether they're right or how much of it is motivated reasoning, I just think at least quite a bit of it is genuine (I'm mainly basing this on researchers I know who have a track record of being very sober and "boring", rather than the flashy Altman types).

To your point, yeah, the models still suck in some surprising ways, but again it's that thing of "they're the worst they're ever going to be". And on the reasoning issue in particular, a lot of people are quite excited because RL over CoT is looking really, really promising for it.

I agree with your broader point, though: I'm not sure how close we are, and there's an awful lot of noise right now.

Thanks, that’s helpful.

The “worst they’re ever going to be” line is a bit odd. I hear it a lot, but surely it’s true of all tech? So why are we hearing it more now? Perhaps that is a sign of hype?

Yeah that's a fair point! It's def a more general tech thing, but I think there are a couple of specific reasons why it comes up more here. Firstly, most tech does not improve at the insane rate AI has historically, so people's perception of its capabilities goes out of date incredibly rapidly (think about how long people were still banging on about "AI can't draw hands!" well after better models came out that could). If you think of the line as a way to say "don't anchor on what it can do today!", then it feels more appropriate to repeat it for a rapidly-changing field.

Secondly, I think there's a tendency in AI for some people to look at failures of models and attribute them to some fundamental limitation of the approach, rather than to something that future models will solve. So the line also gets used as shorthand for "don't assume this limitation is inherent to the approach". In other areas of tech there's less of a tendency to write off entire areas because of present-day limitations, hence the line coming up more often.

So you're right that the line is kind of universally applicable in tech; I guess I just think the kinds of bad arguments that warrant it as a rejoinder are more common around AI?

Summarizing is quite difficult. You need to keep the salient points and facts.

If anyone has experience on getting this right, I would like to know how you do it.

I agree with your take, and actually go a bit further. I think the idea of "diminishing returns" is a bit of a red herring; it's really a combination of saturated benchmarks (and testing in general) and the expectation of "one LLM to rule them all", which might not be how things go.

We've seen with OpenAI and Anthropic, and it's rumoured with Google, that holding back your "best" model and using it to generate datasets for smaller but almost-as-capable models is one way forward. I'd say this shows the "big models" are more capable than they appear, and that they also open up new avenues.

We know that Meta used L2 to filter and improve its training sets for L3. We're also seeing how "long form" content + filtering + RL leads to amazing things (what people call "reasoning" models). The semantics ("reasoning") might be a bit ambitious, but this really opens up a path: documentation + virtual environments + many rollouts + filtering by SotA models => new dataset for next-gen models.
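
A toy version of that pipeline, just to make it concrete (the generator/judge model names and the simple YES/NO pass criterion are made up for the sketch): sample many rollouts per task, keep only the ones a stronger model accepts, and the filtered pairs become the dataset for the next-gen model.

    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    GENERATOR = "gpt-4o-mini"  # stands in for the "current" model producing rollouts
    JUDGE = "gpt-4o"           # stands in for the stronger SotA model doing filtering

    def rollouts(task: str, k: int = 8) -> list[str]:
        """Sample k attempts at the task from the generator."""
        resp = client.chat.completions.create(
            model=GENERATOR,
            messages=[{"role": "user", "content": task}],
            temperature=1.0,
            n=k,
        )
        return [c.message.content or "" for c in resp.choices]

    def judged_ok(task: str, attempt: str) -> bool:
        """Ask the stronger model whether the attempt is acceptable."""
        resp = client.chat.completions.create(
            model=JUDGE,
            messages=[{"role": "user",
                       "content": f"Task:\n{task}\n\nAttempt:\n{attempt}\n\n"
                                  "Is the attempt correct and complete? Answer YES or NO."}],
            temperature=0,
        )
        return (resp.choices[0].message.content or "").strip().upper().startswith("YES")

    # tasks.jsonl: one {"task": "..."} per line; accepted rollouts become the new dataset
    with open("tasks.jsonl") as f, open("next_gen_dataset.jsonl", "w") as out:
        for line in f:
            task = json.loads(line)["task"]
            for attempt in rollouts(task):
                if judged_ok(task, attempt):
                    out.write(json.dumps({"prompt": task, "completion": attempt}) + "\n")

In practice the filtering is far more involved (verifiers, unit tests, dedup), but that's the shape of it.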

That, plus optimisations (early exit from Meta, Titans from Google, distillation from everyone, etc.), really makes me question the "we've hit a wall" rhetoric. I think there are enough tools on the table today to either jump the wall or move around it.

Yeah, that's called wishful thinking when it's not straight-up pipe dreams. All these people have horses in the race.