Hacker News new | past | comments | ask | show | jobs | submit
Let me avoid the use of the word AGI here because the term is a little too loaded for me these days.

1) reasoning capabilities in the latest models are rapidly approaching superhuman levels, and they continue to scale with compute.

2) a given level of intelligence becomes easier to achieve algorithmically as hardware improves: more paths lead to it, and often simpler mechanisms suffice.

3) most current-generation reasoning models lean on test-time compute and on RL during training, both of which can readily absorb more compute. For example, RL on coding against compilers, or on proofs against verifiers.
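The "RL against compilers/verifiers" idea can be sketched concretely. This is my own toy illustration, not any lab's actual training setup; `verifiable_reward`, `f`, and the test cases are all hypothetical names I made up. The point is just that the compiler and the test harness together act as an objective verifier, so reward needs no human labeler:

```python
# Toy sketch (hypothetical, not a real training pipeline): a candidate
# program earns reward 1.0 only if it compiles AND passes every test case.
def verifiable_reward(source: str, tests: list[tuple[int, int]]) -> float:
    """Return 1.0 iff `source` defines f(x) passing all (input, output) tests."""
    try:
        code = compile(source, "<candidate>", "exec")  # the compiler as verifier
    except SyntaxError:
        return 0.0
    ns: dict = {}
    try:
        exec(code, ns)
        f = ns["f"]
        return 1.0 if all(f(x) == y for x, y in tests) else 0.0
    except Exception:
        return 0.0  # crashes, missing f, wrong answers: all score zero

good   = "def f(x):\n    return x * 2\n"
wrong  = "def f(x):\n    return x + 1\n"
broken = "def f(x) return x\n"          # syntax error
tests  = [(1, 2), (3, 6)]

print(verifiable_reward(good, tests))    # 1.0
print(verifiable_reward(wrong, tests))   # 0.0 (fails the second test)
print(verifiable_reward(broken, tests))  # 0.0 (does not compile)
```

Because this reward is cheap and automatic, you can spend arbitrarily more compute sampling and scoring candidates, which is exactly why these domains scale with compute.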

All of this points to compute now being essentially the only bottleneck to massively superhuman AIs in domains like math and coding. On other domains I'll withhold comment, since I don't know what "superhuman" even means where there are no objective evals.

You can't veto "AGI" as too loaded on a whim and then deploy "superhuman" without justification.

A calculator is superhuman, if you're prepared to put up with its foibles.
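Those foibles are easy to demonstrate: a calculator (or any machine doing binary floating point) is superhuman at arithmetic speed yet fails in ways no attentive human would:

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly, so a
# "superhuman" arithmetic machine gets a sum "wrong" by human standards.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```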

> All of this points to compute now being basically the only bottleneck to massively superhuman AIs

This is true of brute-force algorithms as well, and has been known for decades: with infinite compute you can achieve wonders. The problem is diminishing returns[1][2], and scaling does not appear to be linear, at least for transformers.

1. https://www.bloomberg.com/news/articles/2024-12-19/anthropic...

2. https://www.bloomberg.com/news/articles/2024-11-13/openai-go...
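To make the "infinite compute achieves wonders, finite compute hits a wall" point concrete, here is a toy brute-force search of my own (the numbers are illustrative, not from the linked articles). It always succeeds eventually, but each extra character of the target multiplies the work by the alphabet size:

```python
# Toy brute force: enumerate all strings over `alphabet` until `target`
# is found. Guaranteed to work with enough compute; the cost curve is
# the problem, since the search space grows exponentially with length.
from itertools import product

def brute_force_key(target: str, alphabet: str = "ab") -> tuple[str, int]:
    """Return (target, number of guesses tried before finding it)."""
    tried = 0
    for length in range(1, len(target) + 1):
        for combo in product(alphabet, repeat=length):
            tried += 1
            if "".join(combo) == target:
                return target, tried
    raise ValueError("target not found")

_, n3 = brute_force_key("aba")   # 9 guesses
_, n4 = brute_force_key("abab")  # 20 guesses: one more char, ~2x the work
print(n3, n4)
```

With a 2-letter alphabet every added character roughly doubles the search, so "just add compute" buys you one more character per doubling: wonders in the limit, diminishing returns in practice.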

> 1) reasoning capabilities in latest models are rapidly approaching superhuman levels and continue to scale with compute.

What would you say is the strongest evidence for this statement?

>reasoning capabilities in latest models are rapidly approaching superhuman levels and continue to scale with compute

I still have a pretty hard time getting it to tell me how many sisters Alice has. I think this claim might be a bit optimistic.

What is the evidence for 1)? I thought the latest models were only getting "somewhere" on fairly trivial reasoning tests like ARC-1.