I don't know how to make sense of this level of investment. I feel that I lack the proper conceptual framework to make sense of the purchasing power of half a trillion USD in this context.
Maybe your calibration isn't poor. Maybe they really are all wrong. But there's a tendency here to assume these people behind the scenes are all charlatans, fueling hype without equal substance, hoping to make a quick buck before it all comes crashing down, and I don't think that's true at all. I think these people genuinely believe they're going to get there. And if you genuinely believe that, then this kind of investment isn't so crazy.
The argument presented in the quote there is: “everyone at AI foundation companies is putting money into AI, therefore we must be near AGI.”
The best evaluation of progress is to use the tools we have. It doesn’t look like we are close to AGI. It looks like amazing NLP with an enormous amount of human labelling.
I don't immediately disagree with you but you just accidentally also described all crypto/NFT enthusiasts of a few years ago.
Are there computing and cryptography problems that the infrastructure could be (publicly or quietly) reallocated to address if the United States found itself in a conflict? Any cryptographers here have a thought on whether hundreds of thousands of GPUs turned on a single cryptographic key would yield any value?
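For scale, here's a back-of-the-envelope sketch (all figures are illustrative assumptions, not benchmarks) of why raw GPU count barely dents a modern keyspace:

```python
# Back-of-the-envelope: brute-forcing a single 128-bit key.
# The fleet size and per-GPU rate below are illustrative assumptions.
gpus = 500_000                  # hypothetical fleet size
trials_per_gpu_per_s = 1e10     # very optimistic key trials/sec per GPU
keyspace = 2 ** 128             # e.g. AES-128

seconds = keyspace / (gpus * trials_per_gpu_per_s)
years = seconds / (3600 * 24 * 365)
print(f"{years:.3e} years")     # on the order of 10^15 years
```

Even with wildly generous assumptions, exhaustive search of a 128-bit keyspace takes on the order of 10^15 years, which is why the usual cryptographers' answer for brute force against a well-chosen modern key is simply "no".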
All? Quite a few of the best minds in the field, like Yann LeCun for example, have been adamant that 1) autoregressive LLMs are NOT the path to AGI and 2) that AGI is very likely NOT just a couple of years away.
She interfaces with the AI agents of companies, organizations, friends, family, etc. to get things done for you (or to learn things for you: what's my friend's birthday? his agent tells yours) automagically, and she is like a friend. Always there for you, at your beck and call, like in the movie Her.
Zuckerberg's glasses, which cannot take selfies, will only be complementary to our AI phones.
That's just my guess and desire as a fervent GPT user, as well as a Meta Ray-Ban wearer (can't take selfies with glasses).
1) Reasoning capabilities in the latest models are rapidly approaching superhuman levels and continue to scale with compute.
2) Intelligence at a given level is easier to achieve algorithmically when the hardware improves. There are also more paths to intelligence, and often simpler mechanisms suffice.
3) Most current-generation reasoning models leverage test-time compute and RL in training, both of which can readily make use of more compute. For example, RL on coding against compilers, or on proofs against verifiers.
All of this points to compute now being basically the only bottleneck to massively superhuman AIs in domains like math and coding. On the rest I'll make no comment (I don't know what "superhuman" means in a domain with no objective evals).
A calculator is superhuman if you're prepared to put up with its foibles.
One: capable of replacing some large proportion of global GDP (this definition faces a lot of obstructions: organizational, bureaucratic, robotic)...
Two: it is difficult to find problems which an average human can solve but the model cannot. The problem with this definition is that the nature of AI intelligence is distinct enough, and the breadth of tasks large enough, that this metric is probably only achievable after AI is already massively superhuman in aggregate. Compare this with Go AIs, which were massively superhuman yet often still failed to count ladders correctly (which was also fixed by more scaling).
All in all, I avoid the term AGI, because for me AGI means comparing average intelligence on broad tasks relative to humans, and I'm already not sure whether current models achieve that, whereas superhuman research math is clearly not achieved, because humans are still making all of the progress on new results.
This is true for brute-force algorithms as well and has been known for decades. With infinite compute, you can achieve wonders. But the problem lies in diminishing returns[1][2], and it seems things do not scale linearly, at least for transformers.
1. https://www.bloomberg.com/news/articles/2024-12-19/anthropic...
2. https://www.bloomberg.com/news/articles/2024-11-13/openai-go...
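To make the diminishing-returns point concrete, here's a toy power-law scaling curve (the exponent is an arbitrary illustrative value, not a measured one): each extra order of magnitude of compute buys a smaller absolute improvement than the last.

```python
# Toy illustration of diminishing returns under a power-law scaling curve:
# loss ~ C^(-alpha). alpha = 0.05 is an arbitrary illustrative exponent.
alpha = 0.05

def loss(compute: float) -> float:
    return compute ** -alpha

# Each 10x increase in compute shaves off less loss than the previous 10x.
for c in (1e21, 1e22, 1e23, 1e24):
    print(f"compute={c:.0e}  loss={loss(c):.4f}")
```

The absolute gain per decade of compute shrinks every time, which is the shape of curve people mean when they say pre-training scaling has diminishing returns.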
What would you say is the strongest evidence for this statement?
I still have a pretty hard time getting it to tell me how many sisters Alice has. I think this might be a bit optimistic.
My sense, anecdotally from within the space, is yes: people feel we most likely have a "straight shot" to AGI now. Progress has been insane over the last few years, but there's been a lurking worry about signs that the pre-training scaling paradigm has diminishing returns.
What recent models like o1, o3, and DeepSeek-R1 are showing is that that's fine: we now have a new paradigm around test-time compute. For various reasons, people think this is going to be more scalable and won't run into the kind of data issues you'd hit with the pre-training paradigm.
You can definitely debate whether that's true, but this is the first time I've really seen people think we've cracked "it", and the rest is scaling, better training, etc.
My problem with this is that people making this statement are unlikely to be objective. Major players are in fundraising mode, and safety folks are also incentivised to be subjective in their evaluation.
Yesterday I repeatedly used OpenAI’s API to summarise a document. The first result looked impressive. However, comparing repeated results revealed that each one was missing major points, in a way a human certainly would not. On the surface each summary looked good, but careful evaluation indicated a lack of understanding or reasoning.
Don’t get me wrong, I think AI is already transformative, but I am not sure we are close to AGI. I hear a lot about it, but it doesn’t reflect my experience in a company using and building AI.
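One cheap way to make that repeated-summary comparison systematic (a sketch; the strings below are placeholders for real API outputs, and the one-point-per-line format is an assumption) is to extract key points from each run and measure their overlap:

```python
# Sketch: measure consistency across repeated summaries of the same document.
# The two strings below stand in for repeated model outputs.
def key_points(summary: str) -> set[str]:
    # Naive extraction: one key point per line, normalized.
    return {line.strip().lower() for line in summary.splitlines() if line.strip()}

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 1.0

run1 = "revenue grew 10%\nchurn increased\nnew product launched"
run2 = "revenue grew 10%\nnew product launched\nheadcount flat"

overlap = jaccard(key_points(run1), key_points(run2))
print(f"{overlap:.2f}")  # 0.50
```

A consistently low overlap across runs is exactly the "missing major points each time" symptom: each summary looks fine alone, but they don't agree on what mattered.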
We've seen with OpenAI and Anthropic, and it's rumoured with Google, that holding back your "best" model and using it to generate datasets for smaller but almost-as-capable models is one way forward. I would say this shows the "big models" are more capable than they seem, and that they also open up new avenues.
We know that Meta used Llama 2 to filter and improve its training sets for Llama 3. We are also seeing how "long form" content + filtering + RL leads to amazing things (what people call "reasoning" models). "Semantics" might be a bit ambitious, but this really opens up a path: documentation + virtual environments + many rollouts + filtering by SotA models => new datasets for next-gen models.
That, plus optimisations (early exit from Meta, Titans from Google, distillation from everyone, etc.), really makes me question the "we've hit a wall" rhetoric. I think there are enough tools on the table today to either jump the wall or move around it.
This sort of $100-500B budget doesn't sound like training-cluster money; it sounds more like anticipating massive industry uptake, with multiple datacenters running inference (and all of corporate America's data sitting in the cloud).
I've read that some datacenters run mixed generation GPUs - just updating some at a time, but not sure if they all do that.
It'd be interesting to read something about how updates are typically managed/scheduled.
I think what's been going on is that compute/$ has been rising exponentially for decades in a steady way, and has recently passed the point where you can get human-brain-level compute for modest money. The tendency is that once the compute is there, lots of bright PhDs get hired to figure out algorithms to use it, so that bit gets sorted in a few years (as written about by Kurzweil, Wait But Why, and similar).
So it's not so much brute-forcing AGI as that exponential growth makes it inevitable at some point, and that point is probably quite soon. At least that seems to be what they are betting.
The annual global spend on human labour is ~$100tn, so whether you replace that with AGI or just add $100tn of AGI output and double GDP, it's quite a lot of money.
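As a rough sanity check on that bet (the captured-share figure is a hypothetical, not a forecast), even a tiny slice of a ~$100tn annual labour market repays a $500B outlay quickly:

```python
# Illustrative arithmetic: payback on a $500B bet against global labour spend.
investment = 500e9       # Stargate-scale outlay, USD
labour_market = 100e12   # rough annual global spend on human labour, USD
captured_share = 0.01    # hypothetical: capture 1% of that market per year

annual_revenue = labour_market * captured_share  # $1tn/year
payback_years = investment / annual_revenue
print(payback_years)     # about half a year
```

Under those (very generous) assumptions the outlay pays back in months, which is the kind of arithmetic that makes the bet look rational to the people making it.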
The thing about investments, specifically in the world of tech startups and VC money, is that speculation is not something you merely capitalize on as an investor, it's also something you capitalize on as a business. Investors desperately want to speculate (gamble) on AI to scratch that itch, to the tune of $500 billion, apparently.
So this says less about, 'Are we close to AGI?' or, 'Is it worth it?' and more about, 'Are people really willing to gamble this much?'. Collectively, yes, they are.
Can't answer that question, but even if the only thing that changed over the next four years was that generation got cheaper and cheaper, we still haven't begun to understand the transformative power of what we have available today. I think we've felt maybe 5-10% of the effects that integrating today's technology can bring, especially if generation costs come down to maybe 1% of what they currently are, and latency of the big models becomes close to instantaneous.
Remember Trump's BIG WIN of Foxconn investing $10B to build a factory in Wisconsin, creating 13,000 jobs?
That was in 2017. Seven years later, it's employing about 1,000 people, if that. It's not really clear what, if anything, is being made at the partially-built factory. [0]
And everyone's forgotten about it by now.
I expect this to be something along those lines.
[0] https://www.jsonline.com/story/money/business/2023/03/23/wha...
Or they think the odds are high enough that the gamble makes sense. Even if they think it's a 20% chance, when their competitors are investing at this scale, their only real options are to keep up or drop out.
“My NEW Official Trump Meme is HERE! It's time to celebrate everything we stand for: WINNING! Join my very special Trump Community. GET YOUR $TRUMP NOW.”
Your calibration is probably fine. Stargate is not a means to achieve AGI; it’s a means to start construction on a few million square feet of datacenters, thereby “reindustrializing America”.
twitter hype is out of control again.
we are not gonna deploy AGI next month, nor have we built it.
we have some very cool stuff for you but pls chill and cut your expectations 100x!
I realize he wrote a fairly goofy blog post a few weeks ago, but this tweet is unambiguous: they have not achieved AGI. It rather means that they see their only chance for substantial progress in Moar Power!