Lots of comments either defending this ("it's taking a chance on being the first to build AGI with a proven team") or saying "it's a crazy valuation for a 3 month old startup". But both of these "sides" feel like they miss the mark to me.

On one hand, I think it's great that investors are willing to throw big chunks of money at hard (or at least expensive) problems. I'm pretty sure all the investors putting money in will do just fine even if their investment goes to zero, so this feels like exactly what VC funding should be doing, rather than some other common "how can we get people more digitally addicted to sell ads?" play.

On the other hand, I'm kind of baffled that we're still talking about "AGI" in the context of LLMs. While I find LLMs to be amazing, and an incredibly useful tool (if used with a good understanding of their flaws), the more I use them, the more it becomes clear to me that they're not going to get us anywhere close to "general intelligence". That is, the more I have to work around hallucinations, the more that it becomes clear that LLMs really are just "fancy autocomplete", even if it's really really fancy autocomplete. I see lots of errors that make sense if you understand that an LLM is just a statistical model of word/token frequency, but that you would never expect to see in a system with a true understanding of the underlying concepts. And while I'm not in the field, so I may have no right to comment, there are leaders in the field, like LeCun, who have expressed basically the same idea.

So my question is, have Sutskever et al. given any indication of how they intend to "cross the chasm" from where we are now with LLMs to a model with true understanding, or has it been mainly "look what we did before, you should take a chance on us to make discontinuous breakthroughs in the future"?

Ilya has discussed this question: https://www.youtube.com/watch?v=YEUclZdj_Sc

Thank you very much for posting! This is exactly what I was looking for.

On one hand, I understand what he's saying, and that's why I have been frustrated in the past when I've heard people say "it's just fancy autocomplete" without emphasizing the awesome capabilities that kind of autocomplete can give you. While I haven't seen this video of Sutskever before, I have seen a very similar argument from Hinton: in order to get really good at next-token prediction, the model needs to "discover" the underlying rules that make that prediction possible.

All that said, I find his argument wholly unconvincing (and again, I may be waaaaay stupider than Sutskever, but there are other people much smarter than I who agree). The reason is that every now and then I'll see a particular type of hallucination where it's pretty obvious that the LLM is confusing similar token strings even when their underlying meaning is very different. That is, the underlying "pattern matching" of LLMs becomes apparent in these situations.

As I said originally, I'm really glad VCs are pouring money into this, but I'd easily make a bet that in 5 years LLMs will be nowhere near human-level intelligence on some tasks, especially where novel discovery is required.

Watching that video actually makes me completely unconvinced that SSI will succeed if they are hinging it on LLMs...

He puts a lot of emphasis on the claim that 'to generate the next token you must understand how', when that's precisely the parlor trick that is making people lose their minds (myself included) over how effective current LLMs are. The fact that it can simulate some low-fidelity reality with _no higher-level understanding of the world_, using purely linguistic/statistical analysis, is mind-blowing. To say "all you have to do is then extrapolate" is the ultimate "draw the rest of the owl" argument.

> but I'd easily make a bet that in 5 years LLMs will be nowhere near human-level intelligence on some tasks

I wouldn't. There are some extraordinarily stupid humans out there. Worse, making humans dumber is a proven and well-known technology.

I actually echo your exact sentiments. I don't have the street cred, but watching him talk for the first few minutes, I immediately felt like there is just no way we are going to get AGI with what we know today.

Without some raw reasoning capacity (maybe neuro-symbolic is the answer, maybe not), LLMs won't be enough. Reasoning is super tough because it's not as easy as predicting the next most likely token.

> All that said, I find his argument wholly unconvincing (and again, I may be waaaaay stupider than Sutskever, but there are other people much smarter than I who agree). The reason is that every now and then I'll see a particular type of hallucination where it's pretty obvious that the LLM is confusing similar token strings even when their underlying meaning is very different. That is, the underlying "pattern matching" of LLMs becomes apparent in these situations.

So? One of the most frustrating parts of these discussions is that, for some bizarre reason, a lot of people have a standard of reasoning (for machines) that only exists in fiction or their own imaginations.

Humans have a long list of cognitive shortcomings. We find them interesting and give them all sorts of names, like cognitive dissonance or optical illusions. But we don't draw silly conclusions like "humans don't reason".

A general reasoning engine that never makes a mistake, never contradicts itself, and never gets confused, in either its output or its process, does not exist in real life, whether you believe humans are the only intelligent species on the planet or are gracious enough to extend the capability to some of our animal friends.

So the LLM confuses tokens every now and then. So what?

You are completely mischaracterizing my comment.

> Humans have a long list of cognitive shortcomings. We find them interesting and give them all sorts of names, like cognitive dissonance or optical illusions. But we don't draw silly conclusions like "humans don't reason".

Exactly! In fact, things like illusions are actually excellent windows into how the mind really works. Most visual illusions are a fundamental artifact of how the brain needs to turn a 2D image into a 3D, real-world model, and illusions give clues into how it does that, and how the contours of the natural world guided the evolution of the visual system (I think Steven Pinker's "How the Mind Works" gives excellent examples of this).

So I am not at all saying that what LLMs do isn't extremely interesting, or useful. What I am saying is that the types of errors you get give a window into how an LLM works, and these hint at some fundamental limitations in what an LLM is capable of, particularly around novel discovery and the development of new ideas and theories that aren't just "rearrangements" of existing ideas.

They might never work for novel discovery, but that could probably be handled by an outside loop or online (in-context) learning. The thing is that the 100k or 1M token context windows are a marketing scam for now.

To clarify this, I think it's reasonable that token prediction as a training objective could lead to AGI, given that the underlying model has the correct architecture. The question really is whether the underlying architecture is good enough to capitalize on the training objective so as to result in superhuman intelligence.

For example, you'll have little luck achieving AGI with decision trees, no matter what their training objective is.
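
(To make the objective-vs-architecture point concrete, here is a minimal sketch, assuming PyTorch; the toy model and names are my own illustration, not anything from the thread. The next-token objective is just cross-entropy over the vocabulary, and any architecture, however weak, can be plugged into it.)

    # Minimal sketch (illustration only): the next-token objective is
    # cross-entropy over the vocabulary, and it is agnostic to whatever
    # architecture produces the logits.
    import torch
    import torch.nn as nn

    VOCAB_SIZE = 1000

    def next_token_loss(model: nn.Module, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids. The model maps the prefix to
        # (batch, seq_len - 1, VOCAB_SIZE) logits; the target is the next token.
        logits = model(tokens[:, :-1])
        targets = tokens[:, 1:]
        return nn.functional.cross_entropy(
            logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1)
        )

    # Any architecture can be dropped in: a transformer, an RNN, or something
    # far weaker. The objective doesn't change; what the model can "discover"
    # while minimizing it does.
    toy_model = nn.Sequential(nn.Embedding(VOCAB_SIZE, 64), nn.Linear(64, VOCAB_SIZE))
    loss = next_token_loss(toy_model, torch.randint(0, VOCAB_SIZE, (2, 16)))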

He doesn't address the real question of how an LLM predicting the next token could exceed what humans have done. LLMs mostly interpolate, so if the answer isn't to be found in an interpolation, the LLM can't generate something new.

The argument about AGI from LLMs is not based on the current state of LLMs, but on the rate of progress over the last 5+ years or so. It wasn't very long ago that almost nobody outside of a few niche circles seriously thought LLMs could do what they do right now.

That said, my personal hypothesis is that AGI will emerge from video generation models rather than text generation models. A model that takes an arbitrary real-time video input feed and must predict the next, say, 60 seconds of video would have to have a deep understanding of the universe, humanity, language, culture, physics, humor, laughter, problem solving, etc. This pushes the fidelity of both input and output far beyond anything that can be expressed in text, but also creates extraordinarily high computational barriers.
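
(For concreteness, here is a rough sketch of the kind of objective that hypothesis implies. This is my own illustration in PyTorch, not anything SSI or any video lab has published; the NextFramePredictor class, shapes, and loss are all made up: condition on past frames, predict future frames, score with a simple reconstruction loss.)

    # Hypothetical next-frame-prediction objective (illustration only).
    import torch
    import torch.nn as nn

    class NextFramePredictor(nn.Module):
        # Stand-in for a video model: past clip in, predicted future clip out.
        def __init__(self, channels: int = 3):
            super().__init__()
            # A real system would be vastly larger; one 3D conv keeps this runnable.
            self.net = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

        def forward(self, past_frames: torch.Tensor) -> torch.Tensor:
            # past_frames: (batch, channels, time, height, width)
            return self.net(past_frames)

    def video_prediction_loss(model, past, future):
        # Whatever "understanding" the parent argues for would have to live in
        # whatever lets a model drive this loss down on arbitrary real footage.
        return nn.functional.mse_loss(model(past), future)

    model = NextFramePredictor()
    past = torch.randn(1, 3, 8, 32, 32)    # 8 past frames of a 32x32 RGB clip
    future = torch.randn(1, 3, 8, 32, 32)  # 8 future frames to predict
    loss = video_prediction_loss(model, past, future)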

> the more that it becomes clear that LLMs really are just "fancy autocomplete", even if it's really really fancy autocomplete

I also don't really see AGI emerging from LLMs any time soon, but it could be argued that human intelligence is also just 'fancy autocomplete'.

Two interesting points to consider

1. If it's really amazing autocomplete, is there a distinction between that and AGI?

Being able to generalize, plan, execute, evaluate, and learn from the results could all be seen as a search graph building on inference from known or imagined data points. So far LLMs are being used for all of those, and we haven't even tested the next level of compute power being built to enable their evolution.

2. Fancy autocomplete is a bit broad for the comprehensive use cases CUDA is already supporting that go way beyond textual prediction.

If all information of every type can be “autocompleted” that’s a pretty incredible leap for robotics.

* edited to compensate for iPhone autocomplete, the irony.

> On the other hand, I'm kind of baffled that we're still talking about "AGI" in the context of LLMs.

I'm not. Lots of people and companies have been sinking money into these ventures and they need to keep the hype alive by framing this as being some sort of race to AGI. I am aware that the older I get the more cynical I become, but I bucket all discussions about AGI (including the very popular 'open letters' about AI safety and Skynet) in the context of LLMs into the 'snake oil' bucket.

> "We've identified a new mountain to climb that's a bit different from what I was working on previously. We're not trying to go down the same path faster. If you do something different, then it becomes possible for you to do something special."

That doesn't really imply "let's just do more LLMs".

{"deleted":true,"id":41447841,"parent":41447416,"time":1725468478,"type":"comment"}
I think the plan is to raise a lot of cash, and then more, and then maybe something comes up that brings us closer to AGI (i.e. something better than LLMs). The investors know that AGI is not really the goal, but they can't miss out on the next trillion-dollar company.