Story Detail of id 48397629 | Liveview Hacker News

sixeyes 9 hours ago | on: They’re made out of weights

oh yeah! i recall a paper linked here not so long ago, where it was shown that the dendrites of a neuron do computations themselves. The "weight per neuron" is very simplistic then. At the very least, each actual neuron is a network of weights.

https://www.quantamagazine.org/neural-dendrites-reveal-their...

ACCount37 8 hours ago | parent

I'm partial to "modern ML weights are much closer to 1:1 capacity mapping to synapse count than to neuron count". A single biological neuron is closer to 100 or even 1000 weights worth of ANN than to 1 weight worth of ANN.

In which case: modern LLMs are still running in a capacity-starved regime!

Even Mythos 5, the 10-trillion monster LLM, the scaling law boogeyman, the harbinger of Vera Rubin NVL72, doesn't quite rise to 100T-to-1000T of synapses. Anything the light of today's AI touches still lives in the shadow of what evolution has managed to cram into a single human skull.

We're arguing about the limitations of AI while our best AIs are still very subhuman in the scale dimension. The one dimension we know how to push. And it's already this tight.
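For what it's worth, the scale gap described above is easy to sanity-check. The sketch below only restates the round figures quoted in this comment (a 10-trillion-parameter model against 100T to 1000T synapses) and assumes, as the comment does, a rough one-weight-per-synapse mapping. The names `MODEL_PARAMS`, `SYNAPSES_LOW`, `SYNAPSES_HIGH` and `shortfall` are made up for illustration.

```python
# Back-of-envelope comparison of the round figures quoted in the comment.
# These are the thread's numbers, not measurements, and the 1:1
# weight-to-synapse mapping is the commenter's assumption.

MODEL_PARAMS = 10e12     # "10-trillion" parameter count quoted above
SYNAPSES_LOW = 100e12    # low end of the quoted synapse range
SYNAPSES_HIGH = 1000e12  # high end of the quoted synapse range


def shortfall(params: float, synapses: float) -> float:
    """Return how many times larger the synapse count is than the parameter count."""
    return synapses / params


low = shortfall(MODEL_PARAMS, SYNAPSES_LOW)
high = shortfall(MODEL_PARAMS, SYNAPSES_HIGH)
print(f"gap: {low:.0f}x to {high:.0f}x")  # prints "gap: 10x to 100x"
```

Under those assumptions the model sits one to two orders of magnitude short of the synapse range, which is the "capacity-starved" point being made.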

galangalalgol 7 hours ago | root | parent | next

10T is about a crow's worth. The Mythos count doesn't include any diffusion model. But the crow's count includes all its visual processing. And tactile. Touch uses up enough that they use skin surface area to normalize across animals when doing comparisons. It is one of the reasons suggested to explain how crows exhibit tool use and language with only 10T. We have a lot more skin than crows, and indeed far more than Mythos.

SlinkyOnStairs 7 hours ago | root | parent

> A single biological neuron is closer to 100 or even 1000 weights worth of ANN than to 1 weight worth of ANN.

Even those comparisons call for caution. The complexity of biology is enormous, and more importantly yet, it's simply not comparable. And comparing them anyway invites a bunch of bad assumptions.

An ANN could quite probably model a single in vitro neuron with reasonable accuracy. Whether that requires a hundred or a hundred million nodes isn't terribly relevant.

But the way neurons combine in vivo is completely unlike the way machine learning systems are built. That holds both "locally", in how neurons interface, which is vastly more complex than a weighted sum of inputs, and at the macro scale, in the interactions of hormones and other chemicals.

It's not even a given that large numbers of neurons will create the emergent behaviour of human intelligence; elephants have significantly more neurons, but they're not the triple galaxy brains writing all our science papers. Other animal intelligence similarly is only loosely correlated with brain complexity. (Heck, not to be forgotten is the other end of the scale. Plenty of microscopic life manages shockingly complex behaviour without any dedicated neurons.)

This also applies to ANNs. There's no reason to expect that stuffing enough matrix multiplications into a program will make it intelligent or turn out conscious.

Really, the history of machine learning suggests the opposite: that the big gains are primarily had in architectural changes.

In this regard, I find the talk of the "limits of AI" quite credible. LLMs have already hit the diminishing returns on their growth, and even reasoning/agentic models display failure modes that confirm they're not "thinking" in the ways that humans do.

This is not to say that we've hit the final limits of what AI in the broad sense can do; it's just that the next advancement won't be "LLM but even bigger".

ACCount37 6 hours ago | root | parent

Not really. The history of "big gains" of machine learning is: put together a simple architecture that makes few assumptions but scales well. Then up the data and compute by 2 OOMs. By itself, the new architecture underperforms. Paired with the bitter lesson, however?

Don't make assumptions. Make a setup where the gradient descent can make them for you.

Empirically? LLMs are nowhere near "the wall". We've been hearing "the wall is nigh" since 2020. Six years in, we're still scaling LLMs, and the graveyards are full of "LLM killers". The system that kills the LLM is always a bigger, badder LLM, and never a new revolutionary architecture. The scaling doesn't just keep working - it works so well that it's seen as the only viable path forward at the frontier of reasoning and agentic work. Or even outside it. ChatGPT Images 2.0 is an image model with an agentic LLM at its core - generational gains in compositional capability.

For just about every "failure mode that confirms they're not thinking", you see one of two things. The first is that a new LLM releases a few months after and the "fundamental" issue abruptly goes away. The second is that we take a good, long look at a human, and find that the human also fails like this - and thus, "not thinking". Often both! Always funny when it's both.

One thing that's very biologically distinct is: local connectivity. In a GPU, global connectivity is cheap. In a brain, it's prohibitively expensive. The brain has no true backpropagation because it has no true global connectivity, and has to make do with local rules. A GPU is a strictly more expressive substrate connectivity-wise. So any point in the design of a computational substrate where you could remove complexity or increase performance by adding more connectivity? Silicon advantage. The brain isn't a "strictly better computational substrate" - it makes different tradeoffs. Which tradeoffs are better for attaining intelligence is an open question.

And, sure. Having a substrate with a capacity for intelligence doesn't mean having intelligence. No elephant has ever learned to code. The problem is: LLMs already did! LLMs already compete with humans on just about every task that was once thought to "require human intelligence". They don't always win - but they perform significantly above chance, and often above an average non-expert human.

So, my bet is on "LLM but even bigger". If there's a point where LLMs begin to lag behind and novel architectures get a sharp advantage, we are yet to hit it.

f_klem 4 hours ago | root | parent

> For just about every "failure mode that confirms they're not thinking", you see one of two things. The first is that a new LLM releases a few months after and the "fundamental" issue abruptly goes away. The second is that we take a good, long look at a human, and find that the human also fails like this - and thus, "not thinking". Often both! Always funny when it's both.

The way machines 'don't think' or 'fail' is fundamentally different from the way humans don't think or fail. In any case, the way LLMs learn and human beings learn is completely different. There is no actual clue that we are approaching any inflection point in machine 'learning'.

> So, my bet is on "LLM but even bigger". If there's a point where LLMs begin to lag behind and novel architectures get a sharp advantage, we are yet to hit it.

We have been hearing this 'we are about to hit it' since the late 60s. The difference now is that the market is willingly investing insane amounts of money to make it possible. But again, there is no philosophical, theoretical, epistemological or biological clue that we are getting any closer to human-level intelligence. What we did observe in the last decade, though, is that we can build enormous machines that can statistically mimic human outputs, language and images being some of them. But that is not thinking.
