Can someone explain exactly what the "unknown" of neural networks is? We built them, we know what they're made of and how they work. Yes, we can't map out every single connection between nodes in this "multilayer perceptron", but don't we know how those connections are formed?
SOTA LLMs like GPT-4o can natively understand b64-encoded text. We have algorithms that can encode and decode b64. Is that what GPT-4o is doing? Did training learn that algorithm? Clearly not, or at least not completely, because typos in the b64 that would destroy any chance of our algorithms recovering the original text are barely an inconvenience for 4o.

So how is it decoding b64 then? We have no idea.
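
To make the contrast concrete, here's a rough Python sketch (the message and the corruption positions are arbitrary choices of mine): the textbook decoder has no notion of meaning, so a couple of substituted characters silently scramble the recovered bytes, whereas the same kind of noise barely bothers 4o.

    import base64

    msg = "Neural networks are hard to interpret."
    encoded = base64.b64encode(msg.encode()).decode()

    # The textbook algorithm recovers the text only if every character is intact.
    print(base64.b64decode(encoded).decode())

    # Corrupt two characters, the way a typo-ridden prompt might.
    corrupted = encoded[:10] + "x" + encoded[11:20] + "Q" + encoded[21:]

    # The substituted characters are still valid base64, so decoding "works",
    # but the underlying bits shift and part of the message comes back mangled.
    print(base64.b64decode(corrupted).decode(errors="replace"))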

We don't build neural networks. Not really. We build architectures and then train them. Whatever they learn is beyond direct human control, apart from supplying the training data.

What they learn is largely unknown beyond trivial toy examples.

We know connections form, we can see the weights, we can even watch the matrix multiplications happen. We don't know what any of those calculations are doing. We don't know what they mean.
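
As a small illustration (plain NumPy, with made-up random weights standing in for trained ones): every number and every intermediate matrix product is right there to inspect, and none of it tells you what the network has learned.

    import numpy as np

    rng = np.random.default_rng(0)

    # A toy two-layer perceptron; random weights stand in for trained ones.
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

    x = rng.normal(size=(1, 4))      # one input vector

    h = np.maximum(0, x @ W1 + b1)   # we can watch this matrix multiply happen
    y = h @ W2 + b2                  # and this one

    print(W1)                        # every weight is visible,
    print(h, y)                      # and so is every intermediate value --
    # but nothing here says what feature (if any) hidden unit 3 represents,
    # or how the output would change if W1[2, 5] were nudged.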

Would an alien understand C code just because it could watch the code execute?

We don't know what each connection means or what information is encoded in each weight. We don't know how the network would behave differently if any of its millions or trillions of weights were changed.

Compare this to a dictionary, where it's obvious what information is on each page and each line.

Skipping some detail: the model applies many high-dimensional functions to the input, and we don't know why those particular functions solve the problem. Reducing the high-dimensional weights to human-readable values is non-trivial, and multiple neurons interact in unpredictable ways.

Interpretability research has produced many useful results and pretty visualizations[1][2], and there are many efforts to understand Transformers[3][4], but we're far from being able to completely explain the large models currently in use.

[1] - https://distill.pub/2018/building-blocks/

[2] - https://distill.pub/2019/activation-atlas/

[3] - https://transformer-circuits.pub/

[4] - https://arxiv.org/pdf/2407.02646

The brain serves as a useful analogy, even though LLMs are not brains. Just as we can't fully understand how we think merely by examining all of our neurons, understanding LLMs requires more than analyzing their individual components. Decoding LLMs is most likely easier, but that doesn't mean it's easy.
We know how they are formed (and how to form them); we don't know why forming them in that particular way solves the problem at hand.

Even this characterization is no longer strictly valid; there is a great deal of research into what's going on inside the black box. The problem was never that it was a black box (we can look inside at any time) but that it was hard to understand. KANs help put some of that on a mathematical footing. Generating mappings of activations over data similarly grants insight.
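
For what that last sentence looks like in practice, here's a minimal PyTorch sketch (the untrained stand-in model and random stand-in data are assumptions of mine): a forward hook records a hidden layer's activations across a dataset, which you can then cluster, reduce, or plot against input properties. The hard part remains saying what the resulting structure means.

    import torch
    import torch.nn as nn

    # Stand-in model; in practice this would be the trained network of interest.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    activations = []

    def record(module, inputs, output):
        # Keep a copy of this layer's output for every batch pushed through.
        activations.append(output.detach().clone())

    # Hook the hidden ReLU so its activations are captured on each forward pass.
    handle = model[1].register_forward_hook(record)

    data = torch.randn(100, 16)      # stand-in dataset
    with torch.no_grad():
        model(data)
    handle.remove()

    acts = torch.cat(activations)    # (100, 32): one activation vector per input
    print(acts.shape)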

* Given the training data and the architecture of the network, why does SGD with backprop find the given f, rather than any other member of an infinite set?

* Why is there a set of f, each with 0 loss, that all work? (A toy sketch below makes this concrete.)

* Given the weight space, and an f within it, why/when is a task/skill defined as a subset of that space covered by f?

I think a major reason these are hard to answer is that it's assumed that NNs are operating within an inferential statistical context (i.e., reversing some latent structure in the data). But they're really bad at that. In my view, they are just representation-builders that find proxy representations in a proxy "task" space (roughly, proxy = "shadow of some real structure, as captured in an unrelated space").
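
On the zero-loss question in the list above: a toy NumPy sketch (made-up data) shows why a finite training set can't pin the function down by itself. Two very different functions fit the same points exactly, so zero training loss alone doesn't determine which one a learner will, or should, return.

    import numpy as np

    # Five training points drawn from y = x.
    xs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
    ys = xs.copy()

    def f1(x):
        return x                     # the "intended" function

    def f2(x):
        # Adds a polynomial that vanishes exactly on the training inputs,
        # so it interpolates the same points but disagrees everywhere else.
        return x + 100 * x * (x - 0.25) * (x - 0.5) * (x - 0.75) * (x - 1.0)

    for f in (f1, f2):
        print(f.__name__, "training loss:", np.mean((f(xs) - ys) ** 2))  # 0.0 for both

    # Off the training grid the two zero-loss solutions disagree:
    x_test = np.array([0.125, 0.6, 0.9])
    print(f1(x_test))
    print(f2(x_test))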

We know the process for training a model, but when a model makes a prediction we don't know exactly "how" it arrives at that prediction.

We can use the economy as an analogy. No single person really understands the whole supply chain. But we know that each person in the supply chain is trying to maximize their own profit, and that ultimately delivers goods and services to a consumer.

There's a ton of research going into analysing and reverse-engineering NNs; this "they're mysterious black boxes and forever inscrutable" narrative is outdated.