* Given the training data and the architecture of the network, why does SGD with backprop find the particular f it does, versus any other of an infinite set of candidates?
* Why is there a whole set of functions f, each with zero loss, that work? (A toy sketch follows this list.)
* Given the weight space, and an f within it, why/when is a task/skill defined as a subset of that space covered by f?
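As a toy illustration of the second question (a minimal sketch of my own, not something from the post): fit an overparameterized one-hidden-layer tanh net to a handful of points with plain full-batch gradient descent and backprop. Two different random seeds should both drive the training loss to (near) zero, yet land on clearly different weights, i.e. different points in weight space that each implement a zero-loss fit; which one you get is exactly the selection question above.

```python
import numpy as np

# 4 training points; far fewer constraints than parameters below.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[0.0], [1.0], [0.0], [1.0]])

def train(seed, hidden=20, steps=30000, lr=0.02):
    """Fit y = f(X) with a 1-hidden-layer tanh net via full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(1, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)                 # hidden activations, shape (4, hidden)
        pred = h @ W2 + b2                       # predictions, shape (4, 1)
        err = pred - y
        loss = np.mean(err ** 2)
        # Backprop: gradients of the mean-squared error w.r.t. each parameter.
        d_pred = 2.0 * err / len(X)
        dW2 = h.T @ d_pred
        db2 = d_pred.sum(axis=0)
        d_h = (d_pred @ W2.T) * (1.0 - h ** 2)   # tanh' = 1 - tanh^2
        dW1 = X.T @ d_h
        db1 = d_h.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return loss, np.concatenate([W1.ravel(), b1, W2.ravel(), b2])

loss_a, theta_a = train(seed=0)
loss_b, theta_b = train(seed=1)
print(f"seed 0 final loss: {loss_a:.2e}")   # expected near zero
print(f"seed 1 final loss: {loss_b:.2e}")   # expected near zero
print(f"parameter distance between the two fits: {np.linalg.norm(theta_a - theta_b):.2f}")
```

Both runs fit the same data, but the two solutions sit far apart in weight space (and, generically, differ off the training set), so "which f does training select, and why?" is a real question rather than a technicality.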
I think a major reason these questions are hard to answer is the assumption that NNs operate within an inferential-statistical context (i.e., recovering some latent structure in the data). But they're really bad at that. In my view, they are just representation-builders that find proxy representations in a proxy "task" space (roughly, a proxy = "the shadow of some real structure, as captured in an unrelated space").