Hacker News
This is so uncannily close to the problems we're encountering at Pioneer, trying to make human+LLM workflows in high stakes / high complexity situations.

Humans are so smart: we make so many decisions and calculations at the subconscious/implicit level and take a lot of mental shortcuts. As we try to automate this by following the process exactly, we bring a lot of that implicit thinking out to the surface, and that slows everything down. So we've had to be creative about how we build LLM workflows.

Language seems to be confused with logic or common sense.

We've observed it previously in psychiatry (and modern journalism, but here I digress), but LLMs have made it obvious that grammatically correct, naturally flowing language requires a "world" model of the language and close to nothing of reality. Spatial understanding? Social cues? Common-sense logic? Mathematical logic? All optional.

I'd suggest we call the LLM language fundament a "Word Model" (not a typo).

Trying to distil a world model out of the word model. A suitable starting point for a modern remake of Plato's cave.

Language is the tool we use to codify a heuristic understanding of reality. The world we interact with daily is not the physical one, but an ideological one constructed out of human ideas from human minds. This is the world we live in, and the air we breathe is made partly of our ideas about oxygenation and partly of our concept of being alive.

It's not that these "human tools" for understanding "reality" are superfluous, it's just that they are second-order concepts. Spatial understanding, social cues, math, and so on are all constructs built WITHIN our primary linguistic ideological framing of reality.

To put this in coding terms: why would an LLM use Rails to make a project when it could just as quickly produce a project by writing directly to the socket?

To us these are totally different tasks that would actually require totally different kinds of programmers, but when every language is just more language, the inventions we made to expand the human brain's ability to delve into linguistic reality are of no use.
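To make "writing directly to the socket" concrete, here is a minimal sketch (mine, not from the thread) of emitting a raw HTTP response with no framework at all; `socket.socketpair` stands in for a real client connection so it runs anywhere:

```python
import socket

# A framework like Rails hides all of this; writing "directly to the socket"
# means assembling the raw HTTP bytes yourself.
server_end, client_end = socket.socketpair()

body = b"hello"
response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    b"\r\n" + body
)
server_end.sendall(response)

received = client_end.recv(4096)
assert received.startswith(b"HTTP/1.1 200 OK")

server_end.close()
client_end.close()
```

Every line of protocol bookkeeping here is exactly the kind of detail a framework abstracts away, which is the point being argued above.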

I can suggest one reason why an LLM might prefer writing in a higher-level language like Ruby over assembly. The reason is the same as why physicists and mathematicians like to work with complex numbers using "i" instead of explicit calculation over 4 real numbers. Using "i" allows us to abstract out and forget the trivial details. "i" allows us to compress ideas better. Compression allows for better prediction.
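The "i" analogy can be made concrete: one multiplication written with the built-in complex type versus the same multiplication expanded by hand over 4 real numbers (a small illustrative sketch, not from the thread):

```python
def mul_explicit(a, b, c, d):
    """(a + bi)(c + di) expanded over 4 real numbers: the uncompressed view."""
    return (a * c - b * d, a * d + b * c)

z1 = complex(3, 4)   # 3 + 4i
z2 = complex(1, -2)  # 1 - 2i

compressed = z1 * z2                  # one line; "i" carries the bookkeeping
explicit = mul_explicit(3, 4, 1, -2)  # the same bookkeeping done by hand

assert (compressed.real, compressed.imag) == explicit  # (11, -2)
```

The compressed form scales: chains of multiplications stay one expression, while the explicit form grows new bookkeeping at every step, which is the compression-helps-prediction point above.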
Except LLMs are trained on higher-level languages. Good luck getting your LLM to write your app entirely in assembly. There just isn't enough training data.
But in theory, with what training data there IS available on how to write in assembly, combined with the data available on what's required to build an app, shouldn't a REAL AI be able to synthesize the knowledge necessary to write a webapp in assembly? To me, this is the basis for why people criticize LLMs: if something isn't in the data set, it's just not conceivable by the LLM.
Yes. There is just no way of knowing how many more watts of energy it may need to reach that level of abstraction and depth: maybe one more watt, maybe never.

And the random noise in the process could prevent it from ever being useful, or it could let it find a hyper-efficient, clever way to apply cross-language transfer learning that allows a 1-to-1 mapping of your perfectly descriptive prompt to equivalent ASM... but just this one time.

There is no way to know where performance per parameter plateaus; or appears to on a projection, or actually does... or will, or deceitfully appears to... to our mocking dismay.

As we are currently hoping to throw power at it (we fed it all the data), I sure hope it is not the last one.

There isn't that much training data on reverse engineering Python bytecode, but in my experiments ChatGPT can reconstruct a (unique) Python function's source code from its bytecode with high accuracy. I think it's simulating the language in the way you're describing.
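A sketch of the kind of experiment described, using Python's standard `dis` module to dump a function's bytecode as text that one might then paste into the model and ask it to reconstruct the source (the function `add_evens` is my own toy example):

```python
import dis
import io

def add_evens(xs):
    """Toy function whose bytecode we dump."""
    return sum(x for x in xs if x % 2 == 0)

# Capture the disassembly as plain text; exact opcodes vary across
# CPython versions, which is part of what makes reconstruction nontrivial.
buf = io.StringIO()
dis.dis(add_evens, file=buf)
listing = buf.getvalue()

print(listing)
```

Handing `listing` to an LLM and asking for equivalent source is a reasonable reconstruction test, since the listing names opcodes, constants, and variable names but not the original expression structure.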
It’s in the name: Language Model, nothing else.
I think the previous commenter chose "word" instead of "language" to highlight that a grammatically correct, naturally flowing chain of words is not the same as a language.

Thus, Large Word Model (LWM) would be more precise, following his argument.

Bingo, great reply! This is what I've been trying to explain to my wife. LLMs use fancy math and our language examples to reproduce our language, but have no thoughts or feelings.
Yes, but the initial training sets did have thoughts and feelings behind them, and those are reflected back to the user in the output (with errors).