Hacker News

Pretraining Language Models via Neural Cellular Automata

https://hanseungwook.github.io/blog/nca-pre-pre-training/
Reminds me of "Universal pre-training by iterated random computation" https://arxiv.org/pdf/2506.20057, with a somewhat less formal approach.

I wonder if there is a closed-form solution for these kinds of initialization methods (call them pre-training if you wish): one that would let attention heads detect a variety of diverse patterns, yet be more structured than random init.
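As a purely speculative sketch of what "more structured than random" might mean (nothing here comes from the linked paper; the function name and frequency scheme are invented for illustration), one could seed each attention head's projection with a distinct sinusoidal basis plus a little noise, so heads start out tuned to different frequency bands instead of pure noise:

```python
import numpy as np

def structured_head_init(n_heads, d_model, d_head, seed=0):
    """Hypothetical structured init: give each head its own
    sinusoidal basis (one frequency band per head) plus small
    random noise, instead of a purely random matrix."""
    rng = np.random.default_rng(seed)
    heads = []
    for h in range(n_heads):
        freq = 2.0 ** h  # each head gets a different frequency band
        basis = np.array([
            np.sin(freq * np.arange(d_model) * 2 * np.pi / d_model + phase)
            for phase in np.linspace(0.0, np.pi, d_head)
        ]).T  # shape (d_model, d_head)
        noise = 0.01 * rng.standard_normal((d_model, d_head))
        heads.append(basis / np.sqrt(d_model) + noise)
    return heads

# e.g. query projections for 4 heads in a 32-dim model
W_q = structured_head_init(n_heads=4, d_model=32, d_head=8)
```

Whether such a closed-form scheme actually helps would of course need the kind of empirical comparison these papers do against random init.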

I did a similar project, but using 3D fractals I found on Shadertoy, fed into ViTs. They are extremely simple iterative functions that produce a ton of scene-like complexity.
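The 3D signed-distance fractals on Shadertoy are more involved, but as a minimal stand-in for "simple iterative function, complex output", a 2D Julia-set escape-time iteration shows the idea (this is a generic sketch, not the commenter's pipeline):

```python
import numpy as np

def julia_image(size=64, c=-0.7 + 0.27015j, max_iter=50):
    """Escape-time rendering of z -> z**2 + c over a grid.
    One line of iteration produces intricate structure that
    could serve as a synthetic image for a ViT."""
    y, x = np.mgrid[-1.5:1.5:size * 1j, -1.5:1.5:size * 1j]
    z = x + 1j * y
    img = np.zeros(z.shape, dtype=np.int32)   # escape time per pixel
    alive = np.ones(z.shape, dtype=bool)      # not yet escaped
    for i in range(max_iter):
        z[alive] = z[alive] ** 2 + c
        escaped = np.abs(z) > 2
        img[alive & escaped] = i
        alive &= ~escaped
    return img

img = julia_image()
```

The integer escape-time array can be normalized and treated like any other single-channel training image.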

I have a pet theory that the developing visual cortex is linked to some mechanism like this. You just need proteins that create some sort of resonating signal feeding into the neurons as they grow (obviously this is hand-wavy), but similar feedback loops guide nervous system growth in zebrafish, for example.

> The key: since every sequence has a unique latent rule, the model must infer that rule in-context to predict what comes next. This in-context learning ability underpins many of the key reasoning capabilities observed in language models.

This is a remarkable paper. It's the first time I've heard of someone training on the actual thing we're trying to get these models to do!

---

> This raises a radical question: Is natural language the only path to intelligence?

Of course not! We have octopuses, ravens, etc., which in many domains display higher intelligence than frontier AIs.

"Embodied reasoning" (a billion years of genetic-algorithm brute force solving physical tasks, to name one path) is definitely a very practical form of intelligence, although we're taking some shortcuts in replicating it.

I'm wondering if simplified analog tasks like Box2D puzzles would help too (or perhaps even simpler ones: Hanoi? Blocks world?). I know many companies are using simulations of 3D worlds for this.
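If Hanoi were used as such a task, even the classic recursive solver yields structured move sequences that could serve as synthetic reasoning traces for a sequence model (a sketch of the idea, not anything from the thread or the paper):

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Classic Tower of Hanoi: move n disks from src to dst via aux.
    Returns the full move list, which could be tokenized as a
    synthetic 'reasoning trace' for pretraining."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, src, dst, aux)   # clear n-1 disks onto aux
            + [(src, dst)]                      # move the largest disk
            + hanoi_moves(n - 1, aux, src, dst))  # restack onto dst

moves = hanoi_moves(3)  # 2**3 - 1 = 7 moves
```

Each trace has a clean latent structure (the recursion), which is exactly the kind of inferable regularity the latent-rule framing asks for.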

What I don't understand is how that can integrate with an LLM (physical intelligence would seem to require specialized circuitry, if only for latency). But maybe once we have good specialized models, LLMs can be trained on their synthetic data?

Neural cellular automata are interesting because they shift learning from “predict tokens” to “model state evolution.” That feels much closer to a transition-based view of systems, where structure emerges from repeated local updates (transitions) rather than being encoded explicitly.
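The "repeated local updates" view can be made concrete even with an ordinary (non-neural) cellular automaton. A single Game-of-Life step, for instance, applies one local rule at every cell, and global structure emerges only through iteration (a generic sketch, not the NCA from the post):

```python
import numpy as np

def ca_step(grid):
    """One Game-of-Life update on a toroidal grid: each cell's next
    state depends only on its 8 neighbors, yet repeated application
    produces global pattern evolution."""
    # count live neighbors by summing the 8 shifted copies of the grid
    n = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    born = (grid == 0) & (n == 3)
    survive = (grid == 1) & ((n == 2) | (n == 3))
    return (born | survive).astype(np.uint8)

# a "blinker" oscillates with period 2 under repeated local updates
g = np.zeros((5, 5), dtype=np.uint8)
g[2, 1:4] = 1
g2 = ca_step(ca_step(g))
```

A *neural* CA replaces the hand-written birth/survival rule with a small learned network applied at every cell, but the transition-based structure is the same.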

I'm working on a theoretical/computational framework, the Functional Universe, intended for modeling physical reality as functional state evolution. I would say it could be used to replicate your CA process. I won't link it here, to signal good faith in discussing this; it's on my GitHub.

> The long-term vision is: foundation models that acquire reasoning from fully synthetic data, then learn semantics from a small, curated corpus of natural language. This would help us build models that reason without inheriting human biases from inception.
Honestly, I never thought about reasoning this way, but it seems kind of obvious now that someone has done it. Very interesting.