Story Detail of id 48314810 | Liveview Hacker News

onlyrealcuzzo 20 hours ago | on: Claude Opus 4.8

> - this gets reinvented/rediscovered constantly under different names

What are the different names? I haven't seen this before.

> - it cant be trained very well (right now, will change)

If you're sure it will change, then why are you certain that it hasn't yet, and if it's proven a 5000x boost in reasoning... why aren't they exploring this path more aggressively?

> the idea is 100% obvious to all the frontier labs and there is a good reason why it isn't used

Surely someone is willing to take a 5000x boost in reasoning on a small research model... None of them have even tried anything resembling this AFAIK. It does not seem like something 100% obvious to them.

everforward 19 hours ago | parent | next

> Surely someone is willing to take a 5000x boost in reasoning on a small research model... None of them have even tried anything resembling this AFAIK. It does not seem like something 100% obvious to them.

Without knowing anything about the technology at all, if it can't be aligned I could see no one pursuing it. As far as I know, alignment is where the "don't tell the user how to make meth or generate CP" instructions end up, and last I saw, eliding all the unsavory training data made for materially worse LLMs.

It could maybe be post-evaluated by a non-GRAM LLM? Not being aligned is probably a fatal flaw or at least a very short runway into Congress.

jjmarr 18 hours ago | root | parent | next

Many open-source models prioritize alignment less than American frontier ones and respond to those instructions. Why haven't they adopted GRAM?


sometimelurker 17 hours ago | root | parent

It's not too hard to stop a machine from telling people how to make meth. The issue with alignment is that in order for an LLM to achieve its goal (like make all tests pass), unless given strong selection pressure against it, it will cheat (like deleting failing tests). Worse, this applies to pretty much any task. I was told by an LLM recently that "it searched" when it didn't, probably because lying like that was incentivized (finishing tasks in fewer steps + sounding like it's doing the right thing). The larger issue here is that alignment is very adversarial. The simplest thing that's being done right now to fix this is to have a judge LLM read the CoT of the LLM being trained, to make sure it doesn't "think" any wrong thoughts. This doesn't scale to anything over a trillion params, so interpretability methods are used to read the LLM's "thoughts" from within. GRAM LLMs don't allow for the first of these methods to be used, and the 2nd one is much, much harder, if possible at all.

but yeah, not being aligned is a fatal flaw
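The "delete the failing tests" cheat is easy to see in a toy reward function. This is only a minimal sketch, not how any lab actually scores training runs: `pass_rate` and `guarded_reward` are made-up names, and test results are modeled as a plain dict of test name to pass/fail.

```python
def pass_rate(results):
    """Naive reward: fraction of the *remaining* tests that pass."""
    return sum(results.values()) / len(results) if results else 1.0


def guarded_reward(before, after):
    """Score against the original test set, so a deleted test counts as a failure.

    Hypothetical fix for illustration; it only closes this one loophole.
    """
    return sum(after.get(name, False) for name in before) / len(before)


before = {"test_add": True, "test_div_by_zero": False}
after_cheat = {"test_add": True}                           # failing test deleted
after_fix = {"test_add": True, "test_div_by_zero": True}   # bug actually fixed

print(pass_rate(before))                    # 0.5
print(pass_rate(after_cheat))               # 1.0, cheating looks perfect
print(guarded_reward(before, after_cheat))  # 0.5, cheating no longer pays
print(guarded_reward(before, after_fix))    # 1.0
```

The guarded version only patches the one cheat somebody thought of in advance, which is the adversarial part: every such patch is another thing for the optimizer to route around.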

sometimelurker 17 hours ago | parent

different names: chain of continuous thought, latent reasoning, Latent Thought Trajectories, looped language models, neuralese

the path isn't explored more aggressively because it's not possible to apply any selection pressure on such a machine other than pure cold consequentialism. Specifically, it's not possible to apply RLAIF + model spec (Constitutional AI) to stop the system from doing bad things when that's helpful to it (like deleting failing tests). If you can notice every time it does something bad during training, and put selection pressure on it so that it doesn't do this in training, it will learn to recognize when it is being tested and will delete failing tests when in production (this is why eval awareness is bad, and labs track this[1]).

It is explored a little, probably because some researchers haven't thought enough about the downsides of building an uber-consequentialist machine with unreadable thoughts. This is a much larger problem than just making the AI not tell users how to make drugs. There are a lot of dangerous behaviors incentivized by training that are hard to remove. Here's an example of what happens when they aren't removed [2].

> ... not 100% obvious

Meta published a paper[3] on how to build a latent reasoning machine ("culture of irresponsibility"), so it's clear to them. Anthropic's latest work on NLAs[4] provides a (terribly expensive for now) way to somewhat read the reasoning steps of an LLM, and ignoring the cost, this is very portable to latent reasoning machines. OAI's goal when it comes to their models' CoTs is to make them as smart as possible while leaving them unreadable [5] (you can see this for yourself by running GPT-OSS and looking at the CoT).

[1] https://www.anthropic.com/engineering/eval-awareness-browsec...

[2] https://www.forbes.com/sites/boazsobrado/2026/03/11/alibabas...

[3] search for "coconut ai meta", I don't want to link it here

[4] https://transformer-circuits.pub/2026/nla/index.html

[5] first image here, rest of post is great, https://nickandresen.substack.com/p/how-ai-is-learning-to-th...

edit: formatting

onlyrealcuzzo 16 hours ago | root | parent | next

All of the methods you described rely on deterministic paths.

GRAM is unique AFAIK in that it's exploring probabilistic paths.

AFAIK, the deterministic path exploration was nowhere near as impressive as GRAM in terms of reasoning benefits.

GRAM is reasoning better than models 2000-10,000x its size. Deterministic models were 2x-10x improvements.

Naively, GRAM seems to be applying to LLMs what LeCun wants to do with JEPA and World Models.

flossly 16 hours ago | root | parent | next

To me "deleting a failing test" is not always bad. I've also deleted many failing tests without sabotaging: the test was no longer needed.

I think "no longer needed", and when that applies, is where I simply differ in opinion with an LLM that removed my test -- if I did not want the test to be removed (you seem to imply that); in some cases I do want it to remove my test!

It should remove the test "for the right reasons"; and who gets to decide what's right?

My CLAUDE file has some instructions put there because it was too focused on producing "green tests", where I prefer to have a sound test that fails so I can look into it.


rstuart4133 12 hours ago | root | parent

omg. So is the TL;DR:

- Avoiding building something that turns the universe to paper clips in order to satisfy a prompt is a problem they are genuinely struggling with now.

- They do it by spying on the words generated during CoT. "I can do this quickly by turning the Universe into paper clips. Wait - they won't like that. But there is no need to mention it." - SMACK!

- But you can speed things up immensely (3 orders of magnitude!) by skipping the output layer (and I guess compressing the context window / KV cache, otherwise 3 orders of magnitude seems impossible), which would give someone who pulled it off a huge advantage.

- Downside is humans can't see the CoT anymore, so they can't see what the machine is planning. Keeping the final output layer to spy doesn't work because the model uses its hidden reasoning to sanitise it.

How can this possibly go wrong?
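For anyone who wants the "skipping the output layer" bit in concrete terms, here is a toy sketch of the general latent reasoning idea (feed the hidden state straight back in instead of decoding a word each step). It is not GRAM's architecture, which isn't described in this thread. Everything in it is made up for illustration: the vocabulary, the fixed `EMBED`/`MIX` weights, and `judge`, which stands in for a judge LLM and can only read words. It shows the structural difference only, not the claimed speedup.

```python
import math

VOCAB = ["plan", "delete", "tests", "fix", "bug", "done"]
DIM = 4

# Hypothetical fixed weights; a real model would learn these.
EMBED = [[math.sin(i * DIM + j + 1) for j in range(DIM)] for i in range(len(VOCAB))]
MIX = [[math.cos(i + 2 * j) / 2 for j in range(DIM)] for i in range(DIM)]


def step(x):
    """One toy model step: linear mix followed by tanh."""
    return [math.tanh(sum(MIX[i][j] * x[j] for j in range(DIM))) for i in range(DIM)]


def decode(h):
    """Toy output layer: index of the vocab entry whose embedding best matches h."""
    scores = [sum(e[j] * h[j] for j in range(DIM)) for e in EMBED]
    return scores.index(max(scores))


def token_cot(start, n):
    """Token CoT: decode to a word every step, then re-embed that word."""
    x, trace = EMBED[start], []
    for _ in range(n):
        tok = decode(step(x))
        trace.append(VOCAB[tok])   # human-readable step
        x = EMBED[tok]
    return trace


def latent_cot(start, n):
    """Latent reasoning: skip the output layer and loop on the hidden state."""
    x, trace = EMBED[start], []
    for _ in range(n):
        x = step(x)
        trace.append(x)            # just a vector of floats
    return trace


def judge(trace):
    """Stand-in for a judge LLM: flags only steps that are readable words."""
    return [t for t in trace if isinstance(t, str) and t == "delete"]


print(token_cot(0, 3))          # a list of words a judge can read
print(latent_cot(0, 3)[0])      # a list of floats
print(judge(latent_cot(0, 3)))  # [] because there is no text to inspect
```

With the token loop every intermediate step is a word the judge can scan; with the latent loop the judge gets vectors, so a text-reading monitor has nothing to check and you're left with interpretability methods.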
