There is https://magazine.sebastianraschka.com/p/technical-deepseek, which traces the evolution of the DeepSeek model family.
> The goal of the proof verifier (LLM 2) is to check the generated proofs (LLM 1), but who checks the proof verifier? To make the proof verifier more robust and prevent it from hallucinating issues, they developed a third LLM, a meta-verifier.
The one thing I didn't quite understand (and which wasn't mentioned in their paper, unless I missed it) is why you can't keep stacking turtles. You probably get diminishing returns at some point, but why not have a meta-meta-verifier?