Story Detail of id 48608127 | Liveview Hacker News

stalfie 8 hours ago | on: GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2

One thing I wonder about hallucinations: on the surface, it seems like an easy problem for RLVR to target. Since you're already generating enormous numbers of reasoning traces that are verified against correct answers, just allow "don't know" as a valid answer. Then, on problems where none of the thousands of reasoning traces led to a correct answer, promote the traces that ended in "don't know" as training data. Essentially, you teach the model that "I don't know" is a valid answer.
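To make the selection rule concrete, here is a rough sketch of what I mean, for a single problem. Everything in it is made up for illustration (the `Trace` class, the `ABSTAIN` sentinel, the `select_traces` helper); it is not how any lab actually does this, and a real RLVR setup would assign rewards rather than just filter traces.

```python
from dataclasses import dataclass

# Hypothetical sentinel for the abstention answer (an assumption, not a standard).
ABSTAIN = "don't know"


@dataclass
class Trace:
    reasoning: str  # the sampled chain of reasoning
    answer: str     # the final answer the trace arrived at


def select_traces(traces, reference):
    """Pick which traces to promote as training data for one problem."""
    correct = [t for t in traces if t.answer == reference]
    if correct:
        # At least one trace was verified correct: promote those as usual.
        return correct
    # No trace reached the verified answer: promote the abstentions instead.
    return [t for t in traces if t.answer == ABSTAIN]


if __name__ == "__main__":
    traces = [
        Trace("guess A", "41"),
        Trace("unsure", ABSTAIN),
        Trace("guess B", "43"),
    ]
    # None of the traces matches "42", so only the abstaining trace is kept.
    print([t.reasoning for t in select_traces(traces, "42")])  # ['unsure']
```

Note that as soon as one trace is correct, the "don't know" traces for that problem are not promoted, so abstention only gets reinforced where the model never got the right answer.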

Sam Altman himself had a blog post about this a while ago that seemed to suggest the same idea, so I guess it's obvious to everyone. But if so, I assume it's just not as easy in practice.

