Story Detail of id 42769049 | Liveview Hacker News

I love that they included some unsuccessful attempts. MCTS doesn't seem to have worked for them.

Also wild that few-shot prompting leads to worse results in reasoning models. OpenAI hinted at that as well, but it's always just a sentence or two, no benchmarks or specific examples.
