Story Detail of id 42009409 | Liveview Hacker News

danenania | 1 month ago | on: Chain-of-thought can hurt performance on tasks where thinking makes humans worse

> Chain of thought is like trying to improve JPG quality by re-compressing it several times. If it's not there it's not there.

Empirically speaking, I have a set of evals, each with a prompt and an objective pass/fail result. I'm doing codegen, so I use syntax linting, passing tests, etc. to determine success. With chain-of-thought included in the prompting, the evals pass at a significantly higher rate. A lot of research has demonstrated the same in various domains.

If chain-of-thought can't improve quality, how do you explain the empirical results which appear to contradict you?
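
To make the setup concrete, here is a minimal sketch of the kind of objective pass/fail check described above. It is not my actual harness: `passes`, `pass_rate`, and the sample strings are hypothetical names and hand-written placeholders, and `ast.parse` only stands in for a real syntax linter.

```python
import ast


def passes(candidate_source, tests):
    """Objective pass/fail: the code must parse and every test must hold."""
    try:
        ast.parse(candidate_source)  # stand-in for syntax linting
    except SyntaxError:
        return False
    namespace = {}
    try:
        # A real harness would run untrusted generated code in a sandbox.
        exec(candidate_source, namespace)
        return all(test(namespace) for test in tests)
    except Exception:
        return False


def pass_rate(candidates, tests):
    """Fraction of candidate programs that pass every check."""
    results = [passes(src, tests) for src in candidates]
    return sum(results) / len(results)


# Hand-written placeholders, NOT real model output.
tests = [lambda ns: ns["add"](2, 3) == 5]
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b) return a + b"  # fails the syntax check
wrong = "def add(a, b):\n    return a - b\n"  # parses but fails the test
```

In practice you would generate one batch of candidates with chain-of-thought in the prompt and one without, then compare `pass_rate` over the two batches on the same tests.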
