Chain of thought is like trying to improve JPG quality by re-compressing it several times. If it's not there, it's not there.
>It's not thinking
>it compressed the internet into a clever, lossy format with a nice interface, and it retrieves stuff from there.
Humans do both, so why can't LLMs?

>Chain of thought is like trying to improve JPG quality by re-compressing it several times. If it's not there, it's not there.
More like pulling out a deep-fried meme, looking for context, then searching Google Images until you find the most "original" JPG representation with the fewest artifacts. There is more data it can add confidently; it just has to re-think the problem with a renewed perspective and an abstracted-away, higher-level context/attention mechanism.
Empirically speaking, I have a set of evals, each with a prompt and an objective pass/fail result. I'm doing codegen, so I use syntax linting, passing tests, etc., to determine success. With chain-of-thought included in the prompting, the evals pass at a significantly higher rate. A lot of research has demonstrated the same in other domains.
If chain-of-thought can't improve quality, how do you explain the empirical results which appear to contradict you?
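For concreteness, the comparison looks roughly like the sketch below. This is a minimal illustration, not my actual harness: the `generate` stub, the prompt suffix, and the task dict layout are all placeholders, and extracting the final code block from a reasoning-heavy response is omitted.

```python
import ast
import subprocess
import tempfile
from pathlib import Path


def generate(prompt: str) -> str:
    """Placeholder for whatever model/API you call (hypothetical)."""
    raise NotImplementedError


def passes(code: str, tests: str) -> bool:
    """Objective pass/fail: code must parse cleanly and its tests must pass."""
    try:
        ast.parse(code)  # syntax check stands in for linting
    except SyntaxError:
        return False
    with tempfile.TemporaryDirectory() as d:
        Path(d, "solution.py").write_text(code)
        Path(d, "test_solution.py").write_text(tests)
        result = subprocess.run(["pytest", "-q", d], capture_output=True)
        return result.returncode == 0


def pass_rate(tasks: list[dict], cot: bool) -> float:
    """Fraction of tasks solved, with or without a chain-of-thought prompt."""
    hits = 0
    for task in tasks:
        prompt = task["prompt"]
        if cot:
            # The only difference between the two runs: ask for reasoning first.
            prompt += "\n\nThink through the problem step by step, then give the final code."
        hits += passes(generate(prompt), task["tests"])
    return hits / len(tasks)


# Compare: pass_rate(tasks, cot=False) vs. pass_rate(tasks, cot=True)
```

Holding everything else fixed and toggling only the chain-of-thought instruction is what makes the pass-rate difference attributable to the prompting.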