Story Detail of id 48313956 | Liveview Hacker News

jonnycoder 20 hours ago | on: Claude Opus 4.8

No, no, it's been pretty easy with software engineering. I work on two types of projects, and it's very easy to ask Claude for a plan, then have GPT 5.5 rip it to shreds and find legit issues, and vice versa. If both GPT 5.5 and Claude 4.8 can independently create a plan and both find no critical or high-severity issues, then we will be at that point.

replwoacause 11 hours ago | parent | next

I wouldn't say the vice versa is true. GPT 5.5 routinely finds major mistakes made by Opus 4.7, but I've yet to have it work the other way around.

elcritch 15 hours ago | parent

Additionally, running GPT-5.5 on medium sometimes gives me better results than running it on high. In either mode I still have to push the models in the right direction.
