Hacker News
Today I was a few hours into chasing down a very tricky timing-dependent bug with GPT 5.5 and we were starting to go in circles. I noticed Opus 4.8 had shown up in GitHub Copilot so I switched over and pointed it at my notes so far. After another hour of steady progress it tracked the bug down to some missing synchronisation in an upstream library which was occasionally corrupting a linked list. N=1 but worth every one of those rather expensive 15x requests today. 15x... yeah.
That is interesting. Are you saying that GPT 5.5 could not fix an issue that Opus 4.8 could? Are you sure this is not due to fresh context?

I do notice this tendency for 5.5 to go in endless circles.

That's my initial experience, yes. It's hard to compare these things cleanly of course. I went through several new contexts on GPT and it just couldn't get traction -- it became hard to keep it focused on "yes there's clearly a race, but what actual persistent state got broken?" It just wanted to change the thread priorities so that the problem didn't occur, and kept doubling down on that as the solution. Opus made some missteps too but it responded well to my corrections - 2 or 3 significant ones along the way - and it was prepared to keep digging on my exact goal until it found the real issue.
GPT 5.5 has felt worse than 5.4 for the last few weeks. Again N=1, but I would be interested to see how Opus 4.8 and GPT 5.4 match up.