Not discussing Mythos here, but Opus. In my experience Opus has been significantly better at SWE than GPT or Gemini, so I'm confused about why it ranks clearly lower than GPT, and even lower than Gemini.
When did you last compare them? Codex right now is considerably better in my experience. Can't speak for Gemini.
A secret art known to the cognoscenti as "benchmark gaming".