We just finished our initial coding evals of Opus 4.8. Anthropic definitely heard the backlash over Opus 4.7, and they made up for it today.
Subjectively, it's also quite enjoyable to use (although it feels a bit slower on max reasoning), and it's the first Anthropic model that can implement a complex feature without Codex finding 100 bugs in the result.
Data at https://gertlabs.com/rankings