My experience has been that raw “one-shot intelligence” hasn’t improved as dramatically in the last year, but the workflow around the models has improved massively.
When you combine models with:
tool use
planning loops
agents that break tasks into smaller pieces
persistent context / repos
the practical capability jump is huge.