[flagged]
I manage a component of an internal compute product which serves ~a billion idempotent use-cases per quarter and I can confidently tell you that you're incorrect.
What I haven't been able to teach AI is the full distributed nature of the system, how we progressively roll out each service (about 30 unique ones) when we push updates, and how to read, write, and review my code while keeping all of this in-context (because believe me, if it's not in-context, it is useless to me). Don't get me started on all the containers, K8s configs, endpoint naming conventions...
My entire stack covers bare metal, virtualisation infrastructure, storage infrastructure... I could go on. At a certain scale, what matters isn't how fast you write something, but whether what you're writing is bulletproof.
You literally just proved my point.
Or, if we consider the fact that an LLM’s performance depends on the task’s similarity to others in the training set, it could be that one person is doing a fairly novel task and another is doing something very well represented in online code.