This isn't hard to understand:
A "model is nerfed" claim hits social media.
Someone else sees it, frequency bias makes them think their model is also nerfed, and they amplify the claim.
Now it spreads like a virus, even if the model never changed.
Social dynamics like this are well understood in psychology.
> If the LLM used to be able to achieve set goals and no longer could, it is already a sign of the distribution shift.
The more likely explanation is that you're looking at older LLMs through rose-tinted glasses and misremembering what they could achieve.
Otherwise you could measure the shift in the output tokens, and you'd see the better tps and latency that a quant would bring.
Your own evals would trend down over time.
But no one, not one person, has presented empirical evidence of being served a quant. Just vibes.
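
For what it's worth, checking this is cheap. Below is a minimal sketch, assuming you run the same fixed eval set against the model on two different dates and record how many cases pass each time. The function name and the numbers in the usage note are hypothetical, and a two-proportion z-test is just one reasonable choice for separating a real drop from run-to-run noise.

```python
import math


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """Compare pass rates from two eval runs (a = earlier, b = later).

    Returns (z, p_value) for a two-sided two-proportion z-test.
    A negative z means the later run passed a smaller share of cases.
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    # Pooled pass rate under the null hypothesis "nothing changed".
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both runs passed everything or nothing: no detectable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Usage would be something like `z, p = two_proportion_z(90, 100, 60, 100)` with your own counts. A clearly negative `z` with a small `p` is the kind of evidence that would back up a "nerfed" claim. The same before/after logging works for tps and latency if you record them per request.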