> We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months...

This sounds suspiciously like a capacity story masquerading as a safety story.

Approx. 5% of sessions? That's insanely high.