Story Detail of id 48312774 | Liveview Hacker News

In my tests[0] it does a bit worse, and it's almost 2x more expensive than Opus 4.7...

I was surprised to see that it failed a data extraction test (it gets it right 2 out of 3 times, but one time it randomly returns null for a value instead).

It makes some sense that it fails more Trivia/Domain-specific knowledge tasks (I think models are increasingly trained for agentic use cases rather than general intelligence).

[0]: https://aibenchy.com/compare/anthropic-claude-opus-4-7-mediu...

XCSme, 21 hours ago

For some reason everything is 2x (2x cost, 2x avg response time, 2x reasoning and output tokens)...

Double-checking my test harness, but it's the first model that does this, so I doubt the issue is on my side...

EDIT: Harness seems correct; for straight coding tasks they perform identically: https://i.snipboard.io/5xbpzY.jpg

dwaltrip, 21 hours ago

Wait, doesn’t the blog post say the price is the same as 4.7?

> Claude Opus 4.8 is available everywhere today. Pricing for regular usage is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Pricing for fast mode is $10 per million input tokens and $50 per million output tokens.

Where do you see the 2x cost?

XCSme, 21 hours ago

The total cost of running my benchmarks was 1.6x higher than with Opus 4.7, mostly because of 2x output tokens:

https://i.snipboard.io/vrdwTa.jpg
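To make the arithmetic concrete, here is a rough sketch of how unchanged per-token prices can still give a higher bill when output tokens double. The prices are the ones quoted from the blog post above; the token counts are made up for illustration (they are not the actual benchmark numbers), chosen so that output accounts for 60% of the baseline cost.

```python
# Prices quoted in the thread for regular usage (USD per million tokens).
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for one benchmark run at the quoted prices."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


# Hypothetical token counts, NOT taken from the real benchmark.
baseline = run_cost(input_tokens=2_000_000, output_tokens=600_000)
# Same prompts, but the model emits twice as many output tokens.
doubled = run_cost(input_tokens=2_000_000, output_tokens=1_200_000)

ratio = doubled / baseline
print(f"baseline=${baseline:.2f} doubled=${doubled:.2f} ratio={ratio:.2f}x")
```

With these assumed counts, only the output half of the bill doubles, so the total lands between 1x and 2x depending on how much of the baseline cost was output tokens.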

dwaltrip, 20 hours ago

ah ok, thanks for clarifying!

spprashant, 21 hours ago

If it spends 2x the tokens to achieve the same result, that's effectively 2x the cost, in a manner of speaking.

SupLockDef, 21 hours ago

Releasing a new model is the new way to jack up the price hehe.

eshack94, 19 hours ago

That's exactly right.
