Hacker News
ryeguy
3 hours ago
on: Claude Fable 5
Did you read the blog post? They compare against deepswe and call it out as the worst one for false positives (cases where the solution failed, but the benchmark assessed it as correct). It also has less language variety.