Story Detail of id 47688005 | Liveview Hacker News

jaggs 6 hours ago | on: GLM-5.1: Towards Long-Horizon Tasks

It's a great benchmark. Don't listen to the haters. This one is especially interesting.

https://aibenchy.com/compare/anthropic-claude-sonnet-4-6-med...