Story Detail of id 48372596 | Liveview Hacker News

827a 5 hours ago | on: Expanding Project Glasswing

I believe the correct way to interpret AISI’s findings is that both Mythos and 5.5-Cyber are capable of solving their full benchmark (the only two models that can); Mythos does it with fewer tokens and more consistently.

Two things of note. First, 5.5-Cyber is likely to be substantially cheaper than Mythos, given that it is priced around Opus. Second, AISI has never tested OpenAI’s best public model and actual Mythos competitor, 5.5-Pro.
