Hacker News new | past | comments | ask | show | jobs | submit
Really interesting approach. Curious how the 2-bit quantization affects the model's reasoning ability on longer chains of thought vs shorter prompts. The benchmarkslook solid but real-world usage seems like a different story based on the comments here.