Hacker News new | past | comments | ask | show | jobs | submit
That's 4 32GB GPUs with 600GB/s bw each. This model is not running on that scale GPUs. I think something like 96GB RTX PRO 6000 Blackwells would be the minimum to run a model of this size with performance in the range of subscription models.
> I think something like 96GB RTX PRO 6000 Blackwells would be the minimum to run a model of this size with performance in the range of subscription models.

GLM 5.1 has 754B parameters tho. And you still need RAM for context too. You'll want much more than 96GB ram.