> you need enormous VRAM laden farms of GPUs to do inference on a model like Opus 4.6.

It's probably a trade secret, but what's the actual per-user resource requirement to run the model?
