Story Detail of id 47471746 | Liveview Hacker News

mmoustafa 23 hours ago | on: Tinybox – A powerful computer for deep learning

I would love to see real-life tokens/sec values advertised for one or more specific open-source models.

I'm currently shopping for offline hardware, and it is very hard to estimate the performance I will get before dropping $12K. I would love to have a baseline that I can always count on, e.g. 40 tok/s running GPT-OSS-120B using Ollama on Ubuntu out of the box.
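For anyone who wants to produce such a baseline on hardware they already have, here is a minimal sketch of how a tok/s figure could be measured with Ollama. It assumes a local Ollama server on its default port and relies on the `eval_count` and `eval_duration` (nanoseconds) fields that Ollama reports in a non-streaming `/api/generate` response. The helper names `tokens_per_second` and `measure` are made up for illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation throughput from an Ollama /api/generate response.

    eval_count is the number of generated tokens and eval_duration is
    the time spent generating them, in nanoseconds.
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def measure(model: str, prompt: str, url: str = OLLAMA_URL) -> float:
    """Run one non-streaming generation and return its tok/s (hypothetical helper)."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure("gpt-oss:120b", "Explain KV caching in two paragraphs.")` would return the generation speed for a single request, assuming that model tag has already been pulled. Prompt processing time is reported separately by Ollama and is not included in this number, so a single run is only a rough indication.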

atwrk 8 hours ago | parent | next

For reference, $12K gets you at least 4 Strix Halo boxes, each running GPT-OSS-120B at ~50 tok/s.
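The arithmetic behind that comparison can be sketched as follows, taking the numbers in this thread at face value (a $12K budget, 4 boxes, ~50 tok/s per box, and the 40 tok/s baseline mentioned above). The function name `cost_per_tok_s` is made up for illustration, and none of the figures are measured results.

```python
def cost_per_tok_s(budget_usd: float, boxes: int, tok_s_per_box: float) -> float:
    """Dollars spent per token/second of aggregate throughput.

    Aggregate throughput assumes each box serves its own independent
    request; a single request still runs at the per-box speed.
    """
    aggregate_tok_s = boxes * tok_s_per_box
    return budget_usd / aggregate_tok_s


# Figures quoted in the thread, not benchmarks.
four_boxes = cost_per_tok_s(12_000, boxes=4, tok_s_per_box=50)   # 60.0
one_machine = cost_per_tok_s(12_000, boxes=1, tok_s_per_box=40)  # 300.0
```

The aggregate figure only matters when several requests run in parallel; a single conversation would still see roughly the per-box speed.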

hpcjoe 22 hours ago | parent

Look for llmfit on GitHub. It will help with that analysis, and I've found it reasonably accurate. If you already have Ollama installed, it can download the relevant models directly.
