Story Detail of id 48443622 | Liveview Hacker News

This is literally what Taalas has done with chatjimmy.ai.

Try it. It's Llama 3.1 8B at 16,000 tokens per second.

Wow, that's incredibly fast. I like this outcome more than centralized datacenters.