Story Detail of id 48394436 | Liveview Hacker News

dirkg 16 hours ago | on: Gemma 4 12B: A unified, encoder-free multimodal model

The RTX/DGX Spark and Mac Ultras with 128GB of unified RAM are all ~$5k. It's still an expensive toy for rich people; it might as well be an H100 for 99.9% of the population (not devs with high-paying jobs, of course).

The value of local models is allowing normal people to access AI without needing to subscribe to cloud services. This is especially important for the rest of the world, where even a 12GB GPU is extremely expensive.

There is no real viable local option that will come even close to Sonnet/Gemini Flash or the cheaper Chinese models. Even if your PC costs <$2k, you are never going to recoup the hardware costs, and the results will be far worse.
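
To make the "recoup the hardware costs" point concrete, here is a rough break-even sketch. Every number in it is a made-up assumption for illustration (the $2k machine matches the figure above; the cloud spend and power cost are hypothetical), and it ignores the quality gap entirely.

```python
def breakeven_months(hardware_cost, monthly_cloud_spend, monthly_power_cost=0.0):
    """Months until local hardware pays for itself versus a cloud bill.

    Returns None when running locally costs as much as (or more than) the
    cloud spend it replaces, i.e. the hardware never pays for itself.
    """
    monthly_saving = monthly_cloud_spend - monthly_power_cost
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical inputs, chosen only to show the arithmetic.
for spend in (20, 100):
    months = breakeven_months(2000, spend, monthly_power_cost=10)
    print(f"${spend}/month cloud spend: {months:.1f} months to break even")
```

With these assumed inputs, a light user replacing a $20/month subscription would need well over a decade to break even, while a heavy user spending $100/month would get there in about two years. Plug in your own numbers.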

green7ea 8 hours ago | parent | next

I'm using a Strix Halo laptop (~3k, 64GiB) and with Gemma 4 and Qwen 3.6, both at 8 bits, I'm seeing very impressive results.

As a work tool, this is reasonably priced. You can save a bit of money by opting for a non-laptop form factor.

organsnyder 7 hours ago | parent

My Framework Desktop with 128GB was about half that. I did luck out by buying before RAM prices went crazy, though.

I'm looking forward to the fallout when the data center bubble bursts. There's a good possibility we'll see a glut of hardware, either on the used market or from manufacturers that no longer have massive orders from OpenAI and the like.
