regularfry 7 hours ago | on: Memory has grown to nearly two-thirds of AI chip component costs

If you can do what you need with qwen3.6-27b, it starts to look really interesting. That model is crazy good for the size, but it's a pain tweaking the params to run it on a 4090 with decent context and decent token speed. A 5090 looks tasty from that point of view, and only more so if you think in terms of the probability of that model being roflstomped by something in the same weight class in the next couple of years. I reckon that probability is significantly non-zero, but fundamentally it's a guess.
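To put rough numbers on why the 4090 is a squeeze and the 5090 looks roomier, here is a back-of-the-envelope sketch of how much VRAM the weights alone would take. It assumes the "27b" in the model name means about 27 billion parameters, that the cards offer 24 and 32 GiB, and that the quantization widths shown are ones you might actually use; it ignores KV cache, activations, and runtime overhead, all of which eat into the headroom. The function names are made up for illustration.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def headroom_gib(vram_gib: float, params_billion: float, bits_per_weight: float) -> float:
    """VRAM left for context (KV cache) and overhead; negative means the weights do not fit."""
    return vram_gib - weight_gib(params_billion, bits_per_weight)


if __name__ == "__main__":
    # Assumed values: 27B parameters, 24 GiB (4090) and 32 GiB (5090) cards.
    for bits in (4, 8):
        for card, vram in (("4090", 24), ("5090", 32)):
            left = headroom_gib(vram, 27, bits)
            print(f"{bits}-bit weights on a {card}: {left:+.1f} GiB left over")
```

Under those assumptions, 4-bit weights leave room for context on either card, while 8-bit weights only fit on the 32 GiB card, which is roughly the trade-off being described.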

gruez 6 hours ago | parent

>If you can do what you need with qwen3.6-27b, it starts to look really interesting.

What's the use case here? Churning out massive amounts of slop code through autonomous agents? Running openclaw 24/7? I think the proliferation of Codex and Claude Code, compared to any of the cheaper open models, suggests that, at least for most software development, the 50-75% discount of open models isn't worth the hassle of the decreased intelligence.

weitendorf 4 hours ago | root | parent

I think there is a reasonable basis for gambling that small models capable of fitting on a 32GB card will continue to advance over the next 5 years and eventually approach Gemini Flash 3.5 / Sonnet 4.6 levels of capability, which I would consider to be past the threshold of “probably worth the cost and hassle of running 24/7” if the upfront cost of the hardware were palatable.

My use case would primarily be in search, integration, and indexing other software projects with my own, as well as transcription/indexing of interesting video and audio content (eg Dwarkesh interviews) that I don’t have time to watch but want to easily search and apply to my projects, and search/indexing for useful information from things like Linux kernel and security mailing lists. Basically there is a lot of stuff that, if the cost were low enough, I would point a reasonably intelligent AI at to distill out useful information and apply it to my projects, or just cherry pick the interesting things out and surface them to me so I don’t have to wade through all the mundane stuff and man-made slop getting in the way.

gruez 4 hours ago | root | parent

>My use case would primarily be in search, integration, and indexing other software projects with my own, as well as transcription/indexing of interesting video and audio content (eg Dwarkesh interviews) that I don’t have time to watch but want to easily search and apply to my projects, and search/indexing for useful information from things like Linux kernel and security mailing lists. Basically there is a lot of stuff that, if the cost were low enough, I would point a reasonably intelligent AI at to distill out useful information and apply it to my projects, or just cherry pick the interesting things out and surface them to me so I don’t have to wade through all the mundane stuff and man-made slop getting in the way.

All of that feels like what a $20 ChatGPT pro subscription is for, maybe with slightly better tool use capabilities. There's no way that spending $4000 on a GPU would ever be worth it if all you're doing is running a handful of queries per day.
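The arithmetic behind that claim can be sketched as a simple break-even calculation. This takes the $4000 and $20 figures from the comment at face value and treats the subscription as billed monthly, which is an assumption. It deliberately ignores resale value, capability differences, and usage limits, and the optional power-cost parameter is a hypothetical knob, not a measured number.

```python
def breakeven_months(hardware_cost: float,
                     monthly_subscription: float,
                     monthly_power_cost: float = 0.0) -> float:
    """Months until buying hardware costs less than paying the subscription.

    Returns infinity if running the hardware costs as much per month
    as the subscription it would replace.
    """
    monthly_saving = monthly_subscription - monthly_power_cost
    if monthly_saving <= 0:
        return float("inf")
    return hardware_cost / monthly_saving


if __name__ == "__main__":
    months = breakeven_months(4000, 20)
    print(f"Break-even after {months:.0f} months ({months / 12:.1f} years)")
```

With zero power cost this comes to 200 months, so the purchase only makes sense if usage is heavy enough that the subscription comparison no longer applies.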
