Story Detail of id 48394339 | Liveview Hacker News

dirkg 18 hours ago | on: Gemma 4 12B: A unified, encoder-free multimodal model

> For 16GB laptops, Qwen 3.5 9B is the undisputed champ.

You can run Qwen 3.6 35B-A3B on a 12-16GB VRAM GPU and it works pretty well.

https://www.youtube.com/watch?v=8F_5pdcD3HY&t=1s

Even the 27B can fit in some quants.

https://www.reddit.com/r/LocalLLaMA/comments/1tkmgwj/qwen27b...
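
For a rough sanity check on what fits, here is a back-of-envelope sketch that estimates weight size from parameter count and bits per weight. The bits-per-weight figures and the helper name `weight_gib` are assumptions for illustration, not measured file sizes, and the estimate ignores KV cache and runtime overhead.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed effective bits per weight; real quant formats mix precisions
# across tensors, so actual file sizes will differ.
ASSUMED_BPW = {"2-bit": 2.5, "3-bit": 3.5, "4-bit": 4.5, "8-bit": 8.5}

for name, params in [("9B", 9), ("27B", 27), ("35B", 35)]:
    for label, bpw in ASSUMED_BPW.items():
        print(f"{name} @ {label}: ~{weight_gib(params, bpw):.1f} GiB")
```

Anything that lands near or above your VRAM size needs either a smaller quant or partial offload to system RAM.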

Qwen IMO is far better for coding, especially agentic coding. Combined with something like Pi, it probably comes close enough to Sonnet for a lot of use cases.

The Gemma family is better for almost all other tasks you'd use a local LLM for.

ricardobayes 9 hours ago | parent | next

You can run it, but those heavily quantized models (IQ2, IQ4, Q2) will very likely underperform the 9B versions at Q6/Q8.

kanemcgrath 1 hour ago | root | parent

Something about Qwen models makes them hold up really well even at low quants. For most other models anything under q5 is cooked, but on 35B-A3B I can get a lot of things done even at q3_xl. It is definitely better than full-precision 9B.

selicos 6 hours ago | parent | next

I want to try a hybrid setup: Gemma 4 E4B with lots of context for general use, then Qwen 3.5 9B or larger for coding. Strix Halo setup this weekend, which may enable even larger Qwen models with tons of context.

dofm 8 hours ago | parent

The larger Gemma models are quite good at PHP. I would not be surprised if that was a training objective, since it's one of the more consumer-focussed programming languages. They have very good knowledge of WordPress hooks.
