Story Detail of id 48386734 | Liveview Hacker News

philipkglass 1 day ago | on: Gemma 4 12B: A unified, encoder-free multimodal model

Since ollama has diverged from llama.cpp, it will take a bit of time for ollama to support multi-modality. If you're using plain llama.cpp it looks like a PR has already merged for this model with vision and audio support:

https://github.com/ggml-org/llama.cpp/pull/24077

zozbot234 1 day ago | parent

They've actually gone back to (a lightly patched) llama.cpp with the 0.30 release a few weeks ago, and have now vendored in an up-to-date release. Needless to say, this is great news for both projects!
