Hacker News

Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

https://github.com/mattmireles/gemma-tuner-multimodal
I run Whisper large-v3 on an M2 Max with 96 GB, and even with just inference the memory gets tight on longer audio; I can only imagine what fine-tuning looks like. Does 64 GB vs. 96 GB make a meaningful difference for Gemma 4 fine-tuning, or does it just push the OOM wall back a bit? I've been wanting to try local fine-tuning on Apple Silicon, but the tooling gap has kept me on inference only so far.
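As a rough way to reason about the 64 GB vs. 96 GB question, here is a back-of-envelope memory estimate for LoRA-style fine-tuning with frozen base weights. The parameter count, LoRA fraction, and byte sizes below are illustrative assumptions, not measured numbers for Gemma 4, and activation memory (which grows with audio length) is deliberately left out.

```python
# Back-of-envelope peak-memory estimate for LoRA fine-tuning on unified
# memory. All constants are illustrative assumptions, not measurements.

def lora_finetune_gb(params_b: float, bytes_per_weight: int = 2,
                     lora_frac: float = 0.01) -> float:
    """Rough GB needed: frozen weights + LoRA params + optimizer + grads.

    Excludes activations and KV caches, which grow with sequence length
    (i.e. with audio duration) and are often what actually triggers OOM.
    """
    base = params_b * 1e9 * bytes_per_weight   # frozen base weights (bf16)
    lora = params_b * 1e9 * lora_frac * 4      # trainable LoRA params (fp32)
    opt = lora * 2                             # AdamW first/second moments
    grads = lora                               # gradients, LoRA params only
    return (base + lora + opt + grads) / 1e9

# A hypothetical 27B-parameter model lands near 58 GB before activations,
# which is exactly the zone where 64 GB is tight and 96 GB has headroom.
print(round(lora_finetune_gb(27), 1))
```

By this sketch the static footprint alone nearly fills 64 GB for a model in that size class, so the extra 32 GB mostly buys room for activation peaks on long audio rather than moving the wall by an order of magnitude.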
I'm pretty excited about the Edge Gallery iOS app with Gemma 4 on it, but it seems like they hobbled it: no access to Intents, and you have to write custom plugins for web search, etc. Does anyone have a favorite way to run these usefully? ChatMCP works pretty well but only supports models via API.
Nice! I've been wanting to try local audio fine-tuning. Hopefully it works with music vocals too.
> I had 15,000 hours of audio data

Do you really need that much data for fine-tuning?

Just a heads-up: I found NVIDIA Parakeet to be way better than Whisper. It's faster, uses less compute, the output is better, and there are more options for the output format. I am using parakeet-mlx from the command line. Check it out!
Thanks for doing this. Looks interesting, I'm going to check it out soon.
This is super cool, will definitely try it out! Nice work