I use small models like Gemma to improve transcriptions from ASR models amongst other micro-tasks. I actually built out a fine-tuning whisper pipeline with all local (smaller) models meaning no cloud/big-tech co is able to train/sell my (private) data.
Repo is https://github.com/Rebreda/listenr - mainly geared toward Whisper fine-tuning, AMD hardware and local inference