Quickly deployed it to check some benchmarks relevant for German. These are the results for CohereLabs/include-base-44, German only (Gemma 4 12B: 61.9%):
Gemma 4 26B (a4b MoE) 0.647
Qwen 3 14B 0.621
Gemma 4 12B 0.618
Ministral 14B 2512 0.604
Gemma 3 12B 0.547
The Qwen 3 14B vs Gemma 4 12B difference is within random variance; in some repeat runs they actually got the exact same score. The next step up, Gemma 4 31B, gets 0.676 on this. Or allow some reasoning: Qwen 3 14B (reasoning) also gets 0.676. I'll run some cheat-proof benchmarks tomorrow to see if Qwen is still on top.
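For anyone who wants to sanity-check the "within random variance" remark, here is a rough sketch of how the sampling noise on an accuracy score can be estimated. The question count `N_QUESTIONS` is a placeholder I made up, not the real size of the German subset, and treating the two runs as independent is a simplification, since both models answer the same questions.

```python
import math

# ASSUMPTION: placeholder question count, not the real size of the
# German subset of include-base-44. Replace with the actual number.
N_QUESTIONS = 500


def accuracy_std_error(acc: float, n: int) -> float:
    """Binomial standard error of an accuracy measured on n questions."""
    return math.sqrt(acc * (1.0 - acc) / n)


def gap_in_std_errors(acc_a: float, acc_b: float, n: int) -> float:
    """Gap between two accuracies, in units of the gap's standard error.

    Treats the two runs as independent, which is a simplification.
    """
    se_gap = math.sqrt(
        accuracy_std_error(acc_a, n) ** 2 + accuracy_std_error(acc_b, n) ** 2
    )
    return (acc_a - acc_b) / se_gap


# Scores taken from the list above.
qwen_3_14b = 0.621
gemma_4_12b = 0.618

z = gap_in_std_errors(qwen_3_14b, gemma_4_12b, N_QUESTIONS)
print(f"std error per score: {accuracy_std_error(qwen_3_14b, N_QUESTIONS):.4f}")
print(f"gap / std error of gap: {z:.2f}")
```

With a few hundred questions, the 0.003 gap comes out as a small fraction of one standard error, which fits with the two models landing on the exact same score in some repeat runs.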
I just ran a short tool use test and it's doing pretty well.