Hacker News new | past | comments | ask | show | jobs | submit

Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks

https://github.com/antoinezambelli/forge
loading story #48200359
loading story #48204556
loading story #48198634
loading story #48201888
> One thing I really didn't expect: the serving backend matters. Same Mistral-Nemo 12B weights produce 7% accuracy on llama-server with native function calling and 83% on Llamafile in prompt mode.

I thought Llamafile was just a model and llama.cpp bundled in to a single binary - is this the difference between Llamafile injecting a default sysmtem prompt vs hitting the raw llama-server endpoint with no harness?

That seems like comparing apples to apple pie, there's some ingredients missing.

I was surprised as well. I did go with an extreme (but true) example in the post. In this case, native function-calling template likely is in play.

However, that doesn't explain the Lamaserver prompt vs llamafile at ~ +4pts, or vs Ollama (at ~ +30ish pts) that sits almost perfectly between llamaserver native and llamafile.

The backend affects almost all model families, and was just something I've never seen really talked about.

loading story #48202023
I wouldn't expect such difference
loading story #48204391
loading story #48203439
loading story #48199948
loading story #48208442
loading story #48204464
loading story #48200762
loading story #48199142
loading story #48198479
loading story #48200124
loading story #48201421
loading story #48200234
loading story #48208854
loading story #48208866
loading story #48206436
loading story #48200304
loading story #48198514
loading story #48200832
loading story #48203145
loading story #48203596
loading story #48210301
loading story #48203782
loading story #48201768
loading story #48205229
loading story #48200090
loading story #48203298
loading story #48200330
loading story #48197802
loading story #48199806
loading story #48200002
loading story #48198875
loading story #48199115
loading story #48202590
loading story #48204493
loading story #48203774
loading story #48200286
loading story #48203974
loading story #48198706
loading story #48249395
loading story #48203854
loading story #48203387
loading story #48211054
loading story #48200953
loading story #48203968
loading story #48205781
loading story #48204487
loading story #48217211
loading story #48202361
loading story #48201622
loading story #48202227
loading story #48201422
loading story #48203742
loading story #48202380
loading story #48203121
loading story #48206492
loading story #48205302
loading story #48202111
loading story #48250938
loading story #48238617
loading story #48238932
loading story #48204259
loading story #48209380
loading story #48235392
loading story #48203590
loading story #48202167
loading story #48213619
loading story #48223509
loading story #48211145
loading story #48206044
loading story #48221239
loading story #48209894
loading story #48205733
loading story #48209241
loading story #48204515
loading story #48204143
loading story #48209974
loading story #48222279
loading story #48222249
loading story #48206574
loading story #48217110
loading story #48201036
loading story #48199417