Unless I misunderstood, it seems like this is trailing the Pareto frontier in cost and speed.

Compared to providers like Fireworks, even with the OpenRouter 5% charge, it's not competitive.

Our SLA is actually higher and we are lower priced. We are also using this as a step toward serving fine-tuned models for much cheaper than Fireworks/Together, without the horrible cold starts of Modal. We're essentially trying to prove that our engine can hang with the best providers while multiplexing models.