DeepSeek-R1
https://github.com/deepseek-ai/DeepSeek-R1
Their parent hedge fund company isn't huge either: just 160 employees and $7B AUM, according to Wikipedia. If that were a US hedge fund, it would be the #180 largest in terms of AUM, so not small but nothing crazy either.
- function calling is broken (it responds with an excessive number of duplicate function calls and hallucinated names and parameters)
- response quality is poor (my use case is code generation)
- support is not responding
I will give the reasoning model a try, but my expectations are low.
PS: the positive side of this is that it apparently took some traffic off the Anthropic APIs, and latency for Sonnet/Haiku improved significantly.
There are various ways to run it with less VRAM if you're OK with much worse latency and throughput.
Edit: sorry, this is for V3; the distilled models can be run on consumer-grade GPUs.
- UI Generation: The generated UI failed to function due to errors in the JavaScript, and the overall user experience was poor.
- GitLab Postgres Schema Analysis: It identified only a few design patterns.
I am not sure whether these are suitable tasks for R1. I will try a larger variant as well.
1. https://shekhargulati.com/2025/01/19/how-good-are-llms-at-ge...
2. https://shekhargulati.com/2025/01/14/can-openai-o1-model-ana...