Kind of insane how a severely limited company founded 1 year ago competes with the practically infinite budget of OpenAI.

Their parent hedge fund isn't huge either: just 160 employees and $7B AUM, according to Wikipedia. If it were a US hedge fund, it would be roughly the #180 largest by AUM, so not small, but nothing crazy either.

That's the nature of software that has no moat built into it. Which is fantastic for the world, as long as some companies are willing to pay the premium involved in paving the way. But man, what a daunting prospect for developers and investors.
I'm not sure we should call it "fantastic"

The downsides begin at "a dystopia worse than anything 1984 imagined" and get worse from there.

That dystopia is far more likely in a world where the moat is so large that a single company can control all the LLMs.
That dystopia will come from an autocratic one party government with deeply entrenched interests in the tech oligarchy, not from really slick AI models.
The way it is going, we are all going to be busy with WW3 soon, so we won’t have much time to worry about that.
The moat is there, I think: the capital to train models and buy good data, and then the strings to pull to get it onto everyone's computer.

It's indeed very dystopian.

They’re probably training on outputs of existing models.
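"Training on the outputs of existing models" usually means some form of knowledge distillation: the student model is fit to the teacher's softened output distribution rather than to hard labels. A minimal sketch of that loss, with purely illustrative logits (nothing here is DeepSeek's or OpenAI's actual training code):

```python
import math

def softmax(logits, temperature=1.0):
    """Softened probability distribution; higher temperature -> flatter."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution
    and the student's, the core objective in distillation."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# The loss is lower when the student matches the teacher than when it doesn't:
matched = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
uniform = distillation_loss([0.0, 0.0, 0.0], [2.0, 0.5, -1.0])
assert matched < uniform
```

The temperature softens the teacher's distribution so the student also learns from the relative probabilities of the "wrong" answers, which carries more signal than a hard label.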
This is the reason I believe the new AI chip restriction that was just put in place will backfire.
It's pretty clear, because OpenAI has no clue what they are doing. If I were the CEO of OpenAI, I would have invested significantly in mitigating catastrophic forgetting and built a model capable of continual learning.

If you have a model that can learn as you go, then the concept of accuracy on a static benchmark would become meaningless, since a perfect continual learning model would memorize all the answers within a few passes and always achieve a 100% score on every question. The only relevant metrics would be sample efficiency and time to convergence. i.e. how quickly does the system learn?

Except it’s not really a fair comparison, since DeepSeek is able to take advantage of a lot of research pioneered by those companies with effectively infinite budgets, some of which have been working on this stuff for decades now.

The key insight is that those building foundational models and doing original research are always first, and then models like DeepSeek always appear 6 to 12 months later. The latest move towards reasoning models is a perfect example.

Or perhaps DeepSeek is also doing all their own original research and it’s just coincidence they end up with something similar yet always a little bit behind.
