
DeepSeek is not a “side project”. At the same time, employees are not lying when they say it is. The story they are telling is myth-making in the same vein as Silicon Valley's “we want to make the world a better place” while also making billions of dollars.

The team obviously:
- had access to more than ~10k GPUs (~50k according to the Scale AI CEO)
- was hiring only from the top 3 universities in China, meaning it was competitive with Alibaba and Tencent

These two facts alone mean that they were clearly commercially successful and well known enough to get access to both of those resources.

DeepSeek feels more to me like a skunkworks, perhaps a necessary one as the core quant business became less feasible regulatorily. It’s like Lockheed setting up a separate small team to compete with SpaceX because the main United Launch Alliance was not going to work out.

It’s also very hard to track costs in China, because the regional governments absorb so many of them.
- Early Bitcoin miners had free power because governments built power plants to nowhere and miners were willing to site next to them
- Alibaba was able to get regional governments to absorb warehouse construction costs on their balance sheets rather than pay for them directly, and so looked extremely asset-light and softwarey when it went public

It is perfectly possible for most of the costs to be parked on a balance sheet outside the core business, perhaps as some form of tech data center construction incentive. It is also possible that no one except the founder knows all the financial arrangements.
Some of these can be absolutely insane handshake deals which get resolved by reputation alone, so 🤷‍♀️

This much is clear:
- the model is really, really good, on par with the OpenAI release from 2 months ago
- having said that, unreleased models from OpenAI and Anthropic are (probably) better
- the research agenda is still being set by the US firms; this model was a fast follow on the o1 release
- they are working very fast, as they are catching up sooner than expected
- they are not copying or cheating; this isn’t industrial espionage. At most it is reverse engineering
- they are largely developing their own talent, not reliant on US-trained PhDs
- they are less constrained than the American firms by IP licensing, privacy, safety, and political concerns around wrongly ingesting data from people who don’t want to be trained on. Fewer lawsuits, fewer lawyers and less caution
- they also seem to be over the “Tiananmen Square” issue. The model can say it, even if the DeepSeek website doesn’t

Of these, the most significant upgrade is that they are able to develop talent internally without relying on US-trained PhDs. That expands the pool significantly.

What happens next?
