Bernstein Research’s recent report has tempered the hype surrounding DeepSeek, the Chinese startup whose low-cost AI assistant has caused a stir in the tech world and even impacted global markets. While acknowledging DeepSeek’s impressive capabilities, the report argues that claims of the AI being built for just $5 million are misleading and don’t represent the full picture of its development costs.
The Bernstein report directly addresses the narrative surrounding DeepSeek’s supposed budget: “We believe that DeepSeek DID NOT ‘build OpenAI for USD 5M’; the models look fantastic but we don’t think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown.” The report emphasizes that the “fantastic” results achieved by DeepSeek shouldn’t be mistaken for miracles, and that the market reaction, particularly the impact on Nvidia, was disproportionate to the reality of the situation.
What DeepSeek is not saying
DeepSeek has developed two primary AI model families: DeepSeek-V3 and DeepSeek-R1. The V3 model is a large language model (LLM) built on a Mixture-of-Experts (MoE) architecture. Bernstein’s analysis reveals that training the V3 model required significant computational resources: DeepSeek employed a cluster of 2,048 NVIDIA H800 GPUs for approximately two months. This translates to roughly 2.7 million GPU hours for pre-training and a total of 2.8 million GPU hours when post-training is included. GPU hours represent the cumulative time graphics processing units are actively engaged in processing tasks, highlighting the substantial investment in computational power.
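These reported figures can be sanity-checked with back-of-the-envelope arithmetic. The sketch below takes the cluster size and GPU-hour totals from the report; the $2-per-GPU-hour rental rate is an illustrative assumption of this article, not a figure from Bernstein, used only to show how a hardware-only cost estimate might be derived:

```python
# Back-of-the-envelope check of the reported DeepSeek-V3 training figures.
# Cluster size and GPU-hour totals come from Bernstein's summary;
# the rental rate is a hypothetical assumption for illustration.
NUM_GPUS = 2048                 # NVIDIA H800 cluster
PRETRAIN_GPU_HOURS = 2_700_000  # ~2.7M GPU hours, pre-training only
TOTAL_GPU_HOURS = 2_800_000     # ~2.8M GPU hours, incl. post-training
RENTAL_RATE_USD = 2.0           # assumed $/GPU-hour (hypothetical)

# GPU hours / number of GPUs = wall-clock hours; divide by 24 for days.
wall_clock_days = PRETRAIN_GPU_HOURS / NUM_GPUS / 24
hardware_only_cost = TOTAL_GPU_HOURS * RENTAL_RATE_USD

print(f"Pre-training wall-clock time: ~{wall_clock_days:.0f} days")
print(f"Hardware-only cost at ${RENTAL_RATE_USD:.0f}/GPU-hour: "
      f"~${hardware_only_cost / 1e6:.1f}M")
```

The pre-training duration works out to roughly 55 days, consistent with the “approximately two months” figure, and at the assumed rate the hardware-only cost lands in the neighborhood of the widely cited $5 million. That is precisely the gap Bernstein highlights: such a figure can, at best, cover compute rental, not the research, experimentation, and personnel behind the model.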
The report underscores that the widely cited “$5 million” figure fails to account for the extensive research, development, experimentation, personnel costs, and other associated expenses inherent in developing such a complex model. It’s not simply a matter of hardware costs; the figure omits the significant investment in talent and time required for AI research and development.
The unveiling of DeepSeek’s assistant had a dramatic effect on the market, particularly on AI chip leader Nvidia. On January 27th, Nvidia suffered a record single-day loss in market value, close to $593 billion, driven by fears that Chinese AI models like DeepSeek’s could gain a competitive edge over more expensive American counterparts. Bernstein’s report suggests this sell-off was an overreaction.
While the V3 model’s training resources were detailed, the report notes that the R1 model, which was considered particularly impressive, lacked similar quantification in DeepSeek’s research paper. This raises questions about the true cost and resources involved in developing the R1 model.
In conclusion, Bernstein acknowledges DeepSeek’s achievements as “fantastic” but argues that the claim of building an OpenAI-level competitor for $5 million is “exaggerated” and “overblown.” The report provides a more nuanced perspective, highlighting the substantial computational resources and likely hidden costs involved in developing advanced AI models like DeepSeek’s, and suggesting that the market’s reaction to the company’s emergence may have been disproportionate.