Chinese artificial intelligence firm DeepSeek has disclosed the cost of training its R1 model, a figure that is far below the amounts reported by major US technology companies. The Hangzhou-based company said in a peer-reviewed paper published in Nature that it spent just 294,000 dollars to train the reasoning-focused model, a disclosure that has reignited debate over the soaring costs of AI development.
DeepSeek’s R1 model, launched earlier this year, gained global attention not only for its cost efficiency but also because it was made open source, allowing unlimited free use. Its release was viewed as a direct challenge to AI models from OpenAI, Google and Anthropic, and it triggered concerns among US investors that cheaper alternatives could disrupt the dominance of American technology leaders. In January, the company’s announcement led to a sell-off in some tech stocks, particularly those linked to AI hardware providers.
The newly published estimate marks the first time DeepSeek has provided a clear breakdown of its training expenses. According to the paper, the R1 model was trained using 512 Nvidia H800 chips. The 294,000-dollar figure contrasts sharply with previous comments from OpenAI CEO Sam Altman, who said in 2023 that training a foundational model had cost his firm “much more” than 100 million dollars, though no exact figure was disclosed.
The disclosure has also brought renewed focus to DeepSeek’s access to advanced chips. The company acknowledged in supplementary documents that it had used some Nvidia A100 chips in the early stages of development, despite US export restrictions imposed in October 2022 that barred the sale of A100 and the more powerful H100 chips to China. DeepSeek stated that its main training relied on H800 chips, which are not covered by the ban.
US officials have previously alleged that DeepSeek managed to secure “large volumes” of H100 chips despite restrictions, a claim Nvidia has denied. The chipmaker has maintained that DeepSeek only used lawfully obtained H800 processors.
The publication of the R1 model’s costs could reshape discussions in the global AI industry, raising questions about why US companies spend hundreds of millions of dollars on training large language models while a Chinese competitor claims to achieve results at a fraction of the cost.