Chinese artificial intelligence firm DeepSeek has disclosed the cost of training its R1 model, a figure that is far below the amounts reported by major US technology companies. The Hangzhou-based company said in a peer-reviewed paper published in Nature that it spent just 294,000 dollars to train the reasoning-focused model, a disclosure that has reignited debate over the soaring costs of AI development.
DeepSeek’s R1 model, launched earlier this year, gained global attention not only for its cost efficiency but also because it was made open source, allowing unlimited free use. Its release was viewed as a direct challenge to AI models from OpenAI, Google and Anthropic, and it triggered concerns among US investors that cheaper alternatives could disrupt the dominance of American technology leaders. In January, the company’s announcement led to a sell-off in some tech stocks, particularly those linked to AI hardware providers.
The newly published estimate marks the first time DeepSeek has provided a clear breakdown of its training expenses. According to the paper, the R1 model was trained using 512 Nvidia H800 chips. The 294,000-dollar figure contrasts sharply with previous comments from OpenAI CEO Sam Altman, who said in 2023 that training a foundational model had cost his firm “much more” than 100 million dollars, though no exact figure was disclosed.
The disclosure has also brought renewed focus to DeepSeek’s access to advanced chips. The company acknowledged in supplementary documents that it had used some Nvidia A100 chips in the early stages of development, despite US export restrictions imposed in October 2022 that barred the sale of A100 and the more powerful H100 chips to China. DeepSeek stated that its main training relied on H800 chips, which are not covered by the ban.
US officials have previously alleged that DeepSeek managed to secure “large volumes” of H100 chips despite restrictions, a claim Nvidia has denied. The chipmaker has maintained that DeepSeek only used lawfully obtained H800 processors.
The publication of the R1 model’s costs could reshape discussions in the global AI industry, raising questions about why US companies spend hundreds of millions of dollars on training large language models while a Chinese competitor claims to achieve results at a fraction of the cost.