Google has introduced VaultGemma, a new AI model designed with privacy-preserving techniques to keep training data confidential.
VaultGemma is a small language model (SLM) with one billion parameters, described as the largest model yet trained with differential privacy (DP). It was developed by Google Research in collaboration with Google DeepMind, using a newly derived set of scaling laws for DP training.
The model weights are available for free on Hugging Face and Kaggle. In a blog post on September 12, Google said, “VaultGemma represents a significant step forward in the journey toward building AI that is both powerful and private by design. By developing and applying a new, robust understanding of the scaling laws for DP, we have successfully trained and released the largest open, DP-trained language model to date.”
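For readers who want to try the model, the released weights can be loaded like any other open checkpoint. The snippet below is a minimal sketch using the Hugging Face Transformers library; the model identifier shown is an assumption made for illustration and should be checked against the official Hugging Face or Kaggle listing.

```python
# Minimal sketch of loading the released weights with Hugging Face Transformers.
# "google/vaultgemma-1b" is an assumed identifier, used here for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/vaultgemma-1b"  # assumed; confirm on Hugging Face or Kaggle
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Differential privacy is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```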
Focus on Privacy and Data Protection
Data privacy remains a major challenge in AI development. Large language models such as ChatGPT and Gemini have raised concerns about memorising their training data, since personal or proprietary information can sometimes be retrieved from them through carefully designed prompts. For example, in an ongoing lawsuit against OpenAI, a media organisation alleged that ChatGPT reproduced its articles word for word.
Unlike approaches that add user-level protections only during fine-tuning, Google said, VaultGemma integrates differential privacy at the pre-training stage. Calibrated noise added during training prevents the model from memorising or reproducing individual training examples.
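In practice, this kind of training-time privacy is usually implemented with DP-SGD, which clips each example's gradient and adds calibrated Gaussian noise before every weight update. The sketch below illustrates that general mechanism in PyTorch; it is not Google's training code, and the clipping norm and noise multiplier are illustrative values.

```python
import torch
import torch.nn.functional as F

def dp_sgd_step(model, loss_fn, batch_x, batch_y, lr=0.1,
                clip_norm=1.0, noise_multiplier=1.1):
    """One illustrative DP-SGD step: clip each example's gradient,
    sum, add calibrated Gaussian noise, then update the weights.
    Hyperparameter values here are assumptions, not Google's settings."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Per-example gradients, clipped to a fixed L2 norm before summing.
    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s += g * scale

    # Noise scaled to the clipping norm makes any single example's
    # contribution statistically hard to distinguish.
    batch_size = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.normal(0.0, noise_multiplier * clip_norm, size=p.shape)
            p -= lr * (s + noise) / batch_size

# Toy usage on random data (illustrative only).
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))
dp_sgd_step(model, F.cross_entropy, x, y)
```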
Balancing Trade-offs
Applying differential privacy to large models is complex and comes with challenges such as reduced training stability, larger batch size requirements, and higher computational costs. To address this, Google developed new scaling laws that guide training configurations while balancing computing needs, privacy, and performance.
According to Google, “A key finding is that one should train a much smaller model with a much larger batch size than would be used without DP.”
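The intuition is straightforward: the Gaussian noise is added to the sum of clipped per-example gradients, so its effect on the averaged gradient shrinks as the batch grows. A rough back-of-the-envelope illustration, with assumed values:

```python
# Back-of-the-envelope: noise is added to the summed clipped gradients, so its
# standard deviation on the *averaged* gradient falls as 1 / batch_size.
clip_norm = 1.0          # assumed per-example clipping norm
noise_multiplier = 1.1   # assumed DP noise multiplier

for batch_size in (1_024, 16_384, 262_144):
    noise_std_on_mean = noise_multiplier * clip_norm / batch_size
    print(f"batch {batch_size:>7,}: noise std on averaged gradient ~ {noise_std_on_mean:.1e}")
```

Under this arithmetic, a larger batch dilutes the per-step noise, which is why the DP scaling laws push toward smaller models trained with much larger batches.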
Performance Benchmarks
VaultGemma achieved performance levels comparable to an older GPT-2 model of similar size across several standard academic benchmarks, including HellaSwag, BoolQ, PIQA, SocialIQA, TriviaQA, ARC-C, and ARC-E.
In privacy testing, Google prompted VaultGemma with partial text from training documents, and the model did not reproduce the corresponding continuations. Google noted, however, that the guarantee applies at the level of individual training sequences: if several sequences contain related information, the model can still learn and generate that fact.
The company emphasised that further research on differential privacy is required to narrow the gap between DP-trained models and non-DP-trained models in terms of utility.