Monday, February 9, 2026

Sarvam introduces Bulbul V3 to power natural and scalable Indian language voice AI

Bulbul V3 is setting a new standard for voice technology across India’s diverse languages and accents. Sarvam has announced the release of its most advanced text-to-speech model yet, designed to deliver natural, expressive and production-ready speech for real-world use.

Voice has become central to how people in India interact with technology. Gig workers onboard through voice agents without forms or apps, students learn through AI tutors in their native languages, banks handle massive customer volumes through voice systems, and gamers engage with characters that speak back in familiar tongues. These use cases demand voices that can handle code-mixing, regional accents, names, abbreviations and emotional nuance without errors. Bulbul V3 is designed to meet these needs at scale.

The model was tested in an independent, third-party blind A/B human listening study across 11 Indian languages. The evaluation focused on naturalness, robustness and stability, the three qualities most critical for production speech systems. Bulbul V3 showed high listener preference at 48 kHz and emerged as the most preferred model in 8 kHz telephony conditions. It also delivered low character error rates on challenging inputs such as numerics and code-mixed text, while maintaining strong stability, with minimal word skips and mispronunciations even in long-form and high-volume use.
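For readers unfamiliar with the metric, character error rate (CER) is conventionally the character-level edit distance between a reference text and a transcript, divided by the reference length. The article does not describe how the study computed it; the sketch below assumes the standard definition, and the function names and sample strings are purely illustrative.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum character insertions, deletions and substitutions."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative only: one wrong digit in a four-digit number.
print(character_error_rate("1234", "1284"))
```

In a text-to-speech evaluation, the hypothesis would typically be a transcript of the synthesized audio, which is why a single misread digit in a numeric input shows up directly in the score.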

The study involved 50 to 70 annotators per language and generated around 2,000 votes per language, adding up to more than 20,000 votes from over 500 listeners. Bulbul V3 outperformed several global competitors in telephony scenarios and surpassed others in overall stability. It also introduces a low-latency streaming mode for near-real-time conversations, making it suitable for live and interactive applications.
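The reported study totals can be sanity-checked with simple arithmetic. The sketch below uses only the figures quoted above and assumes, as a simplification, that each annotator worked on a single language; the variable names are illustrative.

```python
# Figures as quoted in the article (votes per language is approximate).
LANGUAGES = 11
VOTES_PER_LANGUAGE = 2000
ANNOTATORS_PER_LANGUAGE = (50, 70)  # reported range

# Total votes across all languages.
total_votes = LANGUAGES * VOTES_PER_LANGUAGE

# Listener range, assuming no annotator covered more than one language.
listener_range = tuple(n * LANGUAGES for n in ANNOTATORS_PER_LANGUAGE)

# Average votes per annotator within one language, at both ends of the range.
votes_per_annotator = tuple(VOTES_PER_LANGUAGE / n for n in ANNOTATORS_PER_LANGUAGE)

print(total_votes)          # 22000
print(listener_range)       # (550, 770)
print(votes_per_annotator)  # 40.0 at 50 annotators, about 28.6 at 70
```

Both results are consistent with the stated "more than 20,000 votes" and "over 500 listeners", and they imply each annotator cast roughly 29 to 40 votes.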

Bulbul V3 comes with over 35 professional-grade voices across 11 Indian languages, with plans to expand to 22 languages. It supports voice cloning for custom and brand-specific voices and is available through APIs and a no-code dashboard. Builders are being offered unlimited access until 28 February 2026, opening the door for wider experimentation across education, healthcare, gaming and enterprise voice systems.
