Grab, the Singapore-based superapp company, has developed its own artificial intelligence (AI) model to better understand Southeast Asian languages. The company said it created an in-house vision large language model (LLM) after finding that both proprietary and open-source systems failed to deliver the accuracy and efficiency it required.
The new lightweight AI model is designed to scan documents and extract information, enhancing internal operations across the company. Grab explained that its decision to build the model internally stemmed from persistent issues with existing AI tools, which struggled with regional languages and dialects.
AI models struggle with non-English languages
In a detailed blog post, Grab said it tested several external options before deciding to build its own system. “While powerful proprietary Large Language Models (LLMs) were an option, they often fell short in understanding SEA languages, produced errors, hallucinations, and had high latency. On the other hand, open-sourced Vision LLMs were more efficient but not accurate enough for production,” the company stated.
The challenge faced by Grab reflects a broader issue across the AI industry. Many AI systems continue to perform poorly in non-English languages, despite advancements in multilingual models. Researchers have long pointed out that while major models show some competence in popular languages such as Hindi, Japanese, Spanish, and Chinese, they still fail to grasp deeper linguistic nuances. This limits their usefulness for enterprise and research-based applications.
A study published earlier this year also found that AI models developed by Chinese companies struggle with minority languages within China, much as Western models do. Proprietary models from major players such as Google, OpenAI, Meta, and Anthropic and open-source models alike continue to face these challenges.
One of the main reasons behind this problem is the shortage of large, high-quality datasets in regional and minority languages. To address this, several tech giants are investing in Indic language data, in some cases through partnerships with Indian institutions. For instance, Google has teamed up with IIT Bombay to develop speech models, Meta is reportedly paying contractors to train its systems in Hindi, and OpenAI has launched a research collaboration with IIT Madras, backed by $500,000 in funding.
While collecting such data remains costly, it is expected to improve model performance in major Asian languages over time. However, minority and non-scheduled Indian languages are likely to remain difficult for AI systems to master, limiting accessibility and inclusivity in global AI applications.



