Sarvam AI’s Vision OCR and Bulbul V3 put India on global AI map, impress experts
- In Reports
- 07:07 PM, Feb 09, 2026
- Myind Staff
India is finally getting global attention in the field of artificial intelligence, as Bengaluru-based startup Sarvam AI has launched new AI tools that are being called world-class for India-specific tasks. The company has introduced an OCR model called Sarvam Vision and a voice generation model called Bulbul V3, both of which are generating strong buzz for their performance in Indian languages.
While the US and China usually dominate AI development, India has rarely been viewed as a major player in core AI model building. Sarvam AI, however, is trying to change this perception by building what it calls a “sovereign AI”, meaning foundational AI models created from scratch within India.
This week, Sarvam’s latest tools have attracted global attention, with users and experts praising their performance.
Sarvam Vision is an AI tool focused on optical character recognition (OCR). According to Sarvam AI, it outperforms major global AI models such as ChatGPT, Google Gemini, and Anthropic Claude on certain OCR benchmarks.
Sarvam AI co-founder Pratyush Kumar shared details of the model’s achievements in a series of posts on X.
The company stated that Sarvam Vision scored 84.3 per cent on olmOCR-Bench, a benchmark designed to test OCR performance. This score is reportedly higher than those of Gemini 3 Pro and other OCR tools such as DeepSeek OCR v2, while ChatGPT ranked significantly lower.
Sarvam Vision also performed strongly on another benchmark, OmniDocBench v1.5, which measures how well AI systems can read and understand real-world documents. It achieved an overall score of 93.28 per cent on this benchmark.
The model performed especially well in areas that traditional OCR tools often struggle with, such as complex layouts, technical tables, and mathematical formulas. These document formats are usually difficult for OCR systems because they involve messy formatting and dense content.
Sarvam’s strong performance has led to a shift in how people view the company. Earlier, Sarvam was questioned for focusing on Indic-language AI models. But now, the results are turning that scepticism into approval.
Tech commentator Deedy Das, who had earlier questioned the company’s approach, admitted he had underestimated Sarvam. In a post on X, he praised Sarvam’s progress and the value of its work in Indian languages.
“I was wrong about Sarvam. When I wrote about them a year ago, I felt like the direction to train small Indic language models was wrong. But boy, have they turned it around,” he wrote.
He further added, “They have the best text-to-speech, speech-to-text, and OCR models for Indic languages, and that's actually really valuable. The pricing is very reasonable.”
Sarvam’s work is being seen as important because major global AI labs have largely ignored Indic-language-specific needs, and Sarvam is now filling that gap.
Apart from experts, users are also sharing positive feedback. One user wrote about using Sarvam’s models and said, “I used this a couple of days ago! Oh man wow.”
Such reactions are helping Sarvam gain stronger recognition, not only within India but also globally.
Along with Sarvam Vision, the company has also launched Bulbul V3, a new text-to-speech AI model. Bulbul V3 is designed to generate natural-sounding AI voices, similar to those offered by popular voice AI platforms such as ElevenLabs.
Sarvam described the model in its blog post, saying, “Today we're releasing Bulbul V3, our most capable text-to-speech model designed to deliver natural, expressive and production-ready voices for Indian languages.”
The company further explained the key improvement of the model, stating, “Bulbul V3 minimizes failure modes, delivering content-accurate, stable speech across the inputs that matter for India-specific use cases.”
Currently, Bulbul V3 supports more than 35 voices across 11 Indian languages. Sarvam also plans to expand the model’s language support in the future, aiming to cover 22 languages in total.
Bulbul V3 is also receiving praise from users and industry experts. Pratik Desai, founder of KissanAI, shared his opinion on X and said, “We use Bulbul as our go-to tts model for our Indic use cases, and they have just gotten better with each release. Meanwhile, ElevenLabs cost never made sense for Indic or any other languages.”
This statement reflects how Bulbul is being viewed as a practical and affordable option for Indian language voice generation, especially when compared to international tools.
With Sarvam Vision showing strong results in OCR benchmarks and Bulbul V3 offering high-quality AI voice generation, Sarvam AI is emerging as a major name in India’s AI ecosystem. The company’s success is also challenging the idea that India cannot build foundational AI tools at a global level.
Sarvam’s work is being recognised as important because it focuses on Indian languages and real India-specific use cases, which are often overlooked by large international AI companies.
As global interest grows, Sarvam AI is now being seen as a company that could help India become a serious player in the global AI race.