Indian AI startup Sarvam has open-sourced two reasoning models, Sarvam 30B and Sarvam 105B. Trained from scratch in India on government-supported compute, the models excel in Indian languages while also performing well on global benchmarks.
The models use a Mixture-of-Experts (MoE) architecture, which activates only a subset of expert parameters per token for computational efficiency; the 30B model is designed for practical deployment, while the 105B flagship targets enterprise-grade applications and complex reasoning. Open-sourcing the models aims to foster transparency, lower barriers to AI development, and accelerate innovation.
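To make the efficiency point concrete, below is a minimal sketch of top-k MoE routing in PyTorch. This is purely illustrative and not Sarvam's implementation: the expert count, layer sizes, and top-k value are hypothetical, chosen only to show why an MoE model runs a fraction of its total parameters for each token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal MoE layer: a router selects top-k experts per token,
    so only a fraction of total parameters is active at a time."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int, k: int):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its top-k experts.
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)        # normalized mixing weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                       # tokens routed to expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            # Each selected token gets expert e's output, scaled by its weight.
            out[token_ids] += (weights[token_ids, slot].unsqueeze(-1)
                               * expert(x[token_ids]))
        return out

# Hypothetical example: 8 experts, 2 active per token, so roughly a
# quarter of the expert compute is spent on any given token.
layer = TopKMoE(d_model=64, d_ff=256, n_experts=8, k=2)
y = layer(torch.randn(10, 64))
print(y.shape)  # torch.Size([10, 64])
```

The design trade-off this illustrates: total parameter count (and thus capacity) scales with the number of experts, while per-token compute scales only with k, which is how a large MoE model can stay practical to serve.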
The release is part of a full-stack AI effort: the startup has built its own end-to-end training and deployment pipeline. The models already power the company's in-house offerings, the Samvaad conversational platform and the Indus AI assistant.
Main topics: Sarvam's model release, model specifications and architecture, the significance of open-sourcing, training and development context, and intended applications.