Saturday, April 11, 2026, 2:00 PM - 2:45 PM

Inside Sarvam's 105B Journey: Scaling LLMs from Pretraining to Multimodality

A technical deep-dive by the Sarvam engineering team on how they scaled large language models from early pretraining to a 105B Mixture-of-Experts system. The session covers key challenges across large-scale pretraining, reinforcement learning, and extending models to multimodal capabilities, including vision and speech — with a focus on real engineering trade-offs and lessons learned.

Speakers

Aditya Mehndiratta, Senior ML Engineer, Sarvam AI
Harsh Maheshwari, ML Researcher, Sarvam AI
Manav Singhal, ML Engineer, Sarvam AI
Sumanth Doddapaneni, ML Researcher, Sarvam AI