Indigenous Large Language Models and India’s Sovereign AI Push

  • 27 Feb 2026

In News:

At the India-AI Impact Summit 2026, Bengaluru-based startup Sarvam AI unveiled two indigenous Large Language Models (LLMs) with 35 billion and 105 billion parameters. These models are designed to be less power- and compute-intensive, while demonstrating improved performance in Indian languages. The development marks a significant milestone in India’s quest for sovereign and cost-efficient AI systems aligned with domestic needs.

Understanding Large Language Models (LLMs)

A Large Language Model (LLM) is an AI system built using transformer-based neural networks trained on massive text datasets to understand and generate human language.

They contain billions of parameters—internal variables learned during training—that help the model predict the next word in a sequence and generate coherent text.

How LLMs Work

  1. Tokenisation: Text is broken into smaller units called tokens (word pieces or characters).
  2. Embeddings & Transformer Architecture: Tokens are converted into numerical vectors. The self-attention mechanism helps the model determine which words in a sentence are contextually important, even if they are far apart.
  3. Next-Token Prediction: The model generates language by predicting one token at a time based on probability distributions.
  4. Layered Learning: Multiple transformer layers refine linguistic and semantic understanding—from grammar to reasoning patterns.
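The core of next-token prediction can be sketched in a few lines: the model assigns a score (logit) to every token in its vocabulary, softmax converts the scores into a probability distribution, and the most probable token is emitted. The vocabulary and logits below are invented purely for illustration.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and model scores for the context
# "The capital of India is ..."
vocab = ["Delhi", "Mumbai", "cricket", "monsoon"]
logits = [3.1, 2.4, 0.2, -1.0]

probs = softmax(logits)                      # probabilities sum to 1
next_token = vocab[probs.index(max(probs))]  # greedy decoding: pick the argmax
print(next_token)  # "Delhi" — highest logit, hence highest probability
```

In practice the model samples from this distribution rather than always taking the argmax, which is what makes generated text varied rather than deterministic.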

Stages of Training LLMs

1. Data Collection & Pre-processing

  • Massive datasets sourced from books, websites, code repositories, etc.
  • Cleaning to remove bias, spam, duplicates, and harmful content.
  • Quality of data directly influences performance.

2. Pre-training (Self-Supervised Learning)

  • Model learns via next-token prediction.
  • Produces a base model capable of understanding grammar, facts, and reasoning.
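The key property of self-supervised pre-training is that the "labels" are simply the next tokens in the text itself, so no human annotation is needed. A toy bigram count table (standing in for the model's learned parameters) illustrates the idea:

```python
from collections import Counter, defaultdict

# Toy "pre-training" corpus; real models train on trillions of tokens.
corpus = "the model predicts the next token the model learns".split()

# Count which token follows which — the text supervises itself.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    # Most frequently observed continuation of `token` in the corpus.
    return counts[token].most_common(1)[0][0]

print(predict_next("the"))  # "model" follows "the" twice vs "next" once
```

Real LLMs replace the count table with billions of learned parameters, but the training signal is the same: predict the next token, compare against the token that actually appears, and adjust.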

3. Supervised Fine-Tuning

  • Trained on curated prompt–response pairs.
  • Enhances instruction-following ability and task performance (summarization, translation, Q&A).
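Fine-tuning data is typically organised as curated prompt–response pairs rendered into a single training string with instruction/response markers. The template below is a generic assumption for illustration, not the format of any particular model.

```python
# Hypothetical curated prompt–response pairs for supervised fine-tuning.
pairs = [
    {"prompt": "Translate to Hindi: Hello", "response": "नमस्ते"},
    {"prompt": "Summarise: LLMs predict tokens.", "response": "LLMs generate text token by token."},
]

def format_example(pair):
    # Render one pair into the text the model is trained to complete.
    return f"### Instruction:\n{pair['prompt']}\n### Response:\n{pair['response']}"

sample = format_example(pairs[0])
print(sample.startswith("### Instruction:"))  # True
```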

4. Alignment via RLHF

  • Reinforcement Learning from Human Feedback (RLHF).
  • Humans rank outputs based on safety and quality.
  • A reward model optimises responses to align with human values.
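The reward model described above is commonly trained with a pairwise ranking loss: given a human preference between two responses, the model is pushed to score the chosen one higher, via -log(sigmoid(r_chosen - r_rejected)). The scores below are invented for illustration.

```python
import math

def pairwise_ranking_loss(r_chosen, r_rejected):
    # -log(sigmoid(r_chosen - r_rejected)): small when the reward model
    # already scores the human-preferred response higher, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Reward model agrees with the human ranking -> small loss:
low = pairwise_ranking_loss(2.0, 0.5)
# Reward model disagrees -> large loss, driving a parameter update:
high = pairwise_ranking_loss(0.5, 2.0)
print(low < high)  # True
```

Once trained, the reward model's scores serve as the optimisation signal that steers the LLM's outputs toward human-preferred behaviour.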

Challenges in Training LLMs in India

  • Data Scarcity in Indian Languages
    • English and East Asian languages dominate internet data.
    • Indian languages remain underrepresented.
    • Many models rely on translation into English, increasing token consumption and cost.
  • High Capital and Compute Costs
    • Requires clusters of Graphics Processing Units (GPUs).
    • Training costs run into millions of dollars.
    • Limited domestic venture capital for foundational AI research.
  • Limited Immediate Commercial Use Cases: Training large models without clear monetisation pathways deters investment.
  • Infrastructure Constraints
    • Dependence on imported high-end chips.
    • Energy-intensive training processes.

Innovation: Mixture of Experts (MoE) Architecture

Earlier LLMs activated all parameters during inference, making them computationally expensive.

The Mixture of Experts (MoE) architecture activates only a subset of parameters (“experts”) for each query.

Advantages:

  • Reduced computational load
  • Faster inference
  • Lower electricity consumption
  • Cost-efficient deployment in resource-constrained settings

Sarvam’s 105B parameter model leverages MoE to balance performance with efficiency, focusing on accuracy and Indian context alignment rather than sheer scale.
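The MoE routing idea can be sketched as follows. This is an illustrative toy, not Sarvam's actual design: a router scores every expert for the input, but only the top-k experts are actually executed, so most parameters stay inactive for any given query.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

# Hypothetical experts: each is just a simple function of the input here;
# in a real MoE layer each would be a full feed-forward sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * 10]

def moe_forward(x, router_logits, k=2):
    weights = softmax(router_logits)
    # Pick the k highest-weight experts for this input.
    top_k = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)[:k]
    # Only the selected experts run; the rest are skipped entirely,
    # which is where the compute and energy savings come from.
    return sum(weights[i] * experts[i](x) for i in top_k), top_k

output, active = moe_forward(5.0, router_logits=[2.0, 1.5, -1.0, 0.1], k=2)
print(active)  # only 2 of the 4 experts were activated for this query
```

With k fixed, the per-query cost stays roughly constant even as the total parameter count (number of experts) grows, which is how MoE models scale capacity without scaling inference cost.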

IndiaAI Mission: Government Support for Domestic AI

Launched in March 2024 with an outlay of ₹10,372 crore, the IndiaAI Mission aims to build a comprehensive AI ecosystem.

Key Components:

  • Compute Infrastructure
    • Over 36,000 GPUs commissioned in Indian data centres.
    • Additional 20,000 GPUs being added.
    • Target: 100,000 GPUs by end of 2026.
  • Subsidised Access
    • Sarvam AI granted 4,096 GPUs from a common compute cluster.
    • Subsidy estimated at nearly ₹100 crore.
    • Cluster cost approximately ₹246 crore.
  • Support for Innovation
    • Promotion of sovereign foundational models trained on Indian datasets.
    • Financial support covering compute and training costs.
    • Encouragement of open-source innovation.
  • Talent Development
    • Training support for over 13,500 students.
    • Establishment of India Data and AI Labs.

Other Indian Efforts

  • BharatGen (IIT Bombay-incubated): Multilingual 17B parameter model, targeted at sectors like education and healthcare.
  • Gnani.ai: Small text-to-speech model.
  • Indigenous focus on domain-specific and language-specific AI models.