AI research lab Lossfunk, started by software company Wingify's cofounder Paras Chopra, has developed a method that enables large language models to generate text in Tulu, a coastal Karnataka language spoken by around two million people, without any prior training in the language.
Chopra unveiled the research on X, saying the team made LLMs "speak Tulu" by applying negative constraints that explicitly list words the model should avoid, which significantly improved the output.
The approach achieved nearly 85% grammatical accuracy, despite the models not being trained on Tulu data, Chopra said.
The development could have wider implications for AI adoption in a linguistically diverse country like India, where many regional languages remain underrepresented in global AI systems.
In AI development, Tulu is considered a low-resource language, with minimal online presence and almost no training data. As a result, models tend to default to more dominant regional languages such as Kannada.
To address this, Lossfunk researchers designed a five-layer prompt, roughly 2,800 tokens long, guiding the model step by step. Initial tests showed only 18% grammatical accuracy, with 80% contamination from Kannada.
After adding Tulu grammar rules, a list of forbidden Kannada words, and a self-verification checklist, accuracy rose to 85%, while contamination dropped to 5%. The structured prompt delivered strong results across multiple models, including Gemini 2.0 Flash (85%), GPT-4o (82%), and Llama 3.1 70B (78%).
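The layered structure described above can be sketched as a simple prompt builder. This is a hypothetical illustration, not Lossfunk's actual prompt: the layer ordering, wording, and the placeholder grammar rules, forbidden words, and examples are all assumptions.

```python
# Hypothetical sketch of a layered prompt builder for a low-resource
# language, loosely following the five-layer structure described above.
# All layer contents below are illustrative placeholders.

def build_layered_prompt(task, grammar_rules, forbidden_words, examples):
    """Assemble a multi-layer prompt: role, grammar rules, negative
    constraints, few-shot examples, and a self-verification checklist."""
    layers = [
        "You are a fluent speaker of Tulu. Respond only in Tulu.",
        "Grammar rules to follow:\n" + "\n".join(f"- {r}" for r in grammar_rules),
        # Negative constraints: explicitly list words to avoid, so the
        # model does not fall back to a dominant related language.
        "Never use these Kannada words:\n" + ", ".join(forbidden_words),
        "Examples:\n" + "\n".join(examples),
        ("Before answering, verify: (1) every word is Tulu, "
         "(2) no forbidden word appears, (3) all grammar rules are followed."),
    ]
    return "\n\n".join(layers) + "\n\nTask: " + task

prompt = build_layered_prompt(
    task="Translate 'good morning' into Tulu.",
    grammar_rules=["<rule 1>", "<rule 2>"],          # placeholders
    forbidden_words=["<kannada-word-1>", "<kannada-word-2>"],
    examples=["<English sentence> -> <Tulu sentence>"],
)
```

The negative-constraint layer is the part the article highlights: rather than only telling the model what to produce, the prompt enumerates what to suppress, which reportedly cut Kannada contamination from 80% to 5%.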
Accuracy fell by nearly 50 percentage points when researchers intentionally replaced correct grammar rules with incorrect ones, suggesting the models were applying the linguistic structure rather than memorising examples. Evaluation by three native Tulu speakers produced an agreement score of 0.72.
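The article does not name the agreement statistic behind the 0.72 figure. For three or more raters, Fleiss' kappa is one common choice; the sketch below shows how such a score is computed on invented data, purely for illustration.

```python
# Hedged sketch: Fleiss' kappa, a standard agreement statistic for
# three or more raters. The vote data below is invented.

def fleiss_kappa(ratings):
    """ratings: per-item category counts, e.g. [2, 1] means two raters
    chose category 0 and one chose category 1. Rater count must be
    constant across items."""
    n = sum(ratings[0])   # raters per item
    N = len(ratings)      # number of items
    k = len(ratings[0])   # number of categories
    # mean per-item agreement
    P_bar = sum((sum(c * c for c in row) - n) / (n * (n - 1))
                for row in ratings) / N
    # chance agreement from marginal category proportions
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# three raters labelling four outputs as grammatical (0) or not (1)
votes = [[3, 0], [3, 0], [2, 1], [0, 3]]
print(round(fleiss_kappa(votes), 3))  # -> 0.625
```

A value of 0.72, on conventional interpretation scales, indicates substantial agreement among the raters.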
The method could serve as a template for bringing other low-resource Indian languages into AI systems without requiring expensive data collection or specialised model training. Most AI models used in India today are primarily trained on languages such as Hindi, Marathi, Kannada, Tamil and Malayalam.
Lossfunk argues that prompt engineering alone may be enough to push LLMs to reason in languages they were never trained on, or for which training data is scarce.
Chopra founded the Bengaluru-based research lab after exiting software firm Wingify in January 2025.
Speaking at the ET AI Awards 2025 in February, he highlighted India's infrastructure gap and called on companies to invest more in foundational research. "I have no idea why large companies in India give dividends instead of doing the kind of research that used to happen," he said. Founders who have had successful exits, he added, should reinvest in deeper scientific work. "That level zero is what's needed."
Chopra also stressed the need for long-term research and a shift in mindset among companies, founders and corporate leaders to rethink how India approaches innovation.