IndicSeamless is a multilingual, sequence-to-sequence pre-trained model based on SeamlessM4T-v2 and fine-tuned on the BhasaAnuvaad dataset.
IndicSeamless is a multilingual, sequence-to-sequence pre-trained model that builds on Meta’s SeamlessM4T-v2 architecture and is fine-tuned on AI4Bharat’s large BhasaAnuvaad corpus for speech-to-text (STT) across 13 Indian languages and English. It retains SeamlessM4T-v2’s unified handling of multiple modalities and languages while specializing in Indic speech data.
Creative Commons Attribution Non Commercial 4.0
Sparsh Jain, Ashwin Sankar, Devilal Choudhary, Dhairya Suman, Nikhil Narasimhan, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Mitesh M Khapra, Raj Dabre
Automatic Speech Recognition
N.A.
Open
Sector Agnostic
02/05/25 11:01:11