
IndicVoices-R is a large-scale, multilingual, multi-speaker speech dataset for Text-to-Speech (TTS) research in Indian languages.
IndicVoices-R covers 22 Indian languages and includes over 1,700 hours of high-quality, spontaneous speech from more than 10,000 speakers. The recordings are processed to enhance speech clarity and remove background noise, which makes the dataset well suited to training TTS models and evaluating speaker generalization. It supports evaluation in zero-shot, few-shot, and many-shot settings for robust TTS model development.
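As an illustration of the zero-shot, few-shot, and many-shot settings mentioned above, the following is a minimal sketch of how test speakers could be grouped by how much of their speech appears in the training data. The function name `bucket_speakers`, the `few_shot_max` threshold of 10 utterances, and the speaker IDs are all hypothetical and are not taken from the dataset's documentation; the actual benchmark may define these settings differently.

```python
from collections import Counter

# Hypothetical cut-off: speakers with 1..FEW_SHOT_MAX training utterances
# count as "few-shot". This value is an assumption, not a dataset constant.
FEW_SHOT_MAX = 10


def bucket_speakers(train_speaker_ids, test_speaker_ids, few_shot_max=FEW_SHOT_MAX):
    """Group test speakers by how many training utterances they have.

    train_speaker_ids: one speaker ID per training utterance.
    test_speaker_ids: one speaker ID per test utterance.
    """
    train_counts = Counter(train_speaker_ids)
    buckets = {"zero-shot": [], "few-shot": [], "many-shot": []}
    for speaker in sorted(set(test_speaker_ids)):
        n = train_counts.get(speaker, 0)
        if n == 0:
            buckets["zero-shot"].append(speaker)   # never seen in training
        elif n <= few_shot_max:
            buckets["few-shot"].append(speaker)    # seen only a little
        else:
            buckets["many-shot"].append(speaker)   # seen often
    return buckets


# Toy example with made-up speaker IDs.
train = ["spk_a"] * 20 + ["spk_b"] * 3
test = ["spk_a", "spk_b", "spk_c"]
result = bucket_speakers(train, test)
print(result)
```

Reporting TTS quality separately for each bucket gives a rough picture of how well a model generalizes to speakers it has seen often, rarely, or not at all.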
License: Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0)