SPRING INX BENGALI

SPRING_INX, created by SPRING LAB (https://asr.iitm.ac.in/), IIT Madras, led by Prof. S. Umesh, is a large-scale, high-quality Automatic Speech Recognition (ASR) corpus designed to advance multilingual and code-mixed speech research. The dataset spans ~3400 hours of labelled audio collected across 10 major Indian languages, with a sampling rate of 16 kHz, making it readily compatible with modern ASR and SSL-based speech models.

About Dataset

SPRING_INX contains rich, diverse, and naturally produced speech, covering a wide range of speakers, acoustic environments, and linguistic styles. A unique hallmark of this corpus is its code-mixed transcriptions, where native-language text naturally incorporates Romanised and English words - a phenomenon common in Indian bilingual communication. The dataset includes both monologue and conversational recordings, creating realistic scenarios suitable for robust, real-world speech system development.

Key Features:

  • ~3400 hours of labelled speech (Phase 1: ~2000h, Phase 2: ~1400h)
  • 10 Indian languages with broad linguistic and acoustic coverage (Assamese, Bengali, Gujarati, Malayalam, Hindi, Marathi, Odia, Punjabi, Tamil, Kannada)
  • 16 kHz high-quality audio compatible with ASR & SSL models (e.g., Wav2Vec2, HuBERT, Whisper)
  • Code-mixed transcripts enabling bilingual, transliteration, and multilingual modelling
  • Monologue + conversational speech for realistic system training
  • Large speaker diversity for robust speaker modelling
  • General-purpose domain content with rich semantic variation
  • Official ESPnet recipe for easy and reproducible end-to-end ASR training (https://github.com/espnet/espnet/tree/master/egs2/spring_speech)
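Since the corpus is described as 16 kHz audio, a quick sanity check on a downloaded file can be useful before feeding it to a model. The following is a minimal sketch using only the Python standard library. It assumes the recordings are stored as PCM WAV files, which this page does not state; the function names and the `EXPECTED_RATE_HZ` constant are illustrative, not part of the dataset or its ESPnet recipe.

```python
import wave

# Sampling rate stated on the dataset page.
EXPECTED_RATE_HZ = 16_000


def wav_rate_and_duration(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file.

    Assumption: the file is an uncompressed WAV readable by the
    standard-library `wave` module.
    """
    with wave.open(str(path), "rb") as wav_file:
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
    return rate, frames / rate


def is_expected_rate(path, expected=EXPECTED_RATE_HZ):
    """True if the file's sampling rate matches the expected rate."""
    return wav_rate_and_duration(path)[0] == expected
```

If the audio turns out to be stored in another container (for example FLAC), the same check would need a different reader; the logic of comparing the reported rate against 16 kHz stays the same.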

Purpose of Dataset

The SPRING_INX corpus is designed to serve as a comprehensive resource for the speech and language community, enabling advancements in:

  • Multilingual and code-mixed ASR
  • SSL pretraining and fine-tuning
  • Transliteration and bilingual modelling
  • Conversational speech understanding
  • General-purpose, real-world speech applications

By offering a rare combination of linguistic richness, speaker diversity, and large-scale annotated data, SPRING_INX aims to accelerate research and development in Indic speech technology, fostering innovation across academia, industry, and open-source communities.

Activity Overview

  • Downloads 1
  • Views 32
  • File Size 59.87 GB

Tags

  • ASR
  • multilingual-ASR
  • bengali
  • dataset
  • IITM
  • spring_lab

License Control

Attribution 4.0 International (CC BY 4.0)

final_SPRING_INX_BENGALI_R1.tar (1 directory)

  • speech (directory)
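To confirm the top-level layout of a downloaded archive without extracting roughly 60 GB, the archive can be scanned in place. This is a minimal sketch using Python's standard `tarfile` module; the function name `top_level_entries` is illustrative, and the local path passed to it is whatever location the archive was saved to.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def top_level_entries(tar_path):
    """Count archive members under each top-level name, without extracting."""
    counts = Counter()
    # Mode "r" lets tarfile detect compression (if any) automatically.
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:
            parts = PurePosixPath(member.name).parts
            if parts:  # skip a bare "." entry, which has no parts
                counts[parts[0]] += 1
    return dict(counts)
```

For the listing shown on this page, one would expect a single top-level key, `speech`, for final_SPRING_INX_BENGALI_R1.tar; the member count under it is not given here.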

Version Control

Version 1 (59.87 GB)
  • admin · 2 month(s) ago
    • final_SPRING_INX_BENGALI_R1.tar
      • speech
    • final_SPRING_INX_BENGALI_R2.tar