Noisy test set of the Bengali ASR (Automatic Speech Recognition) benchmark from Bhashini, intended to support the development of robust regional speech recognition systems.
The Kathbath-Bengali-Noisy-Test-Unknown dataset is a benchmark for evaluating Automatic Speech Recognition (ASR) systems on Bengali speech in noisy conditions. It belongs to the Kathbath collection, which features 1684 hours of labeled speech data across 12 Indian languages, and it is specifically designed to test ASR models under challenging acoustic scenarios in general domains. Submitted by Tahir Javed, the dataset is a resource for advancing ASR technologies for Bengali and other regional Indian languages, supporting robust multilingual ASR systems capable of handling noise.
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
© 2026 - Copyright AIKosh. All rights reserved. This portal is developed by National e-Governance Division for AIKosh mission.