This dataset is curated to foster the development of inclusive Automatic Speech Recognition (ASR) systems, with a special focus on the underrepresented voices of rural Bhojpuri women. It contains audio clips in both Bhojpuri and Hindi, drawn from real-world and synthetic sources, and is designed for training and evaluating ASR models that can accurately recognize diverse speech patterns.
This work is part of the research presented in the paper "Recognizing Every Voice: Towards Inclusive ASR for Rural Bhojpuri Women."
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)