Government Of India
Hindi ASR Benchmark Dataset (Kathbath Hindi Test Unknown)

Hindi ASR (Automatic Speech Recognition) benchmark test dataset from Bhashini for supporting the development of robust regional speech recognition systems.

About Dataset

The Kathbath-Hindi-Test-Unknown_1 dataset is a Hindi benchmark test set for evaluating and improving Automatic Speech Recognition (ASR) systems on general-domain speech. It belongs to Kathbath, a collection of 1,684 hours of labeled speech covering 12 Indian languages; that figure describes the full collection, not this 330.12 MB Hindi test set. The dataset is a resource for researchers and developers working on speech recognition for Hindi and other Indian regional languages, including multilingual settings. It was submitted by Tahir Javed.
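ASR benchmark results are commonly reported as word error rate (WER). The page itself does not name a metric, so the following is only a minimal sketch of how a model's output could be scored against a reference transcript, assuming whitespace-separated words and no text normalization. The function name `word_error_rate` is illustrative and not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("मेरा नाम राम है", "मेरा नाम श्याम है"))  # 0.25
```

Real evaluations usually normalize punctuation and numerals before scoring; that step is omitted here because the page does not describe one.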

Activity Overview

  • Downloads 26
  • Views 162
  • File Size 330.12 MB

Tags

  • NLP Dataset
  • Hindi
  • Benchmark
  • General Domain
  • Automatic Speech Recognition
  • Speech Technology
  • ASR
  • Regional Languages
  • Indian Languages
  • Multilingual Dataset
  • Audio Processing

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Preview file: 844424930324566-536-m.wav (119.80 KB)

Version Control

Version 1 (330.12 MB)
  • admin · 11 month(s) ago
    • Folder: audios (audio/wav)
      • 844424930324566-536-m.wav
      • 844424930324567-536-m.wav
      • 844424930324569-536-m.wav
      • 844424930324570-536-m.wav
      • 844424930324573-536-m.wav
      • 844424930324575-536-m.wav
      • 844424930324576-536-m.wav
      • 844424930324577-536-m.wav
      • 844424930324580-536-m.wav
      • 844424930324582-536-m.wav
      • 1919 more
    • data.json (application/json)
    • params.json (application/json)
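
The listing above suggests a simple way to sanity-check a download. The sketch below assumes the archive extracts to a directory containing an `audios` folder plus `data.json` and `params.json`, as shown in the file tree, and that the WAV files are uncompressed PCM readable by Python's standard `wave` module. The JSON schema is not documented on this page, so the code reports only the top-level JSON type instead of guessing field names. The function name `summarize_dataset` is hypothetical.

```python
import json
import wave
from pathlib import Path


def summarize_dataset(root: Path) -> dict:
    """Count WAV files under root/audios and total their duration in seconds."""
    total_seconds = 0.0
    count = 0
    for wav_path in sorted((root / "audios").glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav:
            total_seconds += wav.getnframes() / wav.getframerate()
        count += 1

    summary = {"wav_files": count, "total_seconds": total_seconds}

    # The schema of these files is an unknown here, so record only
    # whether each one parses and what its top-level JSON type is.
    for name in ("data.json", "params.json"):
        path = root / name
        if path.exists():
            with path.open(encoding="utf-8") as fh:
                summary[name] = type(json.load(fh)).__name__
    return summary
```

Calling `summarize_dataset(Path("path/to/extracted/dataset"))` returns a small dictionary whose file count can be compared with the listing above (10 files shown plus 1919 more).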