Government Of India
Kathbath Hindi ASR Validation Dataset


Hindi ASR (Automatic Speech Recognition) benchmark validation dataset from Bhashini for supporting the development of robust regional speech recognition systems.

About Dataset

The Kathbath-Hindi-Valid dataset is the Hindi validation portion of the Kathbath ASR benchmark, curated to validate Automatic Speech Recognition (ASR) systems under general scenarios. The full Kathbath corpus consists of 1684 hours of labeled speech data covering 12 Indian languages; this release contains only the Hindi validation data (551.04 MB). Submitted by Tahir Javed, the dataset is intended to help improve the performance and reliability of speech recognition for Hindi, as part of a benchmark that also covers other regional Indian languages.

Activity Overview

  • Downloads 38
  • Views 453
  • File Size 551.04 MB

Tags

  • NLP Dataset
  • Hindi
  • Benchmark
  • General Domain
  • Automatic Speech Recognition
  • Speech Technology
  • ASR
  • Regional Languages
  • Indian Languages
  • Multilingual Dataset
  • Audio Processing
  • Validation Dataset

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Sample file: 844424930501806-229-f.wav (137.22 KB)




Version Control

Version 1 (551.04 MB)
  • admin · 11 months ago
    • Folder: audios
      • 844424930501806-229-f.wav (audio/wav)
      • 844424930501835-229-f.wav (audio/wav)
      • 844424930501866-229-f.wav (audio/wav)
      • 844424930501884-229-f.wav (audio/wav)
      • 844424930501891-229-f.wav (audio/wav)
      • 844424930524026-252-f.wav (audio/wav)
      • 844424930526966-229-f.wav (audio/wav)
      • 844424930526973-229-f.wav (audio/wav)
      • 844424930545011-229-f.wav (audio/wav)
      • 844424930545039-229-f.wav (audio/wav)
      • 3219 more
    • data.json (application/json)
    • params.json (application/json)
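
For readers who download the archive, the following is a minimal Python sketch for taking inventory of the audios folder. It rests on assumptions the listing does not confirm: that the files are plain PCM WAV readable by the standard-library wave module, and that each file name has three hyphen-separated fields. The field names clip_id, group and tag are hypothetical labels chosen for illustration; the page does not document what the fields mean, and the schemas of data.json and params.json are not shown here, so the sketch does not read them.

```python
import wave
from pathlib import Path


def parse_name(filename: str) -> dict:
    """Split '<a>-<b>-<c>.wav' into its three hyphen-separated fields.

    The keys are hypothetical labels; the listing does not say what
    each field represents.
    """
    stem = Path(filename).stem
    clip_id, group, tag = stem.split("-")
    return {"clip_id": clip_id, "group": group, "tag": tag}


def wav_seconds(path: Path) -> float:
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(audio_dir: Path) -> float:
    """Sum the duration of every .wav file directly inside audio_dir."""
    seconds = sum(wav_seconds(p) for p in sorted(audio_dir.glob("*.wav")))
    return seconds / 3600.0
```

Calling parse_name("844424930501806-229-f.wav") splits the name into "844424930501806", "229" and "f", and total_hours(Path("audios")) would report the amount of audio actually present in the download, which is a useful check against the figures quoted for the dataset.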