Government Of India
Hindi ASR Benchmark Dataset for News and General Domains (Kathbath hard Hindi)


Hindi ASR (Automatic Speech Recognition) benchmark dataset from Bhashini for news and general domains, supporting the development of robust regional speech recognition systems.

About Dataset

This is a Hindi ASR benchmark dataset specifically designed to evaluate and improve Automatic Speech Recognition (ASR) systems in challenging scenarios, particularly in the news and general domains. The dataset includes diverse and high-quality audio samples, focusing on topics such as current affairs, media reports, and public dialogues. This provides researchers and developers with a critical resource for building robust ASR models. Submitted by AI4Bharat, it supports advancements in speech recognition technologies for regional languages.

Activity Overview

  • Downloads 21
  • Views 337
  • File Size 330 MB

Tags

  • NLP Dataset
  • Hindi
  • Benchmark
  • News Domain
  • General Domain
  • Automatic Speech Recognition
  • Speech Technology
  • AI4Bharat
  • ASR
  • Regional Languages
  • Audio Processing

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

844424930324566-536-m.wav (119.77 KB)




Version Control

Version 1 (330 MB)
  • admin · 1 year(s) ago
    • 844424930324566-536-m.wav (audio/wav)
    • 844424930324567-536-m.wav (audio/wav)
    • 844424930324569-536-m.wav (audio/wav)
    • 844424930324570-536-m.wav (audio/wav)
    • 844424930324573-536-m.wav (audio/wav)
    • 844424930324575-536-m.wav (audio/wav)
    • 844424930324576-536-m.wav (audio/wav)
    • 844424930324577-536-m.wav (audio/wav)
    • 844424930324580-536-m.wav (audio/wav)
    • 844424930324582-536-m.wav (audio/wav)
    • 1921 more
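
Once the archive is downloaded and unpacked, a quick header check can confirm what the audio files contain. The following is a minimal sketch using only the Python standard library. The function names and the local folder layout are assumptions for illustration, and this page does not state the sample rate, channel count, or clip durations, so the code reports whatever the file headers say. The `wave` module reads uncompressed PCM WAV only, so files in another encoding would need a different reader.

```python
import wave
from pathlib import Path


def wav_summary(path):
    """Return (channels, sample_rate_hz, duration_seconds) for one PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return wf.getnchannels(), rate, frames / rate


def summarize_folder(folder):
    """Map each .wav file name in `folder` (a hypothetical local path) to its summary."""
    return {p.name: wav_summary(p) for p in sorted(Path(folder).glob("*.wav"))}
```

For example, `summarize_folder("path/to/unpacked/dataset")` would return one entry per file, which makes it easy to spot clips with an unexpected sample rate before running an ASR model over them.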