Government Of India
Common Voice Hindi ASR Benchmark Dataset for Speech Recognition

Hindi ASR (Automatic Speech Recognition) benchmark dataset from Bhashini for supporting the development of robust regional speech recognition systems.

About Dataset

This is a Hindi ASR benchmark dataset developed to evaluate and improve Automatic Speech Recognition (ASR) systems for the Hindi language. The dataset includes diverse and high-quality audio samples, focusing on topics such as culture, daily conversations, and general content. This provides researchers and developers with a critical resource for building robust ASR models. Submitted by Microsoft, this dataset supports advancements in speech recognition technologies for regional languages.

Activity Overview

  • Downloads 102
  • Views 1,752
  • File Size 571.04 MB

Tags

  • NLP Dataset
  • Hindi
  • Benchmark
  • General Domain
  • Automatic Speech Recognition
  • Speech Technology
  • AI4Bharat
  • ASR
  • Regional Languages
  • Audio Processing

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

common_voice_hi_23795241.wav (357.79 KB)



Version Control

Version 1 (571.04 MB)
  • admin · 1 year(s) ago
    • common_voice_hi_23795241.wav (audio/wav)
    • common_voice_hi_23795243.wav (audio/wav)
    • common_voice_hi_23795244.wav (audio/wav)
    • common_voice_hi_23795248.wav (audio/wav)
    • common_voice_hi_23795249.wav (audio/wav)
    • common_voice_hi_23795250.wav (audio/wav)
    • common_voice_hi_23795251.wav (audio/wav)
    • common_voice_hi_23795252.wav (audio/wav)
    • common_voice_hi_23796156.wav (audio/wav)
    • common_voice_hi_23809701.wav (audio/wav)
    • 1719 more
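
Because the dataset is distributed as individual WAV files, a quick header check can confirm sample rate and duration before the files are fed to an ASR system. The following is a minimal sketch, not part of the dataset's own tooling: it assumes the files have been downloaded to a local folder and are uncompressed PCM WAV, which Python's standard `wave` module can read. The function names and the folder name in the usage note are hypothetical.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic header facts for one WAV file (assumes uncompressed PCM)."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "name": Path(path).name,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }


def summarize_folder(folder):
    """Describe every .wav file directly inside a folder, sorted by file name."""
    return [describe_wav(p) for p in sorted(Path(folder).glob("*.wav"))]
```

For example, `summarize_folder("hindi_asr_benchmark")` would return one dictionary per file, where `hindi_asr_benchmark` stands in for wherever the download was unpacked.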