Government Of India
Bengali ASR Benchmark Dataset (IndicTTS Bengali)

Bengali ASR (Automatic Speech Recognition) benchmark dataset from Bhashini for supporting the development of robust regional speech recognition systems.

About Dataset

This is a Bengali ASR benchmark dataset, designed to evaluate and improve Automatic Speech Recognition (ASR) systems in challenging scenarios, particularly in the news and general domains. The dataset contains diverse and high-quality audio samples to aid in the development of robust ASR systems tailored for Bengali. Submitted by Microsoft, this dataset is an essential resource for researchers and developers aiming to enhance ASR capabilities for regional languages.

Activity Overview

  • Downloads 36
  • Views 263
  • File Size 17.01 MB

Tags

  • Bengali
  • NLP Dataset
  • Benchmark
  • News Domain
  • General Domain
  • Automatic Speech Recognition
  • Speech Technology
  • AI4Bharat
  • ASR
  • Regional Languages
  • Audio Processing

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

data.json (28.25 KB)


Version Control

Version 1 (17.01 MB)
  • admin · 11 months ago
    • data.json (application/json)
    • params.json (application/json)
    • train_bengalifemale_00100.wav (audio/wav)
    • train_bengalifemale_00152.wav (audio/wav)
    • train_bengalifemale_00424.wav (audio/wav)
    • train_bengalifemale_00538.wav (audio/wav)
    • train_bengalifemale_00545.wav (audio/wav)
    • train_bengalifemale_00660.wav (audio/wav)
    • train_bengalifemale_00706.wav (audio/wav)
    • train_bengalifemale_00715.wav (audio/wav)
    • 92 more
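
Once the version folder listed above has been downloaded, a quick inventory can confirm that the JSON files parse and the audio clips are readable. The following is a minimal sketch, not an official loader. It assumes the files sit together in one local directory and that the WAV files are uncompressed PCM, which Python's standard `wave` module requires. The function name `summarize_version_folder` and the folder path are hypothetical, and no assumption is made about the internal schema of data.json or params.json.

```python
import json
import wave
from pathlib import Path


def summarize_version_folder(folder):
    """Inventory a downloaded dataset folder (hypothetical helper).

    Returns (metadata, durations):
      metadata  maps each *.json file name to its parsed content,
                whatever structure that content has.
      durations maps each *.wav file name to its length in seconds.
    """
    folder = Path(folder)

    # Parse every JSON file as-is; the schema is not assumed here.
    metadata = {}
    for path in sorted(folder.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            metadata[path.name] = json.load(fh)

    # Duration = frame count / sample rate (assumes PCM WAV files).
    durations = {}
    for path in sorted(folder.glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            durations[path.name] = wav.getnframes() / wav.getframerate()

    return metadata, durations


if __name__ == "__main__":
    # "bengali_asr_benchmark" is a placeholder for the local download path.
    meta, secs = summarize_version_folder("bengali_asr_benchmark")
    print("JSON files:", sorted(meta))
    print("WAV files:", len(secs))
    print("Total audio (s):", round(sum(secs.values()), 2))
```

Comparing the number of WAV files found against the listing is a simple way to catch an incomplete download before running any evaluation.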