Lahaja

A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems

About Dataset

Hindi, one of the most widely spoken languages of India, exhibits a wide array of accents because it is used by speakers from varied linguistic backgrounds. To enable a robust evaluation of Hindi ASR systems across multiple accents, we create a benchmark, LAHAJA, which contains read and extempore speech on a diverse set of topics and use cases, with a total of 12.5 hours of Hindi… See the full description on the dataset page: https://huggingface.co/datasets/ai4bharat/Lahaja.
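
As a minimal sketch of how the benchmark might be fetched from the dataset page linked above, the snippet below derives the repository id from that URL and passes it to `load_dataset` from the Hugging Face `datasets` library. The helper names `repo_id_from_url` and `load_lahaja` are hypothetical and introduced only for illustration. The snippet assumes the dataset is publicly downloadable without authentication, and it makes no assumption about split or column names.

```python
from urllib.parse import urlparse

# URL taken from the dataset description above.
DATASET_URL = "https://huggingface.co/datasets/ai4bharat/Lahaja"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path shape: datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def load_lahaja():
    """Download the benchmark (hypothetical helper).

    Needs network access and `pip install datasets`. Assumes the dataset
    can be downloaded without logging in, which may not hold.
    """
    # Imported lazily so repo_id_from_url works without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id_from_url(DATASET_URL))
```

Calling `load_lahaja()` and printing the result should list whatever splits and columns the dataset actually provides, which is the safest way to inspect its structure before writing evaluation code.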

Activity Overview

  • Downloads 0
  • Redirect 11
  • Views 41
  • File Size 0

Tags

  • Speech Dataset
  • benchmark dataset

License Control

Attribution 4.0 International (CC BY 4.0)