Malayalam ASR Benchmark Dataset: Kathbath Malayalam Test Known

A Malayalam automatic speech recognition (ASR) benchmark test dataset from Bhashini, intended to support the development of robust regional speech recognition systems.

About Dataset

The Kathbath-Malayalam-Test-Known dataset is a benchmark for testing the performance of Automatic Speech Recognition (ASR) systems in Malayalam. It belongs to Kathbath, a collection featuring 1684 hours of labeled speech data across 12 Indian languages; that figure describes the collection as a whole, while this Malayalam test set is a 552.94 MB subset of it. The dataset is intended for evaluating ASR models on general-domain speech. Submitted by Tahir Javed, it is a resource for advancing speech recognition in Malayalam and other Indian regional languages and for building robust multilingual ASR systems.
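The page does not say which metric the benchmark is scored with. ASR test sets are commonly scored by word error rate (WER), so the following is a minimal sketch under that assumption. The function name `word_error_rate` and the whitespace tokenization are illustrative choices, not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly,
    with no text normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

In practice each reference transcript would be paired with a model's output for the same audio file, and the errors would usually be summed over the whole test set before dividing by the total number of reference words.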

Activity Overview

  • Downloads 9
  • Views 107
  • File Size 552.94 MB

Tags

  • NLP Dataset
  • Benchmark
  • General Domain
  • Automatic Speech Recognition
  • Malayalam
  • Speech Technology
  • ASR
  • Regional Languages
  • Indian Languages
  • Multilingual Dataset
  • Audio Processing

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Sample file: 844424930309504-1145-m.wav (395.54 KB)


Version Control

Version 1 (552.94 MB)
  • admin · 11 months ago
    • Folder: audios
      • 844424930309504-1145-m.wav (audio/wav)
      • 844424930310355-814-f.wav (audio/wav)
      • 844424930310359-814-f.wav (audio/wav)
      • 844424930310364-814-f.wav (audio/wav)
      • 844424930310365-814-f.wav (audio/wav)
      • 844424930310366-814-f.wav (audio/wav)
      • 844424930310398-814-f.wav (audio/wav)
      • 844424930310401-814-f.wav (audio/wav)
      • 844424930310402-814-f.wav (audio/wav)
      • 844424930310413-814-f.wav (audio/wav)
      • 1757 more
    • data.json (application/json)
    • params.json (application/json)
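
The audio file names listed above all match the pattern digits, digits, then `m` or `f`, followed by `.wav`. The page does not document what the fields mean. The sketch below assumes they are an utterance ID, a speaker ID, and a gender marker; the names `AudioName` and `parse_audio_name` are hypothetical. The structure of data.json and params.json is not shown on the page, so the sketch does not read them.

```python
import re
from typing import NamedTuple, Optional


class AudioName(NamedTuple):
    # Field meanings are assumptions inferred from the file names alone.
    utterance_id: str
    speaker_id: str
    gender: str


# Matches names such as "844424930309504-1145-m.wav".
_AUDIO_NAME = re.compile(r"^(\d+)-(\d+)-([mf])\.wav$")


def parse_audio_name(filename: str) -> Optional[AudioName]:
    """Split a file name into its three fields, or return None if it does not match."""
    match = _AUDIO_NAME.match(filename)
    if match is None:
        return None
    return AudioName(*match.groups())
```

Grouping the parsed names by the second field would give a per-speaker breakdown of the test set, provided the assumed field meanings hold.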