Government Of India
Aksharantar

About Dataset

Dataset Summary

Aksharantar is the largest publicly available transliteration dataset for 20 Indic languages. The corpus has 26M Indic language-English transliteration pairs.


Languages

Assamese (asm), Hindi (hin), Maithili (mai), Marathi (mar), Punjabi (pan), Tamil (tam),
Bengali (ben), Kannada (kan), Malayalam (mal), Nepali (nep), Sanskrit (san), Telugu (tel),
Bodo (brx), Kashmiri (kas), Manipuri (mni), Oriya (ori), Sindhi (snd), Urdu (urd),
Gujarati (guj), Konkani (kok), Dogri (doi)
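
For programmatic use, the language codes listed above can be kept in a small lookup table. The following is a minimal sketch in Python: the names and codes are copied from the list, while the identifiers AKSHARANTAR_LANGUAGES and language_name are hypothetical and not part of any official Aksharantar tooling.

```python
# Codes and names copied from the language list above.
# The dictionary and helper names are illustrative assumptions.
AKSHARANTAR_LANGUAGES = {
    "asm": "Assamese", "ben": "Bengali", "brx": "Bodo", "guj": "Gujarati",
    "hin": "Hindi", "kan": "Kannada", "kas": "Kashmiri", "kok": "Konkani",
    "mai": "Maithili", "mal": "Malayalam", "mni": "Manipuri", "mar": "Marathi",
    "nep": "Nepali", "ori": "Oriya", "pan": "Punjabi", "san": "Sanskrit",
    "snd": "Sindhi", "tam": "Tamil", "tel": "Telugu", "urd": "Urdu",
    "doi": "Dogri",
}


def language_name(code: str) -> str:
    """Return the language name for a three-letter code (case-insensitive)."""
    key = code.strip().lower()
    if key not in AKSHARANTAR_LANGUAGES:
        raise KeyError(f"{code!r} is not one of the listed language codes")
    return AKSHARANTAR_LANGUAGES[key]
```

Calling language_name("hin") returns "Hindi", and a code that is not in the list raises a KeyError instead of silently passing through.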

Activity Overview

  • Downloads 1
  • Redirect 72
  • Views 405
  • File Size 0

Tags

  • transliteration
  • multilingual corpus
  • Indic Languages

License Control

Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)