Aksharantar is the largest publicly available transliteration dataset for 20 Indic languages
Aksharantar is the largest publicly available transliteration dataset for 20 Indic languages. The corpus has 26M Indic language-English transliteration pairs.
| Assamese (asm) | Hindi (hin) | Maithili (mai) | Marathi (mar) | Punjabi (pan) | Tamil (tam) |
| Bengali (ben) | Kannada (kan) | Malayalam (mal) | Nepali (nep) | Sanskrit (san) | Telugu (tel) |
| Bodo(brx) | Kashmiri (kas) | Manipuri (mni) | Oriya (ori) | Sindhi (snd) | Urdu (urd) |
| Gujarati (guj) | Konkani (kok) | Dogri (doi) |
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
© 2026 - Copyright AIKosh. All rights reserved. This portal is developed by National e-Governance Division for AIKosh mission.