A multilingual sequence-to-sequence pre-trained model fine-tuned on the IndicParaphrase dataset for paraphrase generation across 11 Indian languages.
The model is based on IndicBART, a multilingual sequence-to-sequence pre-trained model, and is fine-tuned on the IndicParaphrase dataset. It supports 11 Indian languages, including Hindi, Marathi, Punjabi, Tamil, Telugu, Bengali, and Gujarati. Being smaller than mBART and mT5, it is more efficient at decoding. Fine-tuned on a corpus of 5.53 million sentences, it represents all languages in the Devanagari script, which fosters transfer learning between linguistically related languages.
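A minimal usage sketch follows, assuming the model is released as an IndicBART-style checkpoint on Hugging Face. The checkpoint name ai4bharat/MultiIndicParaphraseGeneration, the <2hi> language tag, and the generation settings are assumptions drawn from IndicBART conventions rather than details stated here; consult the published model card for the exact identifiers.

# Sketch: paraphrase generation with an IndicBART-based checkpoint (assumed names).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "ai4bharat/MultiIndicParaphraseGeneration"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(
    model_name, do_lower_case=False, use_fast=False, keep_accents=True
)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# IndicBART-style models expect a target-language tag; Hindi is shown here.
sentence = "दिल्ली भारत की राजधानी है"  # "Delhi is the capital of India"
inputs = tokenizer(sentence + " </s> <2hi>", add_special_tokens=False, return_tensors="pt")

# Decoding starts from the language tag, following IndicBART conventions.
bos_id = tokenizer._convert_token_to_id_with_added_voc("<2hi>")
eos_id = tokenizer._convert_token_to_id_with_added_voc("</s>")
output = model.generate(
    inputs.input_ids,
    num_beams=4,
    max_length=64,
    early_stopping=True,
    decoder_start_token_id=bos_id,
    eos_token_id=eos_id,
)
paraphrase = tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
print(paraphrase)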
MIT
Aman Kumar, Himani Shrotriya, Prachi Sahu, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Amogh Mishra, Mitesh M. Khapra, Pratyush Kumar
Text Generation
N.A.
Open
Sector Agnostic
21/02/25 13:20:58