
The OVA Odia Poetry Dataset is a curated collection of line-level poetic text extracted from 490 Odia poetry works digitized by the Odia Virtual Academy (OVA). It represents a wide range of poetic traditions, themes, and literary styles within Odia literature. The dataset was developed to support language modeling, NLP research, generative AI training, and the digital preservation of Odia poetic heritage.
The OVA Odia Poetry Dataset is a comprehensive, machine-readable collection of Odia poetic literature comprising 490 poetry works digitized and curated by the Odia Virtual Academy (OVA). Designed specifically to support the advancement of artificial intelligence and natural language processing for low-resource Indian languages, this dataset represents a significant step toward strengthening the digital and computational presence of Odia.
The dataset captures the diversity, depth, and historical breadth of Odia poetry, reflecting a wide range of poetic traditions, literary movements, themes, and stylistic expressions. From classical and devotional verse to modern, nationalist, and experimental poetry, the corpus embodies the evolution of Odia poetic expression across time. This diversity makes the dataset uniquely valuable for building robust language models that can understand not only contemporary Odia usage but also its rich literary and cultural foundations.
A defining feature of the OVA Odia Poetry Dataset is its line-level structure. Each poetic line is preserved as an independent data entry, enabling fine-grained linguistic and stylistic analysis. This structural choice is particularly important for computational modeling of poetry, as it supports the study of metre, rhyme, rhythm, syntactic variation, and semantic density at the level most natural to poetic composition. Such granularity is essential for training generative AI systems capable of producing coherent and stylistically faithful Odia verse, as well as for tasks such as poetic form recognition, automatic summarisation, and literary pattern analysis.
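To illustrate what a line-level structure makes possible, the following is a minimal Python sketch, not the dataset's documented schema. The field names `work_id`, `line_no`, and `text`, along with the placeholder records, are assumptions made for illustration only; the actual file format and column names should be checked against the published files. The sketch regroups independent line entries into works and computes a crude per-line length measure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: the field names "work_id", "line_no", and "text"
# are assumptions, and the texts are placeholders, not real dataset lines.
records = [
    {"work_id": "w1", "line_no": 2, "text": "second placeholder line here"},
    {"work_id": "w1", "line_no": 1, "text": "first placeholder line"},
    {"work_id": "w2", "line_no": 1, "text": "another work opens"},
]


def group_by_work(rows):
    """Return {work_id: [line texts in reading order]}."""
    works = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["work_id"], r["line_no"])):
        works[row["work_id"]].append(row["text"])
    return dict(works)


def line_stats(lines):
    """Whitespace-token counts per line, a crude proxy for line length."""
    counts = [len(line.split()) for line in lines]
    return {"lines": len(lines), "mean_tokens": mean(counts)}


works = group_by_work(records)
stats = {work_id: line_stats(lines) for work_id, lines in works.items()}
```

Because each line is its own entry, the same grouping step could feed finer analyses, such as comparing line endings for rhyme or line lengths for metre, without first having to split running text.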
The purpose of the OVA Odia Poetry Dataset is to enable the development of high-quality artificial intelligence and language technologies for Odia by providing a structured, machine-readable corpus of poetic literature. It aims to support large language model training, NLP research, and AI applications.
Attribution 4.0 International (CC BY 4.0)
© 2026 - Copyright AIKosh. All rights reserved. This portal is developed by National e-Governance Division for AIKosh mission.