
The Major Institutions Dataset is a curated collection of Odia text developed by the Odia Virtual Academy (OVA). It covers prominent institutions across governance, education, finance, science, culture, and public administration, including regulatory authorities, universities, research organizations, commissions, and public sector establishments. The text describes organizational structures and administrative functions, and includes institutional studies and academic research in Odia.
The Major Institutions Dataset is a curated collection of Odia text developed by the Odia Virtual Academy (OVA). It brings together material on leading institutions across governance, education, finance, science, culture, and public administration. The dataset covers constitutional bodies, statutory commissions, universities, research centers, regulatory authorities, financial institutions, and public sector organizations, documenting their historical evolution, organizational frameworks, statutory mandates, administrative systems, and functional responsibilities. It also covers institutional governance models, leadership structures, public accountability mechanisms, policy coordination systems, inter-institutional collaborations, and community-oriented institutional initiatives in Odia-speaking regions. The dataset emphasizes domain-relevant terminology, institutional design vocabulary, procedural expressions, and sector-specific administrative language. It is intended to support language modelling, named-entity recognition, institutional relationship mapping, information extraction, and domain adaptation for Odia NLP research, as well as digital scholarship, AI-assisted education, and structured knowledge development in Odia across academic, governmental, and research applications.
The Major Institutions Dataset aims to provide Odia text sourced from digitized institutional records, annual reports, policy documents, and related academic materials. It is suitable for a range of applications, including training language models; building domain-aware NLP tools (such as named-entity recognition for universities, research centers, cultural organizations, financial bodies, regulatory authorities, and public service institutions; relation extraction; and multilingual grounding); implementing AI-assisted institutional studies, education, and professional training for students and administrators; and enabling content generation that aligns with organizational and institutional contexts in Odia.
Attribution 4.0 International (CC BY 4.0)
12 files