The Kinnauri-Pahari dataset is a collection of general-domain corpora for one of seven endangered languages found in Himachal Pradesh, India. It contains both monolingual and parallel sentences.
This dataset is designed to facilitate the comparison and analysis of the Hindi and Kinnauri languages. It has two parts: a monolingual corpus of Kinnauri text and a parallel corpus of Hindi sentences paired with their Kinnauri translations. Together, these texts and translations show the linguistic differences between the two languages. The dataset is likely intended for linguistic research, language learning, and cultural exchange.
Citation: Saxena, Shefali, Shweta Chauhan, and Philemon Daniel. "Kinnauri-Pahari (version_0.1): parallel, monolingual dataset and word-embeddings." Sādhanā 47, no. 3 (2022): 123.
This dataset was identified and facilitated for onboarding as part of the Dataset Onboarding Support Team (DOST) initiative led by CivicDataLab (CDL), partnering with the Gates Foundation in collaboration with BHASHINI. CivicDataLab provided technical support for dataset discovery, validation, metadata preparation, and onboarding facilitation. All dataset ownership and intellectual property rights remain with the original author(s).
The purpose of this dataset is to facilitate the comparison, analysis, and preservation of Hindi and Kinnauri languages through bilingual text data. The dataset enables researchers and scholars to study the linguistic similarities and differences between the two languages, including vocabulary, grammar, syntax, and regional language patterns. It is particularly relevant for linguistic research, language learning, cultural studies, and the development of multilingual language technologies for low-resource Himalayan languages. The dataset also supports the creation of educational resources, translation systems, and language learning tools while helping preserve the linguistic and cultural heritage of the Indian Himalayan region.
License: Attribution 4.0 International (CC BY 4.0)
© 2026 AIKosh. All rights reserved. This portal is developed by the National e-Governance Division for the AIKosh mission.