NortheastNER is a token classification model built on XLM-RoBERTa and fine-tuned on ~25k sentences from gazetteers, news, and cultural texts across Northeast India. It detects region-specific entities: places, tribes, festivals, tourist sites, flora, fauna, and experimental local names. It is suited to low-resource NER, regional search, cultural analytics, and knowledge graph applications.
NortheastNER is a region-aware Named Entity Recognition (NER) model developed by MWire Labs to address the severe low-resource gap in Northeast India’s digital ecosystem. Built on XLM-RoBERTa and fine-tuned on ~25k sentences sourced from curated gazetteers, cultural corpora, tourism portals, and news datasets, the model is designed to accurately capture the unique geography, communities, and cultural identity of the Northeast.

It supports the following entity classes:
PLACES: districts, towns, cities, villages
TRIBES: indigenous communities and ethnic groups
FESTIVALS: cultural and seasonal events
TOURIST: landmarks, viewpoints, natural attractions
FLORA & FAUNA: regional biodiversity terms
NAMES (experimental): local person/community names

Use Cases:
Regional NER for Northeast India
Cultural and tourism analytics
Smart search and recommendation systems
Media and news entity extraction
Knowledge graph and metadata enrichment
Low-resource NLP research and benchmarking

Performance: The model achieves an overall F1 score of 0.964 on an internal Northeast corpus, with particularly strong results on PLACES and TRIBES. Some classes like TOURIST and FAUNA have limited examples and are marked as beta.

Why this model matters: Most mainstream NER systems fail to capture Northeast India’s cultural terms, tribal names, and location diversity. NortheastNER fills this gap by providing a domain-tuned, culturally aligned model optimized for practical deployment across governance, tourism, media, and knowledge applications.
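To make the token classification output and the F1 figure more concrete, here is a minimal sketch of how per-token tags can be grouped into entity spans and scored. It rests on assumptions not stated in this card: that the model emits BIO-style tags (for example `B-PLACES`, `I-PLACES`, `O`), that the label strings look like the class names above, and that F1 is an exact-match, entity-level micro average. The helper names `bio_to_spans` and `span_f1` and the example sentence are purely illustrative.

```python
from typing import List, Tuple

# (entity class, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Group BIO tags into entity spans (assumed tagging scheme)."""
    spans: List[Span] = []
    label, start = None, 0
    # A trailing "O" sentinel flushes a span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, cls = tag.partition("-")
        continues = prefix == "I" and cls == label
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        # A stray "I-" tag with no open span is treated leniently as a start.
        if prefix == "B" or (prefix == "I" and label is None):
            label, start = cls, i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match, entity-level micro F1 over a list of sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = set(bio_to_spans(g_tags)), set(bio_to_spans(p_tags))
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence and tags, not taken from the model's data.
tokens = ["The", "Khasi", "celebrate", "Shad", "Suk", "Mynsiem", "in", "Shillong"]
gold_tags = ["O", "B-TRIBES", "O", "B-FESTIVALS", "I-FESTIVALS", "I-FESTIVALS", "O", "B-PLACES"]
# A prediction that cuts the festival name one token short.
pred_tags = ["O", "B-TRIBES", "O", "B-FESTIVALS", "I-FESTIVALS", "O", "O", "B-PLACES"]

gold_spans = bio_to_spans(gold_tags)
entities = [(cls, " ".join(tokens[s:e])) for cls, s, e in gold_spans]
score = span_f1([gold_tags], [pred_tags])
```

Under exact-match scoring, a span that is only partly recovered counts as both a miss and a spurious prediction, which is one reason classes with few training examples can pull an overall score down.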
Attribution-Non-Commercial 4.0 International (CC BY-NC 4.0)
MWirelabs
Named Entity Recognition (NER) Model
PyTorch
Open
Arts, Culture and Tourism
19/11/25 11:07:47