Eka-IndicMTEB is an evaluation dataset of Indian multilingual medical terms, designed to evaluate embedding models on medical terminology across multiple Indic languages and scripts. It contains 2,532 doctor-verified queries that capture the linguistic and domain-specific diversity of the Indian healthcare ecosystem. The dataset includes medical entities spanning symptoms, diagnoses, procedures, medications, and related concepts, enriched with real-world linguistic variations such as spelling errors, special characters, abbreviations, and colloquial expressions. It covers multiple languages, including English, Hindi, Bengali, Tamil, Telugu, Kannada, Marathi, and Malayalam.
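As an illustration of how an embedding model might be scored on queries like these, the sketch below computes recall@k with cosine similarity over toy vectors. The metric choice, the function names, and the two-dimensional vectors are all assumptions made for illustration; the dataset description does not specify an evaluation protocol, and a real run would use embeddings produced by the model under test.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall_at_k(query_vecs, gold_ids, term_vecs, k=1):
    """Fraction of queries whose gold term appears among the top-k nearest terms.

    query_vecs: list of query embeddings (hypothetical, one per query)
    gold_ids:   list of the correct term id for each query
    term_vecs:  dict mapping term id -> term embedding
    """
    hits = 0
    for qvec, gold in zip(query_vecs, gold_ids):
        # Rank all candidate terms by similarity to the query, best first.
        ranked = sorted(term_vecs, key=lambda tid: cosine(qvec, term_vecs[tid]), reverse=True)
        if gold in ranked[:k]:
            hits += 1
    return hits / len(gold_ids)

# Toy data (assumed for illustration only): two candidate terms, three queries.
term_vecs = {"fever": [1.0, 0.0], "cough": [0.0, 1.0]}
query_vecs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
gold_ids = ["fever", "cough", "cough"]

score_at_1 = recall_at_k(query_vecs, gold_ids, term_vecs, k=1)
score_at_2 = recall_at_k(query_vecs, gold_ids, term_vecs, k=2)
print(f"recall@1 = {score_at_1:.3f}, recall@2 = {score_at_2:.3f}")
```

In this toy setup the third query sits closer to the wrong term, so it counts as a miss at k=1 and a hit at k=2.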
Eka-IndicMTEB addresses a critical gap in multilingual medical AI evaluation by offering:
A shared evaluation framework: Researchers can now benchmark multilingual medical embeddings against a standardized, clinically validated dataset spanning multiple Indian languages.
Insight into model strengths and weaknesses: The benchmark systematically reveals how models handle India's linguistic diversity, identifying specific failure modes and success patterns across different language families and medical domains.
Guidance for model development: Performance analysis across varied query types provides actionable insights for targeted model improvements.
This benchmark is invaluable for researchers developing cross-lingual medical information retrieval systems and for AI teams building multilingual clinical decision support tools. Healthcare organizations deploying language-agnostic medical chatbots or semantic search systems will find this dataset essential for validating performance across India's diverse linguistic landscape. Academic institutions working on low-resource medical NLP can use this benchmark to identify gaps and measure progress in Indian-language healthcare AI.
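One way to surface language-specific or query-type-specific failure modes is to break per-query results down by group. The sketch below is a minimal illustration under assumed names: the record fields (`language`, `query_type`, `correct`) and the sample rows are hypothetical and do not reflect the dataset's actual schema or any real results.

```python
from collections import defaultdict

def accuracy_by_group(records, key):
    """Return {group value: fraction of records marked correct} for the given key."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        correct[group] += int(record["correct"])
    return {group: correct[group] / totals[group] for group in totals}

# Hypothetical per-query outcomes (illustrative only, not real benchmark results).
records = [
    {"language": "Hindi", "query_type": "abbreviation", "correct": True},
    {"language": "Hindi", "query_type": "spelling_error", "correct": False},
    {"language": "Tamil", "query_type": "colloquial", "correct": True},
    {"language": "Tamil", "query_type": "spelling_error", "correct": True},
]

by_language = accuracy_by_group(records, "language")
by_query_type = accuracy_by_group(records, "query_type")
print(by_language)
print(by_query_type)
```

Grouping the same records by different keys makes it easy to see whether a weak score comes from a particular language, a particular kind of query variation, or both.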
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)