Government Of India
MILU - Multi-task Indic LLM performance evaluation dataset

MILU is a comprehensive benchmark dataset designed to evaluate the performance of Large Language Models (LLMs) across 11 Indic languages. It spans 8 domains and 41 subjects, with ~80,000 multiple-choice questions that draw on culturally relevant knowledge from India.

About Dataset

The MILU (Multi-task Indic Language Understanding Benchmark) dataset is a large-scale evaluation dataset for assessing multilingual Large Language Models (LLMs) on Indic languages. It covers 11 languages: Hindi, Bengali, Gujarati, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, and English. The dataset spans 8 domains, such as Arts & Humanities, Social Sciences, STEM, and Business, with questions drawn from 41 subjects. With approximately 80,000 multiple-choice questions and a validation set of 8,933 samples, MILU provides a rigorous benchmark for language understanding across linguistic and knowledge domains. It incorporates culturally specific knowledge from Indian regional and state-level examinations, which makes it well suited to LLM evaluation in the Indian linguistic context. The dataset is openly available under a CC BY 4.0 license.
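Because the benchmark consists of multiple-choice questions across several languages, results are naturally reported as accuracy per language. The following is a minimal sketch of that calculation, not an official evaluation script: the field names (`language`, `answer`), the sample records, and the predictions are all hypothetical and do not reflect the published MILU schema.

```python
from collections import defaultdict

# Hypothetical records: the field names ("language", "answer") and the
# sample values are illustrative assumptions, not the published MILU schema.
records = [
    {"language": "Hindi", "answer": "B"},
    {"language": "Hindi", "answer": "D"},
    {"language": "Tamil", "answer": "A"},
    {"language": "Tamil", "answer": "C"},
]

# Hypothetical model outputs: one predicted option letter per record.
predictions = ["B", "A", "A", "C"]


def accuracy_by_language(records, predictions):
    """Return {language: fraction of questions answered correctly}."""
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        language = record["language"]
        total[language] += 1
        if predicted == record["answer"]:
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}


scores = accuracy_by_language(records, predictions)
```

The same grouping could be keyed on domain or subject instead of language to compare performance across the 8 domains or 41 subjects.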

Activity Overview

  • Downloads 0
  • Redirect 31
  • Views 565
  • File Size 0

Tags

  • Multilingual Dataset
  • NLP
  • Indic Languages
  • benchmark dataset
  • LLM evaluation
  • question-answering
  • knowledge assessment
  • language understanding
  • Indian language processing

License Control

Attribution 4.0 International (CC BY 4.0)