ParamBench

ParamBench is a graduate-level benchmark dataset for evaluating Large Language Models (LLMs) on India-centric subjects. It contains 17,275 Hindi MCQs across 21 disciplines from competitive exams, enabling assessment of subject knowledge, cultural understanding, and reasoning abilities. Paper: https://arxiv.org/pdf/2508.16185

About Dataset

ParamBench is a large-scale, graduate-level benchmark dataset designed to evaluate Large Language Models (LLMs) on India-centric subjects and culturally grounded knowledge. It consists of 17,275 multiple-choice questions (MCQs) in Hindi, collected from Indian competitive examination papers and their corresponding answer keys. It spans 21 academic subjects, including Anthropology, Sociology, History, Law, Political Science, Economics, Philosophy, and Indian Culture, covering the humanities, social sciences, and domain-specific knowledge.

Each data instance represents a single MCQ and includes a question in Hindi, four answer options (A-D), the correct answer label, and metadata such as subject, exam name, and question type. The dataset incorporates multiple question formats: Normal MCQ, Assertion-Reason, Match the List, Ordering, Fill in the Blank, and Identify Incorrect Statement. This variety enables fine-grained evaluation of reasoning and analytical capabilities. All questions are preserved in Hindi, so linguistic and cultural understanding is evaluated without reliance on translation.

The dataset is released as a single test split (17,275 samples) and is intended exclusively for evaluation, enabling standardized and reproducible benchmarking of LLMs across subjects and question types. Overall, ParamBench provides a challenging evaluation suite for measuring the subject-wise knowledge, cultural awareness, and reasoning ability of modern language models in the Indian context.
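The instance structure described above could be represented roughly as follows. This is a minimal sketch: the field names (`question`, `options`, `answer`, `subject`, `exam_name`, `question_type`), the prompt layout, and the sample question are assumptions made for illustration, and the actual column names in the released file may differ.

```python
from dataclasses import dataclass

LABELS = ("A", "B", "C", "D")


# Hypothetical schema: field names are illustrative, not read from the released file.
@dataclass(frozen=True)
class McqItem:
    question: str                        # question text in Hindi
    options: tuple[str, str, str, str]   # the four answer options, in A-D order
    answer: str                          # correct answer label, one of A-D
    subject: str
    exam_name: str
    question_type: str


def format_prompt(item: McqItem) -> str:
    """Render one item as a plain-text prompt (layout is an assumption)."""
    lines = [item.question]
    lines += [f"{label}. {text}" for label, text in zip(LABELS, item.options)]
    lines.append("Answer:")
    return "\n".join(lines)


def is_correct(item: McqItem, predicted_label: str) -> bool:
    """Compare a predicted label with the gold label, ignoring case and spaces."""
    return predicted_label.strip().upper() == item.answer


# Made-up example for illustration; it is not taken from the dataset.
example = McqItem(
    question="भारत की राजधानी क्या है?",
    options=("मुंबई", "नई दिल्ली", "कोलकाता", "चेन्नई"),
    answer="B",
    subject="Current Affairs",
    exam_name="(hypothetical exam)",
    question_type="Normal MCQ",
)

print(format_prompt(example))
```

Keeping the question and options in Hindi inside the prompt matches the dataset's stated goal of evaluating models without translation.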

Purpose of Dataset

The primary purpose of ParamBench is to provide a rigorous, standardized benchmark for evaluating Large Language Models (LLMs) on India-centric subjects, addressing the lack of culturally grounded and non-English evaluation datasets. It enables comprehensive assessment of subject-specific knowledge across 21 academic disciplines, while also measuring reasoning abilities through diverse question formats such as Assertion-Reason, Match the List, Ordering, Fill in the Blank, and Identify Incorrect Statement. By using graduate-level questions from competitive exams, the dataset evaluates both conceptual understanding and analytical thinking. ParamBench also aims to assess cultural and contextual understanding by focusing on Indian knowledge domains such as History, Philosophy, Law, and Culture, which are often underrepresented in existing benchmarks. By preserving all data in Hindi, it ensures authentic evaluation of multilingual capabilities without reliance on translation. Additionally, the dataset supports fine-grained analysis through subject-wise and question-type performance, helping identify weaknesses in model behavior. As a test-only benchmark with a standardized evaluation setup, ParamBench enables reproducible comparisons and provides insights to guide future research in multilingual, culturally aware AI systems.
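The subject-wise and question-type analysis mentioned above amounts to grouping predictions by a metadata field and computing accuracy per group. The sketch below assumes each result is a dictionary with hypothetical keys `subject`, `question_type`, `answer`, and `prediction`; the sample results are invented for illustration and are not real model scores.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Return {group value: accuracy} for the given metadata key."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[key]
        total[group] += 1
        if rec["prediction"] == rec["answer"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up results for illustration only; key names are assumptions.
results = [
    {"subject": "Law", "question_type": "Normal MCQ", "answer": "A", "prediction": "A"},
    {"subject": "Law", "question_type": "Assertion-Reason", "answer": "C", "prediction": "B"},
    {"subject": "History", "question_type": "Normal MCQ", "answer": "D", "prediction": "D"},
    {"subject": "History", "question_type": "Ordering", "answer": "B", "prediction": "B"},
]

by_subject = accuracy_by(results, "subject")
by_type = accuracy_by(results, "question_type")
print(by_subject)
print(by_type)
```

Reporting both breakdowns side by side makes it easier to see whether a weak score comes from a particular subject or from a particular question format.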

Activity Overview

  • Downloads 0
  • File Size 4.53 MB
  • Views 15

Tags

  • Education
  • Multilingual Evaluation
  • Indian Languages
  • History
  • Hindi
  • Indic Languages
  • Music
  • LLM Benchmark
  • Low Resource Languages
  • Multilingual NLP
  • Language Evaluation
  • Question Answering
  • MCQ Dataset
  • AI Benchmark
  • NLP Benchmark
  • Reasoning Dataset
  • UGC NET Questions
  • Academic QA
  • Indian Knowledge
  • Cultural Knowledge
  • Competitive Exams
  • Subject-wise Evaluation
  • Cross-domain Knowledge
  • Non-English NLP
  • Cultural AI
  • Knowledge Benchmark
  • Reasoning Evaluation
  • Indic NLP
  • Sociology
  • Anthropology
  • Psychology
  • Archaeology
  • Comparative Study of Religions
  • Law
  • Indian Culture
  • Economics
  • Current Affairs
  • Philosophy
  • Political Science
  • Drama and Theatre
  • Rabindra Sangeet
  • Karnatak Music
  • Tribal and Regional Language
  • Percussion Instruments
  • Defence and Strategic Studies
  • Yoga

License Control

Attribution 4.0 International (CC BY 4.0)

Version Control

Version 1 (4.53 MB)
  • Vivekkumar Vasudevbhai Patel · Today
    • ParamBench.parquet
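
Since the dataset ships as a single parquet file, it can be read with pandas once downloaded. The following is a minimal sketch, assuming the file has been saved locally as `ParamBench.parquet`; the helper name `load_parambench` is hypothetical, and the row-count check relies only on the 17,275 figure stated in the description.

```python
from pathlib import Path

EXPECTED_ROWS = 17_275  # test-split size stated in the dataset description


def load_parambench(path="ParamBench.parquet"):
    """Load the parquet file into a pandas DataFrame and sanity-check its size."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"Dataset file not found: {file_path}")
    # Imported lazily; requires pandas plus a parquet engine such as pyarrow.
    import pandas as pd

    frame = pd.read_parquet(file_path)
    if len(frame) != EXPECTED_ROWS:
        print(f"Warning: expected {EXPECTED_ROWS} rows, found {len(frame)}")
    return frame
```

Calling `load_parambench()` and then inspecting `frame.columns` is a reasonable first step, since the column names are not listed on this page.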

Related Datasets

Updated 6 months ago
BharatGen : MHQA Dataset
MHQA: A Mental Health Solution for Healthcare
HealthCare
AI in healthcare
  • Upvoters 0
  • Downloads 274
  • File Size 44.06 MB
  • Views 3,886

BHARATGEN

Updated 6 months ago
BhashaBench-Finance
BhashaBench-Finance (BBF): Benchmarking AI on Indian Financial Knowledge
library:pandas
language:en
modality:text
library:datasets
region:us
library:polars
library:mlcroissant
format:parquet
license:cc-by-4.0
size_categories:10K<n<100K
source_datasets:original
task_categories:multiple-choice
task_categories:question-answering
arxiv:2510.25409
language:hi
  • Upvoters 1
  • Downloads 129
  • File Size 0
  • Views 948

BHARATGEN

Updated 6 months ago
BhashaBench-Krishi
BhashaBench-Krishi (BBK): Benchmarking AI on Indian Agricultural Knowledge
language:hi
arxiv:2510.25409
task_categories:question-answering
task_categories:multiple-choice
source_datasets:original
size_categories:10K<n<100K
license:cc-by-4.0
format:parquet
library:mlcroissant
library:polars
region:us
library:datasets
modality:text
language:en
library:pandas
  • Upvoters 0
  • Downloads 31
  • File Size 0
  • Views 193

BHARATGEN

Updated 6 months ago
BhashaBench-Legal
BhashaBench-Legal (BBL): Benchmarking AI on Indian Legal Knowledge
library:pandas
language:hi
language:en
modality:text
library:datasets
region:us
library:polars
library:mlcroissant
format:parquet
license:cc-by-4.0
size_categories:10K<n<100K
source_datasets:original
task_categories:multiple-choice
task_categories:question-answering
arxiv:2510.25409
  • Upvoters 1
  • Downloads 98
  • File Size 0
  • Views 954

BHARATGEN

Updated 6 months ago
BhashaBench-Ayur
BhashaBench-Ayur (BBA): Pioneering India’s Ayurvedic AI Benchmark
library:pandas
arxiv:2510.25409
task_categories:question-answering
task_categories:multiple-choice
source_datasets:original
size_categories:10K<n<100K
license:cc-by-4.0
format:parquet
library:mlcroissant
library:polars
region:us
library:datasets
modality:text
language:en
language:hi
  • Upvoters 0
  • Downloads 51
  • File Size 0
  • Views 296

BHARATGEN

Related Models

BharatGen - Param 1 Indic-Scale Bilingual Foundation Model
Param1 is a 2.9 billion parameter language model pretrained on English and Hindi, designed for text completion.
Large Language Model
  • Upvoters 4
  • Downloads 708
  • File Size 13.79 GB
  • Views 20,245
Updated 1 month ago

BHARATGEN

Param2-17B-Thinking
BharatGen presents Param-2-17B-MoE-A2.4B, a large-scale Mixture-of-Experts (MoE) language model designed to deliver high model capacity while retaining the inference efficiency of a much smaller dense model. It uses a Hybrid MoE architecture with 17B total parameters, while activating only 2.4B parameters per token.
Mixture of Experts
Multilingual Text
pretrained
  • Upvoters 1
  • Downloads 60
  • File Size 57.29 GB
  • Views 2,021
Updated 3 months ago

BHARATGEN

BharatGen - Param-1-7B-MoE Advancing Multilingual GenAI for India
Param-1-7B-MoE is a multilingual large language model developed under the Param-1 family as part of BharatGen – A Suite of Generative AI Technologies for India. With 7 billion parameters and a Mixture of Experts (MoE) architecture, the model is designed to better understand and generate text across English, Hindi, and 14 additional Indian languages. The model is pretrained from scratch with a strong focus on linguistic diversity, cultural context, and large-scale multilingual representation.
safetensors
region:us
mixtral
  • Upvoters 1
  • Downloads 83
  • File Size 0
  • Views 1,394
Updated 5 months ago

BHARATGEN