ParamBench

ParamBench is a graduate-level benchmark dataset for evaluating Large Language Models (LLMs) on India-centric subjects. It contains 17,275 Hindi MCQs across 21 disciplines from competitive exams, enabling assessment of subject knowledge, cultural understanding, and reasoning abilities. Paper: https://arxiv.org/pdf/2508.16185

About Dataset

ParamBench is a large-scale, graduate-level benchmark dataset designed to evaluate the performance of Large Language Models (LLMs) on India-centric subjects and culturally grounded knowledge. The dataset consists of 17,275 multiple-choice questions (MCQs) in Hindi, collected from Indian competitive examination papers and their corresponding answer keys. It spans 21 academic subjects, including Anthropology, Sociology, History, Law, Political Science, Economics, Philosophy, and Indian Culture, providing broad coverage of the humanities, social sciences, and domain-specific knowledge.

Each data instance represents a single MCQ and includes a question in Hindi, four answer options (A-D), the correct answer label, and metadata such as subject, exam name, and question type. The dataset incorporates multiple question formats (Normal MCQ, Assertion-Reason, Match the List, Ordering, Fill in the Blank, and Identify Incorrect Statement), enabling fine-grained evaluation of reasoning and analytical capabilities. All questions are preserved in Hindi, ensuring authentic evaluation of linguistic and cultural understanding without reliance on translation.

The dataset is released as a single test split (17,275 samples) and is intended exclusively for evaluation, enabling standardized and reproducible benchmarking of LLMs across subjects and question types. Overall, ParamBench provides a comprehensive and challenging evaluation suite for measuring the subject-wise knowledge, cultural awareness, and reasoning ability of modern language models in the Indian context.
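
The record layout described above can be sketched as a small Python dataclass. This is only an illustration: the field names are assumptions (the metadata on this page names only the primary key, `Unique_question_id`), the actual Parquet column names may differ, and the sample values are placeholders rather than real dataset rows.

```python
from dataclasses import dataclass

VALID_LABELS = ("A", "B", "C", "D")


@dataclass(frozen=True)
class MCQRecord:
    """One ParamBench-style item; all field names are hypothetical."""
    unique_question_id: str
    question: str        # question text in Hindi
    options: dict        # maps "A".."D" to the option text
    answer: str          # correct label, one of A-D
    subject: str
    exam_name: str
    question_type: str

    def __post_init__(self):
        # Reject records that do not have exactly four options A-D.
        if tuple(sorted(self.options)) != VALID_LABELS:
            raise ValueError("expected exactly the options A, B, C and D")
        if self.answer not in VALID_LABELS:
            raise ValueError(f"answer must be one of {VALID_LABELS}")


# Placeholder values only; this is not a real row from the dataset.
example = MCQRecord(
    unique_question_id="demo-0001",
    question="<प्रश्न का पाठ>",
    options={"A": "<विकल्प 1>", "B": "<विकल्प 2>",
             "C": "<विकल्प 3>", "D": "<विकल्प 4>"},
    answer="B",
    subject="History",
    exam_name="<exam name>",
    question_type="Normal MCQ",
)
```

Validating the answer label and option set at construction time is a design choice of this sketch; it makes malformed rows fail early instead of silently skewing accuracy figures.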

Purpose of Dataset

The primary purpose of ParamBench is to provide a rigorous, standardized benchmark for evaluating Large Language Models (LLMs) on India-centric subjects, addressing the lack of culturally grounded and non-English evaluation datasets. It enables comprehensive assessment of subject-specific knowledge across 21 academic disciplines, while also measuring reasoning abilities through diverse question formats such as Assertion-Reason, Match the List, Ordering, Fill in the Blank, and Identify Incorrect Statement. By using graduate-level questions from competitive exams, the dataset evaluates both conceptual understanding and analytical thinking. ParamBench also aims to assess cultural and contextual understanding by focusing on Indian knowledge domains such as history, philosophy, law, and culture, which are often underrepresented in existing benchmarks. By preserving all data in Hindi, it ensures authentic evaluation of multilingual capabilities without reliance on translation. Additionally, the dataset supports fine-grained analysis through subject-wise and question-type performance, helping identify weaknesses in model behavior. As a test-only benchmark with a standardized evaluation setup, ParamBench enables reproducible comparisons and provides insights to guide future research in multilingual, culturally aware AI systems.
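
The subject-wise and question-type analysis mentioned above amounts to grouping predictions by a metadata field and computing accuracy per group. A minimal sketch using only the standard library follows; the keys `answer`, `subject`, and `question_type` are assumed names, the helper `accuracy_by` is hypothetical, and the toy records are invented for illustration rather than taken from the dataset.

```python
from collections import defaultdict


def accuracy_by(records, predictions, key):
    """Return {group: accuracy}, where groups come from records[i][key].

    `records` is a list of dicts holding an "answer" label plus metadata;
    `predictions` is a parallel list of predicted labels (A-D).
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        group = record[key]
        total[group] += 1
        # Compare labels case-insensitively, ignoring stray whitespace.
        correct[group] += int(predicted.strip().upper() == record["answer"])
    return {group: correct[group] / total[group] for group in total}


# Toy records with hypothetical column names; not real dataset rows.
toy_records = [
    {"answer": "A", "subject": "History", "question_type": "Normal MCQ"},
    {"answer": "C", "subject": "History", "question_type": "Assertion-Reason"},
    {"answer": "B", "subject": "Law", "question_type": "Normal MCQ"},
    {"answer": "D", "subject": "Law", "question_type": "Ordering"},
]
toy_predictions = ["A", "B", "b", "D"]

by_subject = accuracy_by(toy_records, toy_predictions, "subject")
by_type = accuracy_by(toy_records, toy_predictions, "question_type")
```

On the toy data, `by_subject` is `{"History": 0.5, "Law": 1.0}`. The same function serves both breakdowns because only the grouping key changes.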

Dataset Metadata

License

Attribution 4.0 International (CC BY 4.0)

Geographical coverage

India

Sector

Sector Agnostic

Author

vivekp

Source Organisation

BharatGen

Uploaded by

Vivekkumar Vasudevbhai Patel

Dataset type

Structured

Frequency

Static

Time Granularity

Static

Year range

01/01/2012 - 01/01/2018

Date & Time

16/04/26 07:57:51

Visibility

Open

Primary Key / Indicator

Unique_question_id

Hosted / Redirected

Redirected

Data Type

Secondary

If Redirection which source

https://huggingface.co/datasets/bharatgenai/parambench

Data Collection Method

Data was collected from official UGC-NET (NTA) language examination papers and answer keys across multiple years and sessions. Machine-readable PDFs were directly parsed, while non-selectable documents were processed using OCR techniques. The extracted content was then cleaned, normalized, and structured into a standardized multiple-choice question (MCQ) format, preserving the original language scripts. Additional annotations such as question type and metadata were added to support fine-grained evaluation.
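
The cleaning and structuring step described above is not specified in detail on this page. As a rough sketch of what such a step might look like, the snippet below applies Unicode NFC normalization and whitespace collapsing to extracted text, then splits an option line into its label and text. Both the normalization choices and the option-marker formats (`(A)`, `A)`, `A.`) are assumptions, and the helper names are hypothetical; this is not the authors' actual pipeline.

```python
import re
import unicodedata

# Leading option marker such as "(A)", "A)" or "A." (assumed formats).
OPTION_MARKER = re.compile(r"^\(?([A-D])[\).]\s*")


def normalize_text(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def parse_option(line):
    """Split an extracted option line into (label, text); None if no marker."""
    line = normalize_text(line)
    match = OPTION_MARKER.match(line)
    if match is None:
        return None
    return match.group(1), line[match.end():]
```

For example, `parse_option("(B)   placeholder text")` returns `("B", "placeholder text")`, while a line without a recognised marker returns `None` so that it can be flagged for manual review.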

Activity Overview

0
0
4.53 MB
15

Version Control

Version 1 (4.53 MB)

Vivekkumar Vasudevbhai Patel · Today
- ParamBench.parquet
