Bengali to Tamil Translation Benchmark Dataset

Bhashini's Bengali-Tamil Translation Benchmark is a text dataset for evaluating machine translation quality from Bengali to Tamil. It includes document-level information and supports research on multilingual translation systems.

About Dataset

The NTREX_bn_ta_benchmark dataset provides news test references for Machine Translation (MT) evaluation in the Bengali-to-Tamil direction. It is part of a broader collection that supports translation into 128 target languages. The dataset covers the news domain and includes document-level information, so translation quality can be assessed with document context rather than only on isolated sentences. It was submitted by Microsoft and is intended for researchers and developers working on Bengali-to-Tamil translation tasks.
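
As a rough illustration of how a reference-based benchmark with document-level information might be used, the sketch below scores system output against reference translations and averages the scores per document. It makes several assumptions that are not stated in this description: the `(doc_id, hypothesis, reference)` tuple layout is hypothetical, and the metric is a simplified character n-gram F-score written for this example, not the official chrF implementation or any metric prescribed for this dataset.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_char_fscore(hypothesis, reference, max_n=3, beta=2.0):
    """Simplified chrF-style score in [0, 1] (illustrative, not official chrF)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def score_by_document(segments):
    """Average segment scores per document.

    `segments` is an iterable of (doc_id, hypothesis, reference) tuples.
    This layout is an assumption made for the example.
    """
    per_doc = {}
    for doc_id, hyp, ref in segments:
        per_doc.setdefault(doc_id, []).append(simple_char_fscore(hyp, ref))
    return {doc_id: sum(s) / len(s) for doc_id, s in per_doc.items()}
```

Grouping by document identifier is one simple way to use the document-level information; for reported results, an established metric implementation would normally replace the simplified scorer shown here.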