Crowdsourced speech dataset covering multiple languages for ASR research.
Mozilla Common Voice is a large, crowdsourced speech dataset containing voice recordings contributed by volunteers from around the world. It covers dozens of languages and accents, with each recording paired with a validated text transcription. The dataset is designed to be open, inclusive, and representative of diverse speakers and linguistic communities.
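As a rough illustration of how each recording is paired with its transcription, the sketch below reads a small tab-separated manifest and returns one triple per clip. The in-memory sample, the file names, and the column names `path`, `sentence`, and `accent` are assumptions made for this example; check the manifest of the release you download for its actual layout.

```python
import csv
import io

# Hypothetical excerpt of a Common Voice-style TSV manifest. The column
# names "path", "sentence", and "accent" are assumptions for illustration.
SAMPLE_TSV = (
    "path\tsentence\taccent\n"
    "clip_0001.mp3\tThe quick brown fox.\tindian\n"
    "clip_0002.mp3\tOpen the window please.\t\n"
)


def read_manifest(handle):
    """Return (clip path, transcription, accent) triples from a TSV manifest."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    # An empty accent field is reported as None rather than an empty string.
    return [
        (row["path"], row["sentence"], row.get("accent") or None)
        for row in reader
    ]


pairs = read_manifest(io.StringIO(SAMPLE_TSV))
for path, sentence, accent in pairs:
    print(path, "->", sentence, f"(accent: {accent or 'unspecified'})")
```

To read a real manifest, pass an open file handle (opened with `encoding="utf-8"` and `newline=""`) in place of the `io.StringIO` object.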
Common Voice is widely used for training and evaluating automatic speech recognition (ASR) systems. Its multilingual and accent-diverse nature makes it valuable for building inclusive speech technologies. Researchers use it to improve speech recognition accuracy, reduce bias, and develop voice-enabled applications across languages and regions.
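Recognition accuracy on a dataset like this is commonly reported as word error rate (WER): the word-level edit distance between the reference transcription and the system output, divided by the number of reference words. The function below is a minimal sketch of that metric, not an official scoring tool; the name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for the example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("open the window please", "open a window"))
```

Computing this score separately for each language or accent group is one simple way to see whether a system performs unevenly across speakers.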
CC0 1.0 Public Domain