AudioSet YouTube Metadata for Sounds

Over 2 million labeled sound clips, but actual audio must be extracted from YouTube.

About Dataset

AudioSet is a large-scale ontology and dataset of labeled audio events developed by Google Research. It consists of over 2 million audio clips drawn from YouTube videos, each annotated with labels from a hierarchical taxonomy of sound events. The clips are distributed as metadata referencing YouTube videos rather than as audio files. The dataset covers a wide variety of everyday sounds, music, speech, and environmental noise, giving broad coverage of real-world audio content.
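Because the audio itself has to be fetched from YouTube, each record works as a pointer to a segment of a video plus its labels. The sketch below shows one way such records might be parsed. It assumes the column layout of the public AudioSet segment CSVs (YTID, start_seconds, end_seconds, positive_labels, with `#` comment lines at the top); the files in this particular upload may be organized differently. The sample row, the video ID in it, and the function name `parse_segments` are illustrative and not part of the dataset.

```python
import csv

# Illustrative sample in the assumed layout; the video ID is made up.
SAMPLE = '''# YTID, start_seconds, end_seconds, positive_labels
abc123DEF45, 30.000, 40.000, "/m/09x0r,/t/dd00088"
'''


def parse_segments(text):
    """Parse segment metadata into a list of dicts (assumed column layout)."""
    # Drop blank lines and '#' comment/header lines before CSV parsing.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    # skipinitialspace lets the quoted label field after ", " parse correctly.
    reader = csv.reader(lines, skipinitialspace=True)
    segments = []
    for ytid, start, end, labels in reader:
        segments.append({
            "ytid": ytid,
            "start": float(start),
            "end": float(end),
            # Labels are ontology IDs, comma-separated inside one quoted field.
            "labels": labels.split(","),
            # Where the source video would be looked up.
            "url": f"https://www.youtube.com/watch?v={ytid}",
        })
    return segments


segments = parse_segments(SAMPLE)
print(segments[0]["ytid"], segments[0]["start"], segments[0]["end"])
print(segments[0]["labels"])
```

The label strings are ontology identifiers, so turning them into human-readable class names would need a separate lookup against the ontology or class-label file, if one is included.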

Purpose of Dataset

AudioSet is used for training and evaluating audio event detection and sound classification models. It supports research in acoustic scene understanding, multimedia analysis, and multimodal learning. Researchers use AudioSet to build systems that recognize complex sound patterns and understand audio content in diverse real-world scenarios.