
musdb18hq

MUSDB18-HQ Dataset for Music Source Separation

About Dataset

The MUSDB18-HQ dataset is the uncompressed version of the standard MUSDB18 dataset, intended for training and evaluating automated music source separation algorithms. It consists of 150 full-length multi-track songs (100 training tracks and 50 test tracks) stored as 44.1 kHz WAV files. Every track contains separate, time-synchronized stems for four targets (vocals, drums, bass, and other, where "other" collects all remaining accompaniment), along with the full-mix audio track.
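As a rough illustration of the per-track layout described above, the following sketch reads the WAV headers in one track folder and checks that the stems and the mix share the same sample rate, channel count, and length. The file names (`mixture.wav`, `vocals.wav`, `drums.wav`, `bass.wav`, `other.wav`) and the one-folder-per-track layout are assumptions about how this copy of the dataset is organized, and the helper names are hypothetical; adjust them to match the files you actually see.

```python
import wave
from pathlib import Path

# Assumed file names: one WAV per stem plus the full mix in each track folder.
STEMS = ("vocals", "drums", "bass", "other")
MIXTURE = "mixture"


def describe_track(track_dir):
    """Return {name: (sample_rate, channels, n_frames)} for the mix and each stem."""
    info = {}
    for name in (MIXTURE, *STEMS):
        path = Path(track_dir) / f"{name}.wav"
        # Only the header is inspected here; no audio samples are decoded.
        with wave.open(str(path), "rb") as wav:
            info[name] = (wav.getframerate(), wav.getnchannels(), wav.getnframes())
    return info


def is_time_aligned(info):
    """True if every file has the same sample rate, channel count, and frame count."""
    return len(set(info.values())) == 1
```

For example, `is_time_aligned(describe_track("train/<track name>"))` should return `True` for a well-formed track, where the path is a placeholder for a real track folder. The sketch uses only the standard library so it can serve as a quick sanity check before loading audio with a heavier toolkit.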

Purpose of Dataset

The purpose of this dataset is to train and validate deep learning architectures (such as U-Nets and RoFormer-based transformers) for vocal and multi-instrument stem separation. Because it avoids the audio compression artifacts present in the standard MUSDB18 release, it is well suited as a benchmark for high-fidelity source separation, stem extraction systems, and digital signal processing pipelines.