
Yandex releases Yambda open-source music recommendation dataset

The free Yandex Yambda dataset allows anyone to create their own music recommendation service. (Image source: Yandex)
Yandex has released Yambda, the world's largest open dataset for music recommender systems, containing 4.79 billion anonymized user interactions to help developers create smart music services that play only the songs listeners want to hear.

Yandex has released its open-source Yambda dataset, which contains information on music listener preferences and can be used to build a streaming audio service similar to Spotify with AI-powered playlist personalization.

Streaming services like Spotify, Tidal, and Qobuz use software algorithms or AI models to create playlists based on individual preferences. These services typically do not release their code or models because the ability to automatically play songs listeners enjoy is considered a trade secret central to their success.

Yandex gathered the data over ten months: 4.79 billion user interactions with 9.39 million music tracks, drawn from its pool of 28 million monthly Yandex Music users. This includes key feedback from Yandex Music listeners - what they chose to listen to as well as their likes and dislikes. All interactions are timestamped for increased precision.

The dataset can be downloaded in three sizes: five billion events (1 million users), five hundred million events (100,000 users), and fifty million events (10,000 users), with the largest requiring at least 85 GB of storage space. The dataset is stored in Apache Parquet, a column-oriented data file format that is convenient for analysis and research.
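
To illustrate why a column-oriented layout suits this kind of analysis, here is a minimal Python sketch using only the standard library. It does not read Parquet files or use the real Yambda schema; the field names (`user_id`, `track_id`, `event`, `timestamp`), the sample rows, and the helper `to_columns` are all hypothetical. It only shows the general idea that an aggregate over one field needs to touch just that column.

```python
from collections import Counter

# Hypothetical interaction events. The field names and values are
# illustrative assumptions, not the actual Yambda schema or data.
rows = [
    {"user_id": 1, "track_id": 10, "event": "listen", "timestamp": 100},
    {"user_id": 1, "track_id": 11, "event": "like", "timestamp": 160},
    {"user_id": 2, "track_id": 10, "event": "listen", "timestamp": 220},
    {"user_id": 2, "track_id": 12, "event": "dislike", "timestamp": 300},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per field (column layout)."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Counting event types reads only the "event" column; the other
# columns are never touched, which is the benefit a columnar format
# offers for analytical queries over very large tables.
event_counts = Counter(columns["event"])
print(event_counts)
```

In practice, a Parquet reader would apply the same principle on disk by loading only the requested columns instead of scanning every full row.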


David Chien, 2025-05-29 (Update: 2025-05-29)