Yandex releases Yambda open-source music recommendation dataset

The free Yandex Yambda dataset allows anyone to create their own music recommendation service. (Image source: Yandex)

Yandex has released Yambda, the world's largest open dataset for music recommender systems, containing 4.79 billion anonymized user interactions to help developers create smart music services that play only the songs listeners want to hear.

David Chien, Published 05/29/2025

Yandex has released its open-source Yambda dataset containing information on music listener preferences for use in creating a streaming audio service similar to Spotify with AI-powered playlist personalization.

Streaming services like Spotify, Tidal, and Qobuz use software algorithms or AI models to create playlists based on individual preferences. These services typically do not release their code or models, because the ability to automatically play songs listeners enjoy is treated as a trade secret central to their success.

Yandex gathered the data over ten months: 4.79 billion user interactions with 9.39 million music tracks, taken from its pool of 28 million monthly Yandex Music users. The data includes key feedback from Yandex Music listeners, namely what they chose to listen to as well as their likes and dislikes. All interactions are timestamped for increased precision.
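To make the shape of such data concrete, here is a minimal Python sketch of a timestamped interaction log and one thing timestamps make possible: splitting the log by time so that a recommender is evaluated only on events that happened after its training data. The field names, event labels, and toy records are assumptions made for illustration, not the actual Yambda schema, and the popularity count is a deliberately simple baseline rather than anything Yandex is known to use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    user_id: int
    track_id: int
    timestamp: int  # hypothetical unit, e.g. seconds since the log started
    event: str      # hypothetical labels: "listen", "like", "dislike"


# Toy log invented for this example; not real Yambda data.
LOG = [
    Interaction(1, 101, 10, "listen"),
    Interaction(1, 102, 20, "like"),
    Interaction(2, 101, 30, "listen"),
    Interaction(2, 103, 40, "dislike"),
    Interaction(3, 101, 50, "like"),
    Interaction(3, 102, 60, "listen"),
    Interaction(1, 103, 70, "listen"),
]


def temporal_split(log, cutoff):
    """Split interactions into (before cutoff, at or after cutoff)."""
    train = [x for x in log if x.timestamp < cutoff]
    test = [x for x in log if x.timestamp >= cutoff]
    return train, test


def popular_tracks(train, k):
    """Return the k tracks with the most listens and likes; dislikes are ignored."""
    scores = Counter()
    for x in train:
        if x.event != "dislike":
            scores[x.track_id] += 1
    return [track for track, _ in scores.most_common(k)]


if __name__ == "__main__":
    train, test = temporal_split(LOG, cutoff=55)
    print(len(train), "training events,", len(test), "held-out events")
    print("Most popular tracks in training data:", popular_tracks(train, k=2))
```

A real service would replace the popularity count with a personalized model, but the time-based split would stay the same.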

The dataset can be downloaded in three sizes: five billion events (1 million users), five hundred million events (100,000 users), and fifty million events (10,000 users), with the largest requiring at least 85 GB of storage space. The dataset is stored in Apache Parquet, a column-oriented data file format that is convenient for analysis and research.
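As a quick sanity check on those figures, the short Python sketch below divides each version's nominal event count by its user count. The labels "50M", "500M", and "5B" are shorthand chosen for this example, and the inputs are the rounded numbers quoted above, so the result is only a rough average.

```python
# Nominal (rounded) sizes quoted in the article: label -> (events, users).
# The labels are shorthand for this example, not official file names.
VERSIONS = {
    "50M": (50_000_000, 10_000),
    "500M": (500_000_000, 100_000),
    "5B": (5_000_000_000, 1_000_000),
}


def events_per_user(events, users):
    """Average number of logged events per user."""
    return events / users


per_user = {label: events_per_user(e, u) for label, (e, u) in VERSIONS.items()}

if __name__ == "__main__":
    for label, value in per_user.items():
        print(f"{label}: about {value:,.0f} events per user")
```

On these nominal figures every version works out to the same average number of events per user, which suggests the smaller versions differ in how many users they cover, not in how much history each user has.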


Source(s)

Yambda at HuggingFace, Yandex press release




David Chien, 2025-05-29 (Update: 2025-05-29)
