One of Spotify's best-known features is the ‘Discover Weekly’ playlist, which magically curates 30 recommended songs every week for each of its 170 million customers. Within just 10 weeks of launching, over 1 billion songs were streamed from this automated playlist. So how does Spotify help you find music that matches your taste so closely? As a curious PM, I took a peek under the hood of the Spotify recommendation system.
The Recommendation Model
It all starts with data collection. Spotify looks at all the music you listen to, the artists you search for, the songs you skip within 30 seconds, and the types of songs you put in playlists. It then combines several recommendation strategies to come up with the songs it suggests.
Collaborative Filtering
Collaborative filtering is the most popular method, and it is employed by most recommendation systems, including those of Netflix and Facebook. It uses historical data to predict user preferences. If two users listen to a common set of songs, their musical tastes can be treated as similar. Conversely, if two songs are listened to by a similar group of users, the songs themselves can be treated as similar.
So if another user has created a playlist of 20 songs and it turns out that you have heard 19 of them, collaborative filtering tells Spotify to recommend you the one song you haven’t heard. This method works beautifully once enough usage data has been collected. However, one of its biggest flaws is that brand-new songs cannot be recommended, because there is not yet enough listening data to analyze them.
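To make the idea concrete, here is a minimal sketch of user-based collaborative filtering. It is not Spotify's implementation: the `listens` data, the function names, and the choice of cosine similarity over binary listen sets are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical listening history: user -> set of songs they have played.
listens = {
    "ana": {"song_a", "song_b", "song_c"},
    "ben": {"song_a", "song_b", "song_c", "song_d"},
    "cat": {"song_x", "song_y"},
}


def user_similarity(a, b):
    """Cosine similarity between two users' binary listen sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(target, listens, top_n=3):
    """Rank songs the target has not heard by the similarity of their listeners."""
    heard = listens[target]
    scores = defaultdict(float)
    for other, songs in listens.items():
        if other == target:
            continue
        sim = user_similarity(heard, songs)
        if sim == 0.0:
            continue  # users with no overlap contribute nothing
        for song in songs - heard:
            scores[song] += sim
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, _ in ranked[:top_n]]


print(recommend("ana", listens))  # ['song_d']
```

Here "ana" and "ben" overlap on three songs, so the one song only "ben" has played gets recommended to "ana". A song that nobody has played yet never appears in `scores`, which is the cold-start flaw described above.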
Natural Language Processing
To solve this problem with collaborative filtering, Spotify acquired Echo Nest in 2014, a platform that synthesizes billions of data points and transforms them into musical understanding. There is a lot of information that can be associated with music: lyrics, tags, artists, interviews, reviews from the web, etc. Spotify crawls the web and looks for any information that can be associated with a song. It buckets reviews, ratings and the kind of language used to describe the song into “cultural vectors”.
These vectors are nothing but huge tables of words, each with an associated score. The scores are then used to determine which songs are similar to each other. However, a new song that is not yet mentioned anywhere on the internet will have low scores initially and will take a long time to pop up in your recommendations.
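A rough sketch of how such word-score tables could be compared is shown below. The terms, weights, and function names are hypothetical, and cosine similarity is only one plausible way to compare the vectors; the article does not say which measure Spotify actually uses.

```python
from math import sqrt

# Hypothetical "cultural vectors": descriptive term -> score, one table per song.
cultural_vectors = {
    "song_a": {"dreamy": 0.8, "synth": 0.6},
    "song_b": {"dreamy": 0.6, "synth": 0.8},
    "song_c": {"aggressive": 0.9, "guitar": 0.4},
    "new_song": {},  # not yet written about anywhere, so no terms
}


def cosine(u, v):
    """Cosine similarity between two sparse term -> score vectors."""
    dot = sum(score * v[term] for term, score in u.items() if term in v)
    norm_u = sqrt(sum(score * score for score in u.values()))
    norm_v = sqrt(sum(score * score for score in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(song, vectors):
    """Return the name of the other song whose vector is closest to `song`."""
    candidates = [
        (cosine(vectors[song], vec), name)
        for name, vec in vectors.items()
        if name != song
    ]
    return max(candidates)[1]


print(most_similar("song_a", cultural_vectors))  # song_b
```

Note that `new_song` scores 0.0 against every other song, which mirrors the limitation above: until people start describing a song online, there is nothing to compare.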
To include completely new songs in your recommendations, Spotify analyzes the toughest piece of information: the raw audio signal. A convolutional neural network is used to analyze the audio files and determine everything from loudness to tempo to beats. A list of output data can be found here. Understanding a song's key characteristics allows Spotify to find fundamental similarities between songs. It also helps Spotify understand a user’s taste profile precisely by creating microgenres for these songs like synthpop (synthesizer as the
Bringing it all together
All these recommendation models connect to the entire Spotify ecosystem that ultimately