[PYTHON] I tried to analyze my favorite singer (SHISHAMO) using the Spotify API


Since I am studying data analysis, I wanted to analyze something I love, and I arrived at **SHISHAMO**. While looking on Qiita for ways to analyze it, I found an interesting article.

As a result of analyzing the attribute data of 76,000 songs of Spotify, J-Rock was Punk rather than Rock

Apparently, Spotify has attribute data for each song, and as a SHISHAMO fan, I wondered what kind of features SHISHAMO's attribute data has. So I analyzed SHISHAMO using Spotify's attribute data.


I'll leave the detailed explanation of the Spotify API to others and briefly explain the specific steps.

1. Authentication work

import spotipy
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials

# Replace these placeholders with the values issued on Spotify for Developers.
client_id = 'client_id'
client_secret = 'client_secret'
client_credentials_manager = SpotifyClientCredentials(client_id, client_secret)
spotify = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

Issue the client_id and client_secret from Spotify for Developers. I did it with reference to this article.
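If you would rather not write the secrets into the script, here is a minimal sketch that reads them from environment variables instead. The helper `load_credentials` is hypothetical (it is not part of spotipy); the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` follow the convention that spotipy itself looks for.

```python
import os

# Variable names follow spotipy's convention; the helper itself is hypothetical.
ID_VAR = "SPOTIPY_CLIENT_ID"
SECRET_VAR = "SPOTIPY_CLIENT_SECRET"


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) from `env`, or raise if one is missing."""
    missing = [name for name in (ID_VAR, SECRET_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[ID_VAR], env[SECRET_VAR]
```

The returned pair can then be passed to `SpotifyClientCredentials` in place of the hard-coded strings in the authentication snippet.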

2. Data collection

# Placeholder: replace with the artist ID taken from the Spotify web player.
artist_id = 'artist_id'
albums = spotify.artist_albums(artist_id, album_type=None, country=None, limit=20, offset=0)
frames = []
for album in albums['items']:
    album_url = album['external_urls']['spotify']
    album_name = album['name']
    album_tracks = spotify.album_tracks(album_url)['items']
    for track in album_tracks:
        track_name = track['name']
        track_url = track['external_urls']['spotify']
        # audio_features returns a list; keep the first 11 columns (the audio attributes).
        features = spotify.audio_features(track_url)[0]
        tmp = pd.DataFrame(features, index=['1',]).iloc[:, :11]
        tmp['album_name'] = album_name
        tmp['track_name'] = track_name
        frames.append(tmp)
# DataFrame.append was removed in pandas 2.0, so collect the rows and concatenate once.
df = pd.concat(frames) if frames else pd.DataFrame()

The artist_id can be obtained from Spotify's web player, as shown below. By the way, the background picture was drawn by the vocalist; it is very good and very cute.

[Screenshot 2020-07-14 18.36.52.png]

This gave me the following data.

[Screenshot 2020-07-14 18.41.07.png]
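For reference, here is a small sketch of pulling the ID out of a web player URL. It assumes the URL has the shape `https://open.spotify.com/artist/<id>`; the function name `artist_id_from_url` is hypothetical, and `EXAMPLE_ID` is a made-up placeholder, not SHISHAMO's real ID.

```python
from urllib.parse import urlparse


def artist_id_from_url(url):
    """Extract the ID from a URL shaped like https://open.spotify.com/artist/<id>."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "artist" not in parts or parts.index("artist") + 1 >= len(parts):
        raise ValueError("not an artist URL: " + url)
    # The ID is the path segment right after "artist"; query strings are ignored.
    return parts[parts.index("artist") + 1]


# EXAMPLE_ID is a placeholder for illustration only.
url = "https://open.spotify.com/artist/EXAMPLE_ID?si=abc"
artist_id = artist_id_from_url(url)
```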

Very cute song titles are lined up, and Spotify provides 11 variables as features that describe each song. These are Spotify's attribute data for each song. For more information, please refer to this article. They include the song's tempo, loudness, instrumentalness, and so on.
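To make the 11 variables explicit, here is a sketch that selects them by name rather than by position. It assumes the first 11 columns of the audio-features response are the fields listed below; the `sample` values are made up for illustration and are not real SHISHAMO data.

```python
# The 11 audio attributes, assumed to match the first 11 columns of the response.
FEATURE_COLUMNS = [
    "danceability", "energy", "key", "loudness", "mode", "speechiness",
    "acousticness", "instrumentalness", "liveness", "valence", "tempo",
]


def pick_features(audio_features):
    """Keep only the 11 attribute fields from one audio-features dict."""
    return {name: audio_features[name] for name in FEATURE_COLUMNS}


# Made-up example record (not real data); the extra keys are dropped.
sample = {
    "danceability": 0.52, "energy": 0.91, "key": 4, "loudness": -3.8, "mode": 1,
    "speechiness": 0.05, "acousticness": 0.01, "instrumentalness": 0.0,
    "liveness": 0.12, "valence": 0.74, "tempo": 178.0,
    "type": "audio_features", "id": "EXAMPLE_ID", "duration_ms": 215000,
    "time_signature": 4,
}
features = pick_features(sample)
```

With a DataFrame, the same idea would be `tmp[FEATURE_COLUMNS]` instead of `.iloc[:, :11]`, which keeps working even if the column order changes.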

3. Data analysis

This time, I analyzed the data with dimensionality reduction and visualization using UMAP. For details of UMAP, please refer to this article. You can think of it as an algorithm that, like PCA, projects data into a low-dimensional space suitable for clustering (I don't understand the details either). Here is the result of plotting the songs from all of SHISHAMO's albums 1 to 6 in two dimensions using UMAP.

[Screenshot 2020-07-15 10.46.22.png]

It looks like there are 4 groups, and they are well clustered. For this result, I looked at the following three points:

**1. Are there any characteristics of each album?**

**2. Are there any characteristics of popular songs?**

**3. Do my favorite songs have any characteristics?**
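The projection step could look roughly like the sketch below. It assumes the `umap-learn` package and assumes the features are standardized first; the article does not say whether any scaling was applied, so treat `standardize` and `embed_2d` as hypothetical helpers rather than the code actually used.

```python
import statistics


def standardize(rows):
    """Scale every column to zero mean and unit variance (constant columns become 0)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


def embed_2d(rows, seed=42):
    """Project the rows to 2-D with UMAP. Requires `pip install umap-learn`."""
    import umap  # imported lazily so standardize() works without the package

    reducer = umap.UMAP(n_components=2, random_state=seed)
    return reducer.fit_transform(standardize(rows))
```

Calling `embed_2d` on the 11 feature columns of `df` (one row per song) would give one (x, y) pair per song, which can then be drawn as a scatter plot.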

1. Are there any characteristics for each album?

Each graph highlights the songs of one album with red circles.

[matome.jpeg]

I found that none of the songs on the 1st album belong to the group at the bottom left of the graph, so you can feel SHISHAMO's growth as the range of their songs widened. You can also see that each album contains a well-balanced mix of songs from each group. Nobody wants an album with only ballads, right?

2. Are there any characteristics of popular songs?

I analyzed whether popular songs have any characteristics. It is hard to judge objectively which SHISHAMO songs are popular, so I chose four songs that seem well known, at my own discretion: **"Ashitamo" ("Tomorrow"), "Kimi to Natsu Fes", "Koisuru" ("Love"), and "I have a girlfriend"**. I think fans will agree. If you haven't heard them, please give them a listen.

[popular (1).jpeg]

They seem to gather in the upper right; is that because many of them are up-tempo songs? I was surprised that the two famous songs **"Ashitamo"** and **"Kimi to Natsu Fes"** were quite close. **"Koisuru"** is an up-tempo song that is often played at the end of live performances, so I had expected it to be in the same cluster as **"Kimi to Natsu Fes"** and was surprised that it was not.

3. Do my favorite songs have any characteristics?

I analyzed whether my favorite songs have any characteristics. I love all the songs and can't rank them, so I chose one song from each album: **"Midnight Radio", "Flowers", "Girls in the courtyard", "Ashitamo", "My Dawn", and "I'll forget you"**. I tried to pick six songs that cover a wide range.

[myselec.jpeg]

These also seem to gather in the upper right, while **"Flowers"** is completely separated. **"My Dawn"** has a different, darker atmosphere than **"Ashitamo"**, yet it is also in the upper right. I thought they weren't very similar, but they may be similar songs after all.


I was surprised that both the hit songs and my favorite songs turned out to be unexpectedly biased. This time I analyzed my favorite singer, but I thought it would be interesting to run the same analysis on an artist for whom I have one favorite song but don't know the rest of the catalog, and use it to search for songs to recommend to myself. Even with the same artist, there are songs I get hooked on and songs I don't, so it would be fun to be able to classify them.
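One simple way to try that idea is a nearest-neighbor search in feature space: rank an artist's other songs by their distance to the one favorite. The sketch below is only an illustration; `nearest_tracks` is a hypothetical helper, the track names are placeholders, and the feature vectors are made up and assumed to be already scaled to comparable ranges.

```python
import math


def nearest_tracks(favorite, candidates, k=3):
    """Return the names of the k candidates closest to `favorite` (Euclidean distance)."""
    ranked = sorted(candidates, key=lambda name: math.dist(favorite, candidates[name]))
    return ranked[:k]


# Made-up, already-scaled feature vectors (e.g. energy, valence, tempo); not real data.
favorite = [0.8, 0.7, 0.9]
candidates = {
    "track_a": [0.7, 0.8, 0.85],
    "track_b": [0.1, 0.2, 0.3],
    "track_c": [0.9, 0.6, 0.9],
    "track_d": [0.4, 0.5, 0.5],
}
recommended = nearest_tracks(favorite, candidates, k=2)
print(recommended)
```

Distances in the original feature space are used here rather than in the 2-D plot, since a projection can distort how far apart songs really are.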


I also looked at the contribution rate (explained variance ratio) of each principal component in PCA.

![Screenshot 2020-07-15 11.42.17.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/666603/2752c565-83b6-3d41-39e0-483b249d1a16.png)

You can see that the first principal component is mostly determined by the tempo, and the second principal component is mostly determined by the key. The two UMAP axes are not exactly the same as these, but it's likely that they show something similar. I was a little disappointed to see this at the end of the analysis...
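One possible reading of this result, assuming the features were fed to PCA without standardization (the article does not say), is that tempo simply has by far the largest variance. The sketch below computes the first principal component with power iteration on made-up rows of [tempo, key, energy]; the helper names and the numbers are for illustration only.

```python
import statistics


def covariance_matrix(rows):
    """Population covariance matrix of the columns of `rows`."""
    n = len(rows)
    means = [statistics.fmean(c) for c in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    d = len(means)
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration: unit eigenvector and eigenvalue of the largest eigenvalue."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


# Made-up rows of [tempo, key, energy]; not real Spotify data.
rows = [
    [178.0, 4, 0.91],
    [92.0, 9, 0.55],
    [150.0, 0, 0.83],
    [120.0, 7, 0.62],
    [200.0, 2, 0.95],
]
cov = covariance_matrix(rows)
loadings, eigenvalue = first_component(cov)
# Explained variance ratio of PC1: its eigenvalue over the total variance (the trace).
ratio = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(loadings, ratio)
```

In this toy data the first component points almost entirely along tempo, because tempo ranges over tens of BPM while energy stays between 0 and 1. Standardizing the columns before PCA would remove that effect.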
