[PYTHON] Get information on all of Arashi's songs with the Spotify API and examine the indicators

2019/11/21: Partially revised in response to feedback.

Arashi began distributing all of their singles on subscription services to mark the 20th anniversary of their debut, so I will use the Spotify API to get information about Arashi's songs.

I referred to this article: [I tried to get music information and artist information with Spotify API](https://qiita.com/nochifuchi/items/29ac2664fc174a56c4b4#api%E3%82%AD%E3%83%BC%E3%81%AE%E5%8F%96%E5%BE%97). I covered how to obtain the Spotify API credentials in an earlier post, "Get a list of songs in Python by specifying an artist with Spotify API", so I will skip that this time.

What information can I get with the Spotify API?

If you retrieve track information with the Spotify API, you will find that it includes a variety of indicators. See "Get Audio Features for Several Tracks" in the Spotify API reference.

These indicators are described and examined in detail in the following article, which I used as a reference: [The story that the information embedded in Spotify songs is not good](https://note.mu/hkrrr_jp/n/n9925dce37cba)

Authentication

Use the Client ID, Client Secret, and artist ID you obtained to retrieve the information.

First, install the Spotify API library spotipy.

pip install spotipy

Next, authenticate. Again, this uses Spotipy, a Python library for the Spotify API. This time artist_id is set to Arashi's ID.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd

client_id = 'Obtained Client ID'
client_secret = 'Obtained Client Secret'
artist_id = 'Artist ID you want to get'

# Authenticate with the Client ID and Client Secret
client_credentials_manager = SpotifyClientCredentials(client_id, client_secret)
spotify = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
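
Rather than pasting the Client ID and Client Secret into the script, they could be read from environment variables. The following is a minimal sketch, assuming the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` (the names spotipy itself looks for when `SpotifyClientCredentials()` is called without arguments). The helper `load_credentials` is a hypothetical name introduced here for illustration and is not part of spotipy.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    load_credentials is a hypothetical helper, not part of spotipy.
    """
    env = os.environ if env is None else env
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")
    # Report every missing variable at once instead of failing on the first
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can be passed to `SpotifyClientCredentials(client_id, client_secret)` in the same way as the hard-coded strings above.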

Get album information and track information

After getting the album IDs, get the track IDs for each album and then the audio features for each track, storing everything in a DataFrame. In Arashi's case only singles are distributed, but some singles are double A-sides, so one album can contain two songs. That is why the code loops over the tracks of every album.

# Get album information (at most 50 albums per request).
results = spotify.artist_albums(artist_id, album_type='single', country='JP', limit=50, offset=0)
# Get the next page of albums with the offset option.
results2 = spotify.artist_albums(artist_id, album_type='single', country='JP', limit=50, offset=50)

# Collect [name, id] for every album on both pages
artist_albums = []
for album in results['items'] + results2['items']:
    artist_albums.append([album['name'], album['id']])


# Get track information from each album
song_info_track = []
for artist_album in artist_albums:
    album_id = artist_album[1]
    song_info = spotify.album_tracks(album_id, limit=50, offset=0)

    # Some singles are double A-sides, so go through every track
    # (slicing by len(song_info) would cut the list at the number of dict keys)
    for song_info_detail in song_info['items']:
        song_track_id = song_info_detail['id']
        song_title = song_info_detail['name']
        # Get the audio features for the track (returned as a one-element list)
        result = spotify.audio_features(song_track_id)

        # For some reason the features could not be retrieved for some track IDs,
        # so only keep the tracks that returned a result.
        if result[0] is not None:
            # Add the title to the audio-features dictionary
            result[0]['title'] = song_title
            song_info_track.append(result[0])

# pd.json_normalize replaces the deprecated pd.io.json.json_normalize (pandas 1.0 and later)
df = pd.json_normalize(song_info_track)
df = df.set_index('title')

The result contains more than 50 songs, so I will omit the output here. I will write up the details on another blog someday.
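
The two `artist_albums` calls with `offset=0` and `offset=50` can be generalized into a loop that keeps advancing the offset until a page comes back short. This is only a sketch under stated assumptions: `fetch_all_items` and `fake_artist_albums` are hypothetical names, and the fake function stands in for the API so the example runs without credentials.

```python
def fetch_all_items(fetch_page, limit=50):
    """Collect the 'items' of every page, advancing offset until a page is short.

    fetch_page is any callable taking (limit, offset) and returning a dict
    with an 'items' list, like the result of spotify.artist_albums.
    """
    items = []
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        batch = page['items']
        items.extend(batch)
        if len(batch) < limit:
            # A short (or empty) page means there is nothing left to fetch
            return items
        offset += limit


# Stand-in for the API: pretends the artist has 120 albums.
_FAKE_ALBUMS = [{'name': f'Album {i}', 'id': f'id{i}'} for i in range(120)]


def fake_artist_albums(limit, offset):
    return {'items': _FAKE_ALBUMS[offset:offset + limit]}


albums = fetch_all_items(fake_artist_albums)
print(len(albums))  # 120
```

With the real client, `fetch_page` could be something like `lambda limit, offset: spotify.artist_albums(artist_id, album_type='single', country='JP', limit=limit, offset=offset)`.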

The parts I struggled with are as follows.

**・Cannot get all songs**

Due to the Spotify API specification, a single request returns at most 50 albums. At first I gave up on the early songs and only retrieved the most recent ones, from Turning Up to PIKA☆NCHI (as of November 19, 2019). 2019/11/21: I now retrieve all songs by making a second request with offset.

**・Some songs cannot be retrieved for some reason**

When I took the track IDs from an album and requested the audio features, for some reason nothing was returned for some of those track IDs. It worked when I tried again on another day, but to be safe I store a song in the DataFrame only when its features were actually returned.
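
The code above calls `audio_features` once per track. The same endpoint also accepts a list of track IDs (up to 100 per request according to the Spotify API reference) and returns one entry per ID in the order requested, with `None` where no features exist. The following is a rough sketch of batching the requests and skipping the missing entries. `chunked`, `collect_features`, and `fake_audio_features` are hypothetical names, and the fake function replaces the real API call so the example is runnable on its own.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq with at most `size` elements."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def collect_features(tracks, fetch_features, batch_size=100):
    """Return audio-feature dicts with a 'title' added, skipping missing results.

    tracks is a list of (track_id, title) pairs; fetch_features takes a list
    of track IDs and returns a list of dicts or None, in the same order.
    """
    rows = []
    for batch in chunked(tracks, batch_size):
        ids = [track_id for track_id, _ in batch]
        for (track_id, title), features in zip(batch, fetch_features(ids)):
            if features is None:
                # No audio features came back for this ID; skip it
                continue
            rows.append({**features, 'title': title})
    return rows


# Stand-in for spotify.audio_features: returns None for the ID 'missing'
def fake_audio_features(ids):
    return [None if i == 'missing' else {'id': i, 'danceability': 0.5} for i in ids]


tracks = [('a', 'Song A'), ('missing', 'Song B'), ('c', 'Song C')]
rows = collect_features(tracks, fake_audio_features, batch_size=2)
print([row['title'] for row in rows])  # ['Song A', 'Song C']
```

The resulting list of dictionaries can be turned into a DataFrame in the same way as `song_info_track`.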

Verify

Delete unnecessary information for verification.

# Remove unnecessary columns for readability.
# Drop the ID-related columns. There are no instrumental tracks, so drop instrumentalness as well.
df = df.drop(['type', 'id', 'uri', 'track_href', 'analysis_url', 'time_signature', 'mode', 'instrumentalness'], axis=1)

IDs and URIs are not needed, and since I'm not familiar with time signatures and modes, I deleted time_signature and mode as well. There are no instrumental songs, so instrumentalness is deleted too.

Of the results, this time I will look at the indicator called danceability. I will not examine every indicator here on Qiita; I'll do that elsewhere, as a fan. The quoted descriptions are from the referenced site.

**・danceability**

Ease of dancing. The closer it is to 1, the more danceable it is. It seems to be decided by the tempo, rhythm, beat strength, etc.

Get the top songs (values closest to 1) for each indicator. As an example, sort by danceability.

# Sort by danceability, highest first
df.sort_values('danceability', inplace=True, ascending=False)
# Top 5 songs
df.head()

The top 5 songs by danceability are as follows.

| Title | danceability |
| --- | --- |
| Turning Up | 0.769 |
| Face Down | 0.733 |
| Resurrection Love | 0.719 |
| While confused | 0.713 |
| A Day in Our Life | 0.704 |

I don't know about the top three songs, but I doubt that "While confused" really belongs on this list ...
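
The same ranking can be repeated for every indicator rather than only danceability. The following is a minimal sketch in plain Python so that it runs without the DataFrame; with pandas, `df.nlargest(5, 'danceability')` would give a comparable result. The helper name `top_n` and all the values are made up for illustration and are not real Spotify figures.

```python
def top_n(rows, key, n=5):
    """Return the n rows with the largest value for `key`, highest first."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:n]


# Made-up values for illustration only; these are not real Spotify figures.
rows = [
    {'title': 'Song A', 'danceability': 0.71, 'energy': 0.90, 'valence': 0.62},
    {'title': 'Song B', 'danceability': 0.55, 'energy': 0.95, 'valence': 0.40},
    {'title': 'Song C', 'danceability': 0.64, 'energy': 0.58, 'valence': 0.81},
]

# Print the top two titles for each indicator
for indicator in ('danceability', 'energy', 'valence'):
    titles = [row['title'] for row in top_n(rows, indicator, n=2)]
    print(indicator, titles)
```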

Next, let's look at the relationship with other indicators on a scatter plot. Two indicators are plotted against danceability.

**・energy**

The intensity of the song. I'm not sure whether "intense" refers to how the song develops or to its momentum. The explanation says that death metal scores high and a Bach prelude scores low.

**・valence**

Brightness. The closer it is to 1, the more positive the song is.

First, danceability × energy

df.plot.scatter('danceability','energy')

[Figure: scatter plot of danceability vs. energy (danceANDenergy.png)]

You can see that most songs cluster at high energy, mainly around 0.9. Does that mean most of the songs are intense? There certainly aren't that many ballads. The only song with an energy of 0.6 or less is the mid-tempo ballad "Galaxy in the Eyes" (a song provided by Fumiya Fujii).

Next, danceability × valence

df.plot.scatter('danceability','valence')

[Figure: scatter plot of danceability vs. valence (danceANDvalance.png)]

A fairly clean positive correlation, isn't it? Perhaps dance songs tend to be bright. The song at the bottom left is "Tomorrow's Memory", which is also a mid-tempo ballad.
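
The impression of a positive correlation could be checked with a number instead of by eye. The following is a minimal sketch of the Pearson correlation coefficient in plain Python; with the DataFrame, `df['danceability'].corr(df['valence'])` should give the equivalent figure. The function name `pearson` and the sample values are assumptions for illustration only, not the actual Arashi data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only; these are not real Spotify figures.
danceability = [0.45, 0.52, 0.60, 0.68, 0.75]
valence = [0.30, 0.48, 0.55, 0.70, 0.85]
r = pearson(danceability, valence)
print(round(r, 2))
```

A value close to 1 indicates a strong positive correlation, a value close to -1 a strong negative one, and a value near 0 little linear relationship.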

Finally

Some people say that Spotify's indicators don't match how the songs actually feel, but I think they are roughly right. I want to compare the characteristics of songs across Johnny's groups, so please lift the ban on distributing the other groups' songs as soon as possible!! Especially J-Storm!!

And finally! The link for Arashi's new song Turning Up is here!! However many times it gets played, not a single yen comes my way, but please give it a listen!!
