[PYTHON] I had Spotify analyze the difference between Beatles songs and my own songs and plotted it

Preface

The Beatles are still popular after 50 years.

(The high quality goes without saying.)

I think the length of their career is also easy to grasp.

About 200 songs in about 10 years, a huge number that is hard to keep up with.

That makes it perfect for analysis (switching to brute force).

About me

By the way, I make music too, and it is also on Spotify.

So, with the vague hope that comparing my own songs with the Beatles' songs might teach me something,

I fired up Python.

Spotipy

That said, I did not do the analysis in Python itself.

I just fetched, through the API, the data that Spotify has already computed (for recommendations?).

There is a Python library for doing the fetching, called Spotipy.

https://spotipy.readthedocs.io/en/2.16.1/

Authentication information

Create an app in the Spotify developer portal.

Make a note of the client ID and client secret when you do.

Spotipy would like you to use environment variables, but that did not work for me, so I pass the ID and secret **directly**.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
spotify = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(
    client_id='hogehoge', client_secret='fugafuga'))
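
For reference, Spotipy's client-credentials manager looks for the environment variables SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET when no arguments are passed. The sketch below is one way to check that both are really set before giving up on them. The helper name load_credentials is made up for this example and is not part of Spotipy, and this does not explain why the variables failed above.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) from the variables Spotipy reads.

    load_credentials is a hypothetical helper, not part of Spotipy.
    """
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail loudly so a typo in a variable name is easy to spot.
        raise KeyError("environment variables not set: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Usage sketch (needs spotipy and real credentials, so it is left commented out):
# client_id, client_secret = load_credentials()
# spotify = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(
#     client_id=client_id, client_secret=client_secret))
```

A common cause of "it didn't work" is that the variables were set in a different shell or process than the one running Python, which a check like this surfaces right away.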

Now you can hit Spotify's API... or rather, have Spotipy fetch from it for you.

Fetching

Conveniently, I found a playlist of 213 Beatles songs, so I used that.


beatles_uri = 'spotify:playlist:57z71mWZeq5xy0zEBvO5Cx'

# Each request returns at most 100 items, so fetch the 213 songs in three pages.
songs = []
for offset in (0, 100, 200):
    results = spotify.playlist_items(beatles_uri, offset=offset)
    songs.extend(results["items"])

There are 213 songs, but it seems that at most 100 can be fetched per request, so I split the fetch into three.
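
If the playlist size is not known in advance, the same idea can be written as a loop that keeps asking for the next page until a short page comes back. This is a minimal sketch, not the code used for this post; fetch_all_items and its fetch_page argument are names invented here.

```python
def fetch_all_items(fetch_page, page_size=100):
    """Collect every item from an offset-paged endpoint.

    fetch_page(offset, limit) must return a dict with an "items" list,
    like the paging object that playlist_items returns.
    """
    items = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)["items"]
        items.extend(page)
        if len(page) < page_size:  # a short (or empty) page means the end was reached
            return items
        offset += page_size


# Usage sketch with Spotipy (commented out because it needs credentials):
# songs = fetch_all_items(
#     lambda offset, limit: spotify.playlist_items(beatles_uri, offset=offset, limit=limit))
```

For 213 songs this makes the same three requests, at offsets 0, 100, and 200.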

for i, song in enumerate(songs):  # i becomes the row number written to the CSV
    ana = spotify.audio_features(song["track"]["uri"])  # a list with one dict per track

audio_features is a separate API from playlist_items.

This is where the analysis data lives.

It contains all sorts of things, but this time I only look at the "key" of each song.

Since there are 213 songs, I added time.sleep(1) to each iteration to avoid putting load on the API.
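
Spotipy's audio_features also accepts a list of tracks (the documented maximum is 100 per call), so the 213 lookups could be grouped into three requests instead of one request per song. A minimal sketch of the batching follows; chunked is a helper name made up here, and the Spotipy part is left as a comment because it needs credentials.

```python
def chunked(values, size=100):
    """Yield consecutive slices of at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


# Usage sketch (commented out because it needs credentials):
# uris = [song["track"]["uri"] for song in songs]
# features = []
# for batch in chunked(uris):
#     features.extend(spotify.audio_features(batch))
```

With batches, features[n] lines up with songs[n], so the per-song loop can read from the list instead of calling the API.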

    csvRow = f'"{i}", "{song["track"]["name"].replace(",", " ")}", "{keymode[ana[0]["key"]][ana[0]["mode"]]}"'
    with open('beatles.csv', 'a') as f:
        print(csvRow, file=f)

The rest is just writing to a file.

By the way, keymode exists because Spotify's analysis stores the key (A, B, and so on) and the mode (major or minor) in separate fields.

It is a hand-made lookup table that maps the pair to a single label.

keymode = [
    ["Cm", "C"],
    ["C#m", "Db"],
    ["Dm", "D"],
    ["D#m", "Eb"],
    ["Em", "E"],
    ["Fm", "F"],
    ["F#m", "Gb"],
    ["Gm", "G"],
    ["G#m", "Ab"],
    ["Am", "A"],
    ["Bbm", "Bb"],
    ["Bm", "B"]
]
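
For reference, Spotify documents key as a pitch class from 0 (C) to 11 (B), with -1 when no key was detected, and mode as 1 for major and 0 for minor, which is why each row above is ordered [minor, major]. A sketch of the lookup wrapped in a function that guards against -1; key_label is a name made up for this example.

```python
KEYMODE = [
    ["Cm", "C"], ["C#m", "Db"], ["Dm", "D"], ["D#m", "Eb"],
    ["Em", "E"], ["Fm", "F"], ["F#m", "Gb"], ["Gm", "G"],
    ["G#m", "Ab"], ["Am", "A"], ["Bbm", "Bb"], ["Bm", "B"],
]


def key_label(key, mode):
    """Map Spotify's key (0-11, -1 if undetected) and mode (0 minor, 1 major) to a label."""
    if not 0 <= key <= 11:
        # Without this check, key == -1 would silently index the last row ("B").
        return None
    return KEYMODE[key][mode]
```

For example, key_label(9, 0) gives "Am" and key_label(-1, 1) gives None.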

Digression 1: I was unsure whether it should be **Bbm** or **A#m**, but maybe Bbm is the more common one?

Digression 2: I wanted to use ♭ instead of b, but I gave up because the characters came out garbled.
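
One possible cause of the garbling (an assumption, since it is not recorded where the characters broke) is that open() without an encoding argument uses the platform's default encoding, which is not always UTF-8. A minimal sketch that writes and reads a ♭ with the encoding stated explicitly; the file name flat.csv is arbitrary.

```python
import os
import tempfile

label = "B\u266d"  # "B♭", written with an escape so this source stays ASCII

path = os.path.join(tempfile.mkdtemp(), "flat.csv")
with open(path, "w", encoding="utf-8") as f:  # explicit encoding on write
    print(label, file=f)
with open(path, encoding="utf-8") as f:       # and the same encoding on read
    round_tripped = f.read().strip()
```

If the characters break only in the plot, the font used for drawing would be the next thing to check rather than the file encoding.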

csvmod.py

Now beatles.csv is done.

I also made mikiri.csv in much the same way.

However, when it came time to plot, it turned out to be easier to handle the two if they were in the same CSV!

So I merge them.


import pandas as pd

beatles_csv = pd.read_csv("beatles.csv")
mikiri_csv = pd.read_csv("mikiri.csv")

beatles_csv["artist"] = '"beatles"'
mikiri_csv["artist"] = '"mikirihassha p"'


beatles_csv.to_csv("analyze.csv", index=False,
                   columns=["name", "key", "artist"])
mikiri_csv.to_csv("analyze.csv", mode="a", header=False,
                  index=False, columns=["name", "key", "artist"])

Create a data frame from each CSV with pandas' read_csv.

Add a new "artist" column to each data frame.

Write them out with to_csv().

Note that the second call uses header=False and mode="a", so it appends to the same file without repeating the header row.

The result looks like this!

I would rather not show the CSV I made, but here it is.


name,key,artist
" ""Love Me Do - Mono / Remastered"""," ""C""","""beatles"""
" ""P.S. I Love You - Remastered"""," ""D""","""beatles"""
" ""Please Please Me - Remastered"""," ""E""","""beatles"""
" ""Ask Me Why - Remastered"""," ""E""","""beatles"""
" ""I Saw Her Standing There - Remastered"""," ""E""","""beatles"""
" ""Misery - Remastered"""," ""C""","""beatles"""
" ""Anna (Go To Him) - Remastered"""," ""D""","""beatles"""
" ""Chains - Remastered"""," ""Bb""","""beatles"""
" ""Boys - Remastered"""," ""E""","""beatles"""
" ""Baby It's You - Remastered"""," ""Em""","""beatles"""
" ""Do You Want To Know A Secret - Remastered"""," ""E""","""beatles"""
" ""A Taste Of Honey - Remastered"""," ""C#m""","""beatles"""
" ""There's A Place - Remastered"""," ""E""","""beatles"""
" ""Twist And Shout - Remastered"""," ""D""","""beatles"""

Somehow it ended up with far too many quotation marks and got messy.

But I will carry on with it as it is.
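
The extra quotes most likely come from the space after each comma in the rows written earlier: a quote that does not sit at the very start of a field is read as part of the value, and it is then doubled when the value is written back out. The sketch below reproduces this with the standard csv module instead of pandas (an assumption that both parsers treat this case the same way; the output above suggests they do).

```python
import csv
import io

line = '"0", "Love Me Do - Mono / Remastered", "C"\n'  # same shape as the rows written earlier

# Default parsing: the space after each comma makes the quotes part of the value.
raw = next(csv.reader(io.StringIO(line)))

# skipinitialspace=True drops that space, so the quotes work as quoting again.
clean = next(csv.reader(io.StringIO(line), skipinitialspace=True))

# Writing the raw values back out doubles every embedded quote.
out = io.StringIO()
csv.writer(out).writerow(raw[1:])
rewritten = out.getvalue().strip()

print(raw)
print(clean)
print(rewritten)
```

pandas.read_csv has a skipinitialspace parameter of the same name, so passing skipinitialspace=True there, or writing the rows without the space after the comma, should keep the quotes from piling up.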

plot.py

Next is the plot.


import matplotlib.pyplot as plt
import pandas as pd

csv = pd.read_csv("analyze.csv")

csv = csv.sort_values("key")

csv["key"].hist(bins=50, by=csv["artist"], sharey=True)
plt.show()

That is all!

Getting the arguments to csv["key"].hist() right took a bit of trial and error, but the rest was straightforward.

If you pass by=<column>, pandas draws one subplot per distinct value of that column so you can compare them.

In other words, it looks like this.

Figure_sorted.png

The data seems to have come out nicely.

Analysis

Singer-songwriter guitarist

Looking at the Beatles graph, the first thing you notice

is that these guys write their songs on guitar.

The prominent keys A, C, D, E, and G all have chord shapes that are easy to finger on a guitar.

In other words, I think they composed with open chords rather than barre chords (the ones where one finger presses down across the strings).

Though they may well have switched to barre chords when actually playing...

Cheerful by nature

The second thing to notice is that there are many major keys and few minor keys.

That fits the image people have of the Beatles.

Also, I personally had the impression that the Beatles wrote a lot in D, and it seems they actually did.
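
The two observations above can also be checked with numbers instead of by eye. The sketch below counts only the 14 Beatles rows visible in the CSV excerpt earlier in this post, so it illustrates the method and says nothing about the full 213 songs.

```python
from collections import Counter

# Keys of the 14 Beatles rows shown in the CSV excerpt (not the full data set).
sample_keys = ["C", "D", "E", "E", "E", "C", "D", "Bb", "E", "Em", "E", "C#m", "E", "D"]

counts = Counter(sample_keys)
open_chord_keys = {"A", "C", "D", "E", "G"}  # the keys singled out as guitar-friendly

in_open_keys = sum(n for key, n in counts.items() if key in open_chord_keys)
minor = sum(n for key, n in counts.items() if key.endswith("m"))
major = len(sample_keys) - minor

print(counts.most_common(3))
print(f"open-chord keys: {in_open_keys}/{len(sample_keys)}, major: {major}, minor: {minor}")
```

On these 14 rows, 11 fall in the guitar-friendly keys and 12 are major. Running the same count on the "key" column of analyze.csv, after stripping the stray quotes, would give the figures for each artist.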

Comparison

mikirihassha p (me) peaks at G, but the distribution is fairly flat and it is hard to see a trend.

In a nutshell, it is a well-balanced graph.

It looks like what you get when you deliberately spread your keys around...

It is a little surprising that there are more songs in Ab than in A.

Also D, which is so common in the Beatles' songs, is rare in my own work.

Conclusion

**Not very helpful**

I think everyone has raised or lowered the key by a semitone or two at karaoke,

and a song does not become a different song just because it is a semitone higher.

Similarly, simply setting the key to D is unlikely to turn you into the Beatles overnight.

However, I did feel that I have too few songs in D, so I think I will try to write a few more in the future.
