[PYTHON] [SLAYER] I visualized the lyrics of thrash metal and checked the soul of steel [Word Cloud]

Introduction

I love thrash metal, and **SLAYER** is my favorite band.

After many years of activity, and after overcoming the death of a member, they finally set out on their final world tour. On November 30, 2019, the final show in LA brought their career to a fitting close.

https://www.youtube.com/watch?v=OwsdbuxRc_s

To commemorate this, I want to check what they were trying to convey.

Environment

Visualization result

wordcloud.png

After all, it comes down to blood and death ... I have nothing more to say.

Thanks for the great music and message!!

Advance preparation

First, check the HTML of the target page. The lyrics sit inside the div with the "lyrics" class, but that div also contains extra child tags (song titles, credits, notes, links) that need to be removed.

sample tags


<div class="lyrics">
<h3><a name="1">1. Evil Has No Boundaries</a></h3><br />
<i>[Lyrics - Hanneman, King; Music - King]</i><br />
<br />
Blasting our way through the boundaries of Hell<br />

・ ・ ・

We conquer then move on ahead<br />
<br />
<i>[Chorus:]</i><br />
Evil<br />

・ ・ ・

Your soul now his to keep<br />
<br />

<div class="thanks">Tom Araya     - Bass/Vocals<br />
Kerry King    - Lead/Rhythm Guitar<br />
Jeff Hanneman - Lead/Rhythm Guitar<br />
Dave Lombardo - Drums<br />
<br />
Thanks to nwdrk13 for correcting track #4 lyrics.<br />
Thanks to rath00 for correcting track #6 lyrics.</div>
<br />
<div class="note">Submits, comments, corrections are welcomed at [email protected]</div><br />
<a href="http://www.darklyrics.com/s/slayer.html">SLAYER LYRICS</a>
</div>
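
Before hitting the real site, the cleanup rule can be tried offline on a trimmed copy of the sample above. The following is only a minimal sketch under a few assumptions: it uses the standard library's html.parser instead of BeautifulSoup so it runs without extra packages, it treats br as the only void tag because that is all the sample contains, and the names LyricsTextExtractor and extract_lyrics_text are hypothetical helpers that do not appear in the script in this article.

```python
from html.parser import HTMLParser

# Hypothetical helper names; the standard-library parser stands in for
# BeautifulSoup here so the check runs without extra packages.
SKIP_TAGS = {'h3', 'i', 'a'}            # song titles, credits, links
SKIP_DIV_CLASSES = {'thanks', 'note'}   # acknowledgements and site notes
VOID_TAGS = {'br'}                      # assumption: only br appears in the sample


class LyricsTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside an element we want to drop
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get('class') or '').split()
        unwanted = tag in SKIP_TAGS or (
            tag == 'div' and SKIP_DIV_CLASSES.intersection(classes))
        # Once inside an unwanted element, count nested tags too so the
        # matching end tags bring the depth back to zero.
        if unwanted or self.skip_depth:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def extract_lyrics_text(html):
    parser = LyricsTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse line breaks and repeated spaces into single spaces
    return ' '.join(''.join(parser.parts).split())


# Trimmed copy of the sample tags shown above
sample = '''<div class="lyrics">
<h3><a name="1">1. Evil Has No Boundaries</a></h3><br />
<i>[Lyrics - Hanneman, King; Music - King]</i><br />
<br />
Blasting our way through the boundaries of Hell<br />
<i>[Chorus:]</i><br />
Evil<br />
<div class="thanks">Tom Araya     - Bass/Vocals<br />
Thanks to nwdrk13 for correcting track #4 lyrics.</div>
<br />
<a href="http://www.darklyrics.com/s/slayer.html">SLAYER LYRICS</a>
</div>'''

print(extract_lyrics_text(sample))
```

If the rule is right, only the two lyric lines should survive, with the title, credits, thanks block, and link gone.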

Source code

It turned out to be something like this.

import requests
from bs4 import BeautifulSoup
import time
from wordcloud import WordCloud

# URL list of the lyrics page for each album
urls = ['http://www.darklyrics.com/lyrics/slayer/shownomercy.html',
        'http://www.darklyrics.com/lyrics/slayer/hauntingthechapel.html',
        'http://www.darklyrics.com/lyrics/slayer/hellawaits.html',
        'http://www.darklyrics.com/lyrics/slayer/reigninblood.html',
        'http://www.darklyrics.com/lyrics/slayer/southofheaven.html',
        'http://www.darklyrics.com/lyrics/slayer/seasonsintheabyss.html',
        'http://www.darklyrics.com/lyrics/slayer/divineintervention.html',
        'http://www.darklyrics.com/lyrics/slayer/undisputedattitude.html',
        'http://www.darklyrics.com/lyrics/slayer/diabolusinmusica.html',
        'http://www.darklyrics.com/lyrics/slayer/godhatesusall.html',
        'http://www.darklyrics.com/lyrics/slayer/christillusion.html',
        'http://www.darklyrics.com/lyrics/slayer/worldpaintedblood.html',
        'http://www.darklyrics.com/lyrics/slayer/repentless.html']

texts = ''

for url in urls:
    # Fetch the album page and pick out the lyrics block
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    song_lyrics = soup.find('div', class_='lyrics')
    
    # Remove child tags that are not lyrics (song titles, credits, notes, links)
    for tag in song_lyrics.find_all('h3'):
        tag.extract()
    for tag in song_lyrics.find_all('i'):
        tag.extract()
    for tag in song_lyrics.find_all('div', class_='thanks'):
        tag.extract()
    for tag in song_lyrics.find_all('div', class_='note'):
        tag.extract()
    for tag in song_lyrics.find_all('a'):
        tag.extract()
        
    song_lyric = song_lyrics.text
    song_lyric = song_lyric.replace('\n',' ')
    
    # Wait 1 second between requests to limit server load
    time.sleep(1)

    # Append this album's lyrics to the combined text
    texts = texts + song_lyric + ' '
    
# Remove common words that carry little meaning
stop_words = ['the', 'of', 'to', 'is', 'in', 'for', 'with', 'that', 'my', 'all', 'will', 'from', 'can', 'your',
              'on', 'me', 'it', 'and', 'this', 'be', 'are', 'am', 'their', 'do', 'there', 'you']

wordcloud = WordCloud(background_color='black', colormap='autumn',
    width=800, height=600, stopwords=set(stop_words)).generate(texts)

# Save the image as wordcloud.png in the current working directory
wordcloud.to_file('./wordcloud.png')
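
To check the picture against actual numbers, the most frequent words can also be counted directly. This is only a rough sketch: top_words is a hypothetical helper, the regular expression is a simplified tokenizer that will not match WordCloud's own word splitting exactly, and the sample string is made-up placeholder text rather than real lyrics.

```python
import re
from collections import Counter


def top_words(text, stop_words, n=10):
    """Return the n most frequent words in text, ignoring stop words."""
    # Simplifying assumption: a word is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    stop = {w.lower() for w in stop_words}
    counts = Counter(w for w in words if w not in stop)
    return counts.most_common(n)


# Made-up placeholder text, not real lyrics
placeholder_text = 'blood death blood the of blood death evil'
placeholder_stop_words = ['the', 'of']

print(top_words(placeholder_text, placeholder_stop_words, n=2))
```

In the script above, the same function could be called as top_words(texts, stop_words) once the loop has finished.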

In conclusion

From "Show No Mercy" in 1983, through the legendary album "Reign in Blood" in 1986, to their last original album, "Repentless", in 2015.

No other band has stayed so true to its music throughout its career. They were truly one of a kind.

I will carry their sound and their soul of steel in my heart for the rest of my life.

Ah ... I forgot, this site isn't a blog.
