[PYTHON] I want to express my feelings with the lyrics of Mr. Children

Introduction

From the title you may be wondering, "What is this person talking about?" (laughs). I built this over a four-day holiday weekend as a way to study natural language processing. In the near future, I plan to publish it somewhere on the web.

Concept and completed image

As soon as I came up with this idea, I wrote down the current situation (As is) → issues → what it should be (To be). Just what you would expect from a businessman (laughs). [image.png]

When I thought about how to build it, I came up with the following mechanism. ↓ [image.png]

Internally, I build a dataset of Mr. Children lyrics and convert it into vectors with Word2Vec. My own feelings are vectorized with Word2Vec in the same way, and similar lyrics are retrieved by cosine similarity.
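The retrieval step can be sketched roughly as follows. This is only a minimal illustration, not the code used in this post: `rank_by_cosine`, `lyric_vecs`, and `feeling_vec` are names made up for the example, and the toy 3-dimensional vectors stand in for real Word2Vec output.

```python
import numpy as np


def rank_by_cosine(query_vec, doc_vecs):
    """Return (indices sorted by descending similarity, similarity scores)."""
    query_vec = np.asarray(query_vec, dtype="float64")
    doc_vecs = np.asarray(doc_vecs, dtype="float64")

    query_norm = np.linalg.norm(query_vec)
    doc_norms = np.linalg.norm(doc_vecs, axis=1)

    # cos(a, b) = a.b / (|a| |b|); an all-zero vector gets similarity 0
    # instead of triggering a division by zero.
    denom = query_norm * doc_norms
    dots = doc_vecs @ query_vec
    scores = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)

    order = np.argsort(-scores)  # highest similarity first
    return order, scores


# Hypothetical toy vectors standing in for the averaged lyric vectors.
lyric_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 0.0],  # e.g. a lyric whose words were all unknown to the model
])
feeling_vec = np.array([0.5, 0.5, 0.0])

order, scores = rank_by_cosine(feeling_vec, lyric_vecs)
best_index = int(order[0])  # index of the lyric closest to the feeling
```

Because cosine similarity only looks at the angle between vectors, the length of each vector does not affect the ranking.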

The PoC is done!

I quickly put together a rough version of the overall picture above. The result ... for my feeling **"I can't sleep"**, the first lyric that came back was **"Become a member of society and carry the burden on my back-to the person who shines light"**. Whoa ... it reads that deeply into it ... lol [image.png]

What I used

Morphological analysis: janome.tokenizer
Word2Vec: word2vec in gensim.models

import numpy as np  # needed by avg_document_vector
from janome.tokenizer import Tokenizer
from gensim.models import word2vec

The lyrics are split into words by morphological analysis, and each word is converted into a vector with Word2Vec. Finally, averaging those word vectors gives a single vector for each set of lyrics.

[image.png] ↓ Results of morphological analysis: [image.png]

The Word2Vec part of the code

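# Train a skip-gram Word2Vec model on the Mr. Children lyrics
# (sentences: the tokenized lyrics, a list of word lists).
skipgram_model = word2vec.Word2Vec(sentences,
                                   sg=1,         # 1 = skip-gram
                                   size=250,     # vector dimensions (renamed to vector_size in gensim 4.0+)
                                   min_count=2,  # ignore words that appear fewer than 2 times
                                   window=10, seed=1234)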



# Look up the Word2Vec vector of each morphologically analyzed word and average
# them into one vector per lyric.
# Open question: can this average reflect the context of the lyrics?
def avg_document_vector(data, num_features):
    document_vec = np.zeros((len(data), num_features))
    for i, doc_word_list in enumerate(data):
        if not doc_word_list:
            continue  # an empty lyric keeps its zero vector (avoids 0 / 0)
        feature_vec = np.zeros((num_features,), dtype="float32")
        for word in doc_word_list:
            try:
                feature_vec = np.add(feature_vec, skipgram_model.wv[word])
            except KeyError:
                # The word is not in the model's vocabulary
                # (for example, it was dropped by min_count).
                pass

        # Divide by the total word count, including the words skipped above.
        feature_vec = np.divide(feature_vec, len(doc_word_list))
        document_vec[i] = feature_vec
    return document_vec
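
As a side note, here is one possible variant of the averaging step that divides only by the number of words the model actually knows. It is a sketch under assumptions, not the code used above: `average_known_words` and `toy_vectors` are hypothetical names, and a plain dict stands in for the trained model. The function only needs `word in vectors` and `vectors[word]`, which gensim's `model.wv` also supports.

```python
import numpy as np


def average_known_words(tokens, vectors, num_features):
    """Average the vectors of the tokens that are present in `vectors`.

    `vectors` is anything supporting `word in vectors` and `vectors[word]`,
    for example a plain dict or gensim's `model.wv`.
    """
    known = [vectors[word] for word in tokens if word in vectors]
    if not known:
        # No known word at all: fall back to a zero vector.
        return np.zeros(num_features, dtype="float32")
    return np.mean(known, axis=0).astype("float32")


# Hypothetical 2-dimensional word vectors, for illustration only.
toy_vectors = {
    "sleep": np.array([1.0, 0.0]),
    "night": np.array([0.0, 1.0]),
}

doc_vec = average_known_words(["sleep", "night", "unknown"], toy_vectors, 2)
empty_vec = average_known_words(["unknown"], toy_vectors, 2)
```

For ranking by cosine similarity, the choice of divisor does not change the order of the results, because cosine similarity ignores vector length. It only matters if the raw averaged vectors are used for something else.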

In conclusion

I found it interesting to convert words into vectors and measure how closely they match. I want to study BERT as well. To turn this experiment into a service, I urgently need to increase the number of songs. (As of July 29, 2020: 5 songs .. lol) I will keep adding songs steadily.

Even so, I'm glad that I can now pull off this kind of experiment over a four-day holiday. It feels like my skills are growing!!
