[PYTHON] I tried to analyze the negativeness of Nono Morikubo. [Compare with Posipa]

I learned about the COTOHA API from this article, and it looked interesting, so I decided to try it out. This time, I used the COTOHA API to analyze how negative Nono Morikubo is.

environment

What is COTOHA API?

It is an API that makes it easy to process natural language and speech. For example, you can perform parsing and speech recognition. It is provided by NTT Communications.

What is Nono Morikubo?

One of the idols who appears in The Idolmaster Cinderella Girls, a 14-year-old girl. Her home base is under the desk; she sometimes runs away from the producer and sometimes works hard as an idol. She usually makes a lot of negative remarks such as "I'm ...", so I want to verify just how negative she really is.

Sentiment analysis

Click here for how to use it. Please note that the free tier has an upper limit of 1000 calls per day.
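As a rough idea of what a call might look like, here is a minimal sketch using only the standard library. The endpoint URLs, the JSON field names (`grantType`, `clientId`, `clientSecret`, `sentence`), and the helper names are assumptions on my part and are not taken from this article, so check them against the official COTOHA API reference before use.

```python
import json
import urllib.request

# Assumed endpoint URLs (not given in this article); verify in the official docs.
TOKEN_URL = "https://api.ce-cotoha.com/v1/oauth/accesstokens"
SENTIMENT_URL = "https://api.ce-cotoha.com/api/dev/nlp/v1/sentiment"


def build_token_request(client_id, client_secret):
    """Build (but do not send) the request that asks for an access token."""
    body = json.dumps({
        "grantType": "client_credentials",  # assumed field names
        "clientId": client_id,
        "clientSecret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_sentiment_request(access_token, sentence):
    """Build (but do not send) a sentiment analysis request for one line."""
    body = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        SENTIMENT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": f"Bearer {access_token}",
        },
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from `send` makes it easy to count calls and stay under the daily limit, since every network round trip goes through one function.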

Example

For example, take this line from the initial N card: "Ah ... Mmm ... I'm Morikubo ... Ah, yes, Nono Morikubo, but I'm sorry for the producer suddenly, but I'm thinking about quitting as an idol ... Ah, That…" Running sentiment analysis on this line with the COTOHA API gives the following result.

{'result': {'sentiment': 'Negative', 'score': 0.48786837208987766, 'emotional_phrase': [{'form': "I'm sorry", 'emotion': 'N'}]}, 'status': 0, 'message': 'OK'}

Since the score falls in the range of 0 to 1, this line can be said to be **moderately negative**. There are three types of emotions: "Positive," "Neutral," and "Negative."
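To work with a response like the one above, it helps to pull out just the fields that matter. The following is a small sketch; the helper name `summarize_result` is hypothetical, and the key names are simply the ones visible in the sample response.

```python
def summarize_result(response):
    """Extract (sentiment, score, phrases) from a sentiment response dict.

    `phrases` is a list of (form, emotion) pairs; it is empty when the
    response contains no emotional phrases.
    """
    result = response["result"]
    phrases = [
        (phrase["form"], phrase["emotion"])
        for phrase in result.get("emotional_phrase", [])
    ]
    return result["sentiment"], result["score"], phrases


if __name__ == "__main__":
    sample = {
        "result": {
            "sentiment": "Negative",
            "score": 0.48786837208987766,
            "emotional_phrase": [{"form": "I'm sorry", "emotion": "N"}],
        },
        "status": 0,
        "message": "OK",
    }
    print(summarize_result(sample))
```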

Practice

- The lines analyzed come from the cards implemented in the Mobage version of The Idolmaster Cinderella Girls. Each card has 14 types of lines.
- Lines that are identical before and after special training are omitted as duplicates.
- The lines are scraped from a website. (Doing it by hand might have been faster...)
- For each emotion, calculate the count, the percentage, and the average score.
- In addition, define ratio × average score as the **degree of emotion**, on the idea that **the more positive lines there are and the higher their scores, the more positive the idol is**. For example, if 100% of the lines are positive and their average score is 1.0, the degrees are (positive, neutral, negative) = (1, 0, 0).
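The calculation described in the list above can be sketched as follows. This is not the author's code, only an illustration; the function name `emotion_degrees` and the input shape (a list of `(sentiment, score)` pairs) are assumptions. Lines with any other label, such as "Positive / Negative", are skipped when counting, but the percentage is taken over all lines, which is what the percentages in the result tables appear to use.

```python
from collections import defaultdict

LABELS = ("Positive", "Neutral", "Negative")


def emotion_degrees(results):
    """Compute count, ratio, average score and degree for each emotion.

    `results` is a list of (sentiment, score) pairs, one per line.
    degree = ratio * average score.
    """
    scores = defaultdict(list)
    for sentiment, score in results:
        if sentiment in LABELS:  # skip mixed labels such as "Positive / Negative"
            scores[sentiment].append(score)

    total = len(results)  # assumption: ratio is taken over all analyzed lines
    summary = {}
    for label in LABELS:
        count = len(scores[label])
        ratio = count / total if total else 0.0
        average = sum(scores[label]) / count if count else 0.0
        summary[label] = {
            "count": count,
            "ratio": ratio,
            "average": average,
            "degree": ratio * average,
        }
    return summary


if __name__ == "__main__":
    # All lines positive with score 1.0 gives degrees (1, 0, 0).
    all_positive = [("Positive", 1.0)] * 4
    print({k: v["degree"] for k, v in emotion_degrees(all_positive).items()})
```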

code

The code is available on GitHub.

result

Total number of lines: 410

One of these lines was classified as "Positive / Negative", so it is not included in the calculation.

| Emotion | Count | Percentage | Average score | Degree |
|---|---|---|---|---|
| Positive | 135 | 33% | 0.428 | 0.141 |
| Neutral | 215 | 52% | 0.342 | 0.179 |
| Negative | 59 | 14% | 0.551 | 0.079 |

The average score of Negative is higher than the others, but its degree is low because negative lines are less frequent. This was a slightly surprising result, but it suggests that she has grown since the early days. In fact, when sentiment analysis was performed using only the initial cards, the results were as follows.

Total number of lines: 25

| Emotion | Count | Percentage | Average score | Degree |
|---|---|---|---|---|
| Positive | 6 | 24% | 0.358 | 0.086 |
| Neutral | 12 | 48% | 0.380 | 0.183 |
| Negative | 7 | 28% | 0.580 | 0.163 |

Comparison

The result for one person is hard to evaluate on its own, so I will compare it with other idols. This time, I decided to analyze the sentiment of Positive Passion, a unit consisting of Mio Honda, Akane Hino, and Aiko Takamori. If they are as positive and passionate as the name suggests, the results should look quite different.

Mio Honda

Total number of lines: 476

| Emotion | Count | Percentage | Average score | Degree |
|---|---|---|---|---|
| Positive | 199 | 41.8% | 0.462 | 0.193 |
| Neutral | 265 | 55.7% | 0.421 | 0.234 |
| Negative | 12 | 2.5% | 0.442 | 0.011 |

Akane Hino

Total number of lines: 409

| Emotion | Count | Percentage | Average score | Degree |
|---|---|---|---|---|
| Positive | 154 | 37.7% | 0.425 | 0.160 |
| Neutral | 228 | 55.7% | 0.438 | 0.244 |
| Negative | 27 | 6.6% | 0.394 | 0.026 |

Aiko Takamori

Total number of lines: 457

One of these lines was classified as "Positive / Negative", so it is not included in the calculation.

| Emotion | Count | Percentage | Average score | Degree |
|---|---|---|---|---|
| Positive | 263 | 57.5% | 0.464 | 0.267 |
| Neutral | 172 | 37.6% | 0.399 | 0.150 |
| Negative | 21 | 4.6% | 0.478 | 0.022 |

Compared with Nono's results, all three have lower negative values and higher positive values. Even from the raw counts alone, you can see that they make few negative remarks. Just as you would expect.

The following is a summary of the degree.

| Name | Positive degree | Neutral degree | Negative degree |
|---|---|---|---|
| Nono Morikubo | 0.141 | 0.179 | 0.079 |
| Mio Honda | 0.193 | 0.234 | 0.011 |
| Akane Hino | 0.160 | 0.244 | 0.026 |
| Aiko Takamori | 0.267 | 0.150 | 0.022 |
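To make the comparison concrete, the degrees from the summary table can be put into a small structure and ranked. This is only an illustrative sketch; the names `DEGREES`, `COLUMNS`, and `rank_by` are hypothetical, and the numbers are copied from the table as-is.

```python
# Degrees copied from the summary table: (positive, neutral, negative).
DEGREES = {
    "Nono Morikubo": (0.141, 0.179, 0.079),
    "Mio Honda": (0.193, 0.234, 0.011),
    "Akane Hino": (0.160, 0.244, 0.026),
    "Aiko Takamori": (0.267, 0.150, 0.022),
}
COLUMNS = {"positive": 0, "neutral": 1, "negative": 2}


def rank_by(emotion):
    """Return idol names sorted by the given degree, highest first."""
    index = COLUMNS[emotion]
    return sorted(DEGREES, key=lambda name: DEGREES[name][index], reverse=True)


if __name__ == "__main__":
    print(rank_by("negative"))
    print(rank_by("positive"))
```

Ranked this way, Nono comes first on the negative degree and last on the positive degree, while Aiko has the highest positive degree.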

Laid out like this, the difference is clearly visible. It was surprising that Aiko had the highest positive degree ... although that may be because the set of lines analyzed was limited.

It would be interesting to analyze the data obtained this time using other methods, so I would like to try it again if I have the opportunity.

Summary

Nono Morikubo was negative after all, but she has grown a lot since her early days!
