I tried self-analysis with my New Year's cards using Python

Motivation for development

New Year's cards arrive every year ... While I was tidying my room, a lot of New Year's cards from my seniors and juniors turned up, and I wondered whether I could put them to use.

Then it occurred to me that New Year's cards could show me how I look to other people. In other words, maybe a so-called self-analysis could be done through them.

Come to think of it, when I write a card to someone, I draw on my impression of that person and on episodes from the past year. I figured other people probably do the same.

So morphologically analyzing the cards should extract the impressions other people have of me. I decided to turn the result into a word cloud and visualize how others see me.

How I made it

1 Type in the text of the New Year's cards (data entry)

First, we need to collect the data to be analyzed, so I typed the contents of the New Year's cards into Excel.

messageImage_1585488373040.jpg

As shown above, I left out stock greetings such as "Happy New Year" and "Kotoyoro". As far as possible, I entered only the words that relate to impressions of me and to episodes.

2 Collect data

Next, combine the text entered in Excel into a single string.

python


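import xlrd

# Note: xlrd 2.0+ reads only .xls; an .xlsx file needs xlrd 1.x (or another reader).
wb = xlrd.open_workbook('/nenga2020.xlsx')
sheet = wb.sheet_by_name('Sheet1')
# Read every cell in the first column.
col_values = sheet.col_values(0)
# str() guards against numeric cells, which xlrd returns as floats.
text = "".join(str(v) for v in col_values)
print(text)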

At this point, text holds the combined text of all the New Year's cards.
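Spreadsheet cells can be empty or numeric, so it may help to clean them before joining. The following is only a sketch and not part of the script above: clean_join is a hypothetical helper that takes a list like the one col_values returns, skips blanks, and separates cards with a space so that the last word of one card and the first word of the next are not fused.

```python
def clean_join(cells, sep=" "):
    """Join spreadsheet cell values into one string (hypothetical helper)."""
    parts = []
    for cell in cells:
        # Skip cells that hold no value at all.
        if cell is None:
            continue
        # Convert numbers to text and trim surrounding whitespace.
        s = str(cell).strip()
        # Skip cells that are blank after trimming.
        if s:
            parts.append(s)
    return sep.join(parts)


# Hand-made sample standing in for sheet.col_values(0).
sample = ["面白い先輩でした", "", "  剣道お疲れ様でした ", 2020.0]
print(clean_join(sample))
```

The result can be assigned to text in place of the plain concatenation.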

3 Morphological analysis & word cloud creation

Finally, run the morphological analysis and create the word cloud.


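import MeCab
import wordcloud

m = MeCab.Tagger("")
text = text.replace('\r', '')
parsed = m.parse(text)

# Each output line is "surface<TAB>features"; the first feature is the part of speech.
# Japanese dictionaries such as IPADIC report it in Japanese:
# 名詞 = noun, 形容詞 = adjective, 形容動詞 = adjectival noun.
# [:-1] drops the trailing EOS line.
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if '\t' in x and x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])

# font_path must point to a font that contains Japanese glyphs.
wordc = wordcloud.WordCloud(font_path='HGRGM.TTC',
                            background_color='white',
                            contour_color='steelblue',
                            contour_width=2).generate(splitted)
wordc.to_file('nenga2020.png')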

This extracts the impressions written on the New Year's cards and draws them as a word cloud.

python


# 名詞 = noun, 形容詞 = adjective, 形容動詞 = adjectival noun
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if '\t' in x and x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])

By the way, I narrowed the parts of speech down to nouns, adjectives, and adjectival nouns (keiyō-dōshi), because the goal is to extract impressions of me.
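To make the filtering step easier to follow, here is a small sketch that applies the same rule to a hand-written sample of MeCab output instead of calling MeCab. The sample lines assume the IPADIC output format (surface form, a tab, then comma-separated features with the part of speech first); extract_words and KEEP_POS are illustrative names, not part of the script.

```python
KEEP_POS = {"名詞", "形容詞", "形容動詞"}  # noun, adjective, adjectival noun


def extract_words(parsed, keep_pos=KEEP_POS):
    """Return surface forms whose top-level part of speech is in keep_pos."""
    words = []
    for line in parsed.splitlines():
        # Skip the EOS marker and any line without a tab separator.
        if "\t" not in line:
            continue
        surface, features = line.split("\t", 1)
        # The first comma-separated feature is the part of speech.
        if features.split(",")[0] in keep_pos:
            words.append(surface)
    return words


# Hand-written sample in the assumed IPADIC format (not real MeCab output).
sample = (
    "面白い\t形容詞,自立,*,*,形容詞・アウオ段,基本形,面白い,オモシロイ,オモシロイ\n"
    "先輩\t名詞,一般,*,*,*,*,先輩,センパイ,センパイ\n"
    "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス\n"
    "EOS\n"
)
print(" ".join(extract_words(sample)))
```

As far as I know, IPADIC usually tags adjectival nouns as 名詞 with the subcategory 形容動詞語幹, so in that case they are already picked up by the 名詞 entry.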

Summary

Whole code

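import xlrd
import MeCab
import wordcloud

# Note: xlrd 2.0+ reads only .xls; an .xlsx file needs xlrd 1.x (or another reader).
wb = xlrd.open_workbook('/nenga2020.xlsx')
sheet = wb.sheet_by_name('Sheet1')
col_values = sheet.col_values(0)
# str() guards against numeric cells, which xlrd returns as floats.
text = "".join(str(v) for v in col_values)

m = MeCab.Tagger("")
text = text.replace('\r', '')
parsed = m.parse(text)

# Keep nouns (名詞), adjectives (形容詞), and adjectival nouns (形容動詞).
# [:-1] drops the trailing EOS line.
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if '\t' in x and x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])

# font_path must point to a font that contains Japanese glyphs.
wordc = wordcloud.WordCloud(font_path='HGRGM.TTC',
                            background_color='white',
                            contour_color='steelblue',
                            contour_width=2).generate(splitted)
wordc.to_file('nenga2020.png')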

And here is the completed word cloud for self-analysis.

nenga2020_qujita.png

Many of the New Year's cards came from the Kendo club, so a lot of Kendo-related words appear ...

Words such as funny, competent, and respect can be read as the impressions others have of me.
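If club-related words dominate the cloud, one option is to count the words and drop a hand-picked stop list before drawing. This is only a sketch with made-up sample words; STOP_WORDS and top_words are hypothetical names, and the stop list would have to be chosen by looking at your own cards.

```python
from collections import Counter

# Hypothetical stop list: topic words that hide the impression words.
STOP_WORDS = {"剣道", "稽古"}


def top_words(splitted, stop_words=STOP_WORDS, n=3):
    """Count space-separated words, ignoring stop_words, and return the top n."""
    counts = Counter(w for w in splitted.split() if w not in stop_words)
    return counts.most_common(n)


# Made-up sample in the same space-separated form as the splitted variable.
sample = "剣道 面白い 尊敬 剣道 面白い 稽古 有能 面白い 尊敬"
print(top_words(sample))
```

A dictionary built from counts like these could then be passed to WordCloud.generate_from_frequencies instead of calling generate on the raw string.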
