I'm tired of Python, so I analyzed the data with nehan (corona related: where is that word now?)

Introduction

Hello, this is sunfish. Data analysis with Python has become popular these days, but it is difficult to master. Struggling with Python ends up becoming the goal, and the business improvement you originally wanted to achieve gets left behind. To address that problem, I would like to introduce an example of analyzing data with the GUI tool "nehan".

Looking back at how often specific words appear in Twitter data

More than half a year has passed since the coronavirus became a social problem. Let's track how often related words appear in tweet data from the past two months.

Data

nehan can import Twitter data directly, and I used that function this time (I will introduce it later). Every day since July 27, 2020, 3,000 tweets containing "Corona" in the tweet text have been accumulated, giving about two months of data. Details of the data are available here: https://sunfish.nehan.io/datasources_v2/3424

Preprocessing

1. Select only the Text and Created_At columns that will be used
port_2 = port_1[['Created_At', 'Text']]

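2. Convert Created_At to a date type
import pandas as pd

port_3 = port_2.copy()
# Values that cannot be parsed become NaT (missing) because of errors='coerce'
port_3['Created_At'] = pd.to_datetime(
    port_3['Created_At'], errors='coerce', format=None)
# Keep only the calendar date
port_3['Created_At'] = port_3['Created_At'].map(lambda x: x.date())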
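
To see what `errors='coerce'` does in this step, here is a minimal sketch on made-up values (the sample strings are illustrative and not taken from the actual tweet data). A string that cannot be parsed becomes a missing value (NaT) instead of raising an error. The sketch uses `.dt.date`, which should behave like the `map` call above.

```python
import pandas as pd

# Made-up sample values; the second one cannot be parsed as a date.
sample = pd.DataFrame({
    "Created_At": ["2020-07-27 10:15:00", "not a date"],
    "Text": ["first tweet", "second tweet"],
})

converted = sample.copy()
# errors='coerce' turns unparseable strings into NaT instead of raising.
converted["Created_At"] = pd.to_datetime(converted["Created_At"], errors="coerce")
# Keep only the calendar date; NaT stays missing.
converted["Created_At"] = converted["Created_At"].dt.date

print(converted)
```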

3. Rows whose Created_At could not be converted to a date type hold a missing value, so delete those rows.
port_4 = port_3.copy()
# subset=None checks every column, so a row with any missing value is dropped
port_4 = port_4.dropna(subset=None, how='any')
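
Note that `subset=None` makes `dropna` look at every column, not only `Created_At`, so with the two columns kept in step 1 a row whose `Text` is missing is dropped as well. The sketch below uses made-up rows to show the difference from restricting the check to `Created_At`. The restricted variant is only an illustration and is not part of nehan's output.

```python
import datetime
import pandas as pd

# Made-up rows: one with a missing date, one with a missing text.
df = pd.DataFrame({
    "Created_At": [datetime.date(2020, 7, 27), None, datetime.date(2020, 7, 28)],
    "Text": ["a", "b", None],
})

# subset=None (as in the exported code) checks every column.
all_columns = df.dropna(subset=None, how="any")

# Restricting the check to Created_At keeps the row whose Text is missing.
date_only = df.dropna(subset=["Created_At"], how="any")

print(len(all_columns), len(date_only))
```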

Count word occurrences by day

4. Filter to tweets containing specific words
port_5 = port_4[(port_4['Text'].str.contains('cluster', na=False, regex=False))]
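
The two options passed to `str.contains` matter here: `regex=False` matches the keyword as a literal substring, and `na=False` treats a missing `Text` as "no match". A minimal sketch on made-up tweets (the sample texts are assumptions for illustration):

```python
import pandas as pd

# Made-up tweets; None stands for a missing Text value.
tweets = pd.DataFrame({
    "Text": ["a cluster was reported", "nothing relevant", None, "is this a cluster?"],
})

# regex=False: the keyword is matched as a literal substring.
# na=False: rows with missing Text count as "no match" instead of NaN.
mask = tweets["Text"].str.contains("cluster", na=False, regex=False)
filtered = tweets[mask]

# With regex=False, characters such as "?" have no special meaning.
literal = tweets["Text"].str.contains("cluster?", na=False, regex=False)

print(filtered)
```

Matching is case-sensitive by default, so "Cluster" would not match "cluster" unless `case=False` is passed.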

5. Aggregate daily
port_9 = port_5.copy()
port_9 = port_9.groupby(['Created_At']).agg(
    {'Created_At': ['size']}).reset_index()
port_9.columns = ['Created_At', 'Line count']
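
The exported code counts rows per day with `agg` and then renames the columns. When writing this by hand, a shorter form should give the same two columns. This is a sketch on made-up data, and the variable names `filtered` and `daily` are chosen only for illustration.

```python
import datetime
import pandas as pd

# Made-up filtered tweets: two on July 27 and one on July 28.
filtered = pd.DataFrame({
    "Created_At": [
        datetime.date(2020, 7, 27),
        datetime.date(2020, 7, 27),
        datetime.date(2020, 7, 28),
    ],
    "Text": ["cluster a", "cluster b", "cluster c"],
})

# size() counts rows per group; reset_index(name=...) names the count column.
daily = filtered.groupby("Created_At").size().reset_index(name="Line count")

print(daily)
```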

Visualize and discuss

Cluster

The word "cluster" is widely recognized as a symbol of explosive infection. The spike on 8/9 is probably due to the [Cluster Festival held in Shibuya](https://news.yahoo.co.jp/articles/76e47dc2ce6608e018fe37bc92be296e381f76fa?page=1).

[Abenomask](https://sunfish.nehan.io/projects/d2b98c5d-ef62-476d-81a5-f7ffff5c4ce7/nodes/node_6LbZiiiO7U569CmOj2hZ/visualize/xzmYA2dBkJKvONwXpYEy8W5PoD

I also looked at this word, which already feels nostalgic.

Self-restraint

A new lifestyle is taking root, but the mood of self-restraint does not seem to be completely over. The count appears to be decreasing gradually.
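
Each of the three words above repeats steps 4 and 5 with a different keyword. As a sketch of how that repetition could be written by hand, the hypothetical helper `daily_counts` below (not part of nehan's export) builds one count column per keyword. The tweets are made up for illustration.

```python
import datetime
import pandas as pd


def daily_counts(df, keywords):
    """Return one row per day and one column per keyword with tweet counts.

    Hypothetical helper: repeats the filter and the daily count per keyword.
    """
    columns = {}
    for word in keywords:
        hits = df[df["Text"].str.contains(word, na=False, regex=False)]
        columns[word] = hits.groupby("Created_At").size()
    # Days on which a keyword never appears get a count of 0.
    return pd.DataFrame(columns).fillna(0).astype(int)


# Made-up tweets for illustration only.
tweets = pd.DataFrame({
    "Created_At": [
        datetime.date(2020, 7, 27),
        datetime.date(2020, 7, 27),
        datetime.date(2020, 7, 28),
    ],
    "Text": ["cluster found", "self-restraint continues", "another cluster"],
})

counts = daily_counts(tweets, ["cluster", "self-restraint"])
print(counts)
```

If matplotlib is installed, calling `counts.plot()` should draw one line per keyword, which is roughly what the charts in this section show.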

Summary

To get exact results, much more preprocessing would really be needed, but I kept the processing simple for a rough look at the data and as an introduction to nehan. The source code above is a copy of the code output by nehan's Python export function.
