[New Corona] Is the next peak in December? I tried trend analysis with Python!

Trend.png

0. Summary

(1) I tried a trend analysis of the new coronavirus with Python's statsmodels. (2) Analysis of data from the Ministry of Health, Labor and Welfare showed that the first peak was in April and the latest peak was in August. (3) If that four-month interval repeats, the next peak would be in December.

1. What I did

(1) Obtained data on new coronavirus infections from the Ministry of Health, Labor and Welfare website (2) Decomposed it into trend, seasonal, and residual components with Python's statsmodels

(Special thanks) I referred to "Let's start data analysis with Momoki". Thank you very much.

2. Obtaining data on new coronavirus infections from the Ministry of Health, Labor and Welfare website

Download the daily number of positive cases from the Ministry of Health, Labor and Welfare website. I was very impressed with how easy it was to download the CSV data. The Ministry of Health, Labor and Welfare is amazing!

For the detailed analysis method, refer to "Let's start data analysis with Momoki" mentioned above.

First, the preparation.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
%matplotlib inline

Next, read the downloaded data. It contains data from January 16th onward.

df=pd.read_csv('pcr_positive_daily.csv')
df.head()

pic1.png

By the way, the latest data is from the day before yesterday. Normally you can get the previous day's data, but it is only updated in the evening. Admirably, it is updated even on Saturdays and Sundays!

df.tail()

pic2.png
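As a small aside, the dates can be parsed while reading the file, which makes checks such as "what is the latest date?" easy. The following is only a minimal sketch: the rows are made up for illustration (they are not MHLW figures), and I am assuming the column names and the `2020/1/16` date format match the downloaded file.

```python
import io
import pandas as pd

# Made-up rows standing in for pcr_positive_daily.csv (illustrative values only).
sample_csv = io.StringIO(
    "date,Number of positive PCR tests(Single day)\n"
    "2020/1/16,1\n"
    "2020/1/17,0\n"
    "2020/1/18,0\n"
)

# parse_dates converts the column to datetimes; index_col makes it the index.
sample = pd.read_csv(sample_csv, parse_dates=["date"], index_col="date")

print(sample.index.min())  # first date in the file
print(sample.index.max())  # most recent date in the file
```

With the real file, `sample_csv` would simply be replaced by the file name.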

Now, let's look at how the number of infected people has changed from the latter half of January, when the data begins, up to the present.

%matplotlib inline
df.plot()

PCR1.png

By the way, if you use matplotlib as it is, Japanese text in the graph is rendered as empty boxes, the so-called "tofu" (laughs). The following page was very helpful for solving this. Thank you very much. "How to display Japanese in Google Colaboratory graphs (matplotlib)!"
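The general idea behind avoiding "tofu" is to tell matplotlib to use a font that contains Japanese glyphs. The sketch below is not necessarily what the linked page does; the font names in `candidates` are just my guesses at fonts that may be installed, so adjust them to your environment.

```python
from matplotlib import font_manager
import matplotlib.pyplot as plt

# Candidate Japanese-capable fonts (assumed names; any of them may be missing).
candidates = ["IPAexGothic", "Noto Sans CJK JP", "Yu Gothic", "Hiragino Sans"]

# Names of the fonts matplotlib has found on this machine.
installed = {font.name for font in font_manager.fontManager.ttflist}
available = [name for name in candidates if name in installed]

if available:
    # Use the first candidate that is actually installed.
    plt.rcParams["font.family"] = available[0]
else:
    print("No candidate font found; Japanese labels may still render as tofu.")
```

If none of the candidates is found, a Japanese font has to be installed first.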

3. Decompose into trend, seasonal, and residual components with Python's statsmodels

Now, finally, the main subject. The MHLW data is decomposed using Python's statsmodels.

# Daily positive counts as a float Series indexed by date
numbers = pd.Series(df['Number of positive PCR tests(Single day)'], dtype='float')
numbers.index = pd.to_datetime(df['date'])

# Decompose into trend, seasonal, and residual components
res = sm.tsa.seasonal_decompose(numbers)

original = numbers       # Original data
trend = res.trend        # Trend data
seasonal = res.seasonal  # Seasonal data
residual = res.resid     # Residual data

plt.figure(figsize=(8, 8))  # Create the figure and set its size

# Plot of original data
plt.subplot(411)  # 4 rows x 1 column, 1st position (top)
plt.plot(original)
plt.ylabel('Original')

# Plot of trend data
plt.subplot(412)  # 4 rows x 1 column, 2nd position
plt.plot(trend)
plt.ylabel('Trend')

# Plot of seasonal data
plt.subplot(413)  # 4 rows x 1 column, 3rd position
plt.plot(seasonal)
plt.ylabel('Seasonality')

# Plot of residual data
plt.subplot(414)  # 4 rows x 1 column, 4th position (bottom)
plt.plot(residual)
plt.ylabel('Residuals')

plt.tight_layout()  # Automatically adjust the spacing between graphs

The results are as follows. As noted in the code, from the top they are: ① original data, ② trend data, ③ seasonal data, ④ residual data.

PCR2.png

Please pay attention to the trend data in the second panel. The first peak is in early April and the latest peak is in early August, about four months apart. If that interval repeats, will the next peak be in early December?
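The "next peak" reasoning is just date arithmetic, and it can be sketched as follows. This is not a forecast: two peaks are far too few to establish a cycle. The curve here is synthetic (two bumps whose centers, April 10 and August 7, and heights are values I chose for illustration), and `scipy.signal.find_peaks` is my own choice of tool, not something used in the analysis above.

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

# Synthetic smooth "trend" (not MHLW data): two bumps at assumed dates.
days = pd.date_range("2020-01-16", "2020-08-31", freq="D")
t = np.arange(len(days))
first, second = 85, 204  # day offsets for 2020-04-10 and 2020-08-07 (assumed)
values = (500 * np.exp(-((t - first) / 15) ** 2)
          + 1300 * np.exp(-((t - second) / 15) ** 2))
trend = pd.Series(values, index=days)

# Local maxima that stand out from their surroundings by at least 100.
peak_idx, _ = find_peaks(trend.to_numpy(), prominence=100)
peak_dates = trend.index[peak_idx]

# Naive extrapolation: assume the next peak comes after the same interval.
interval = peak_dates[-1] - peak_dates[-2]
next_peak = peak_dates[-1] + interval

print(list(peak_dates.strftime("%Y-%m-%d")))
print(interval.days, next_peak.strftime("%Y-%m-%d"))
```

With the real data, `trend` would be `res.trend.dropna()`, and the `prominence` threshold would need tuning so that small bumps are not counted as peaks.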

4. Finally

I just pray that the new coronavirus pandemic ends soon. However, I recognize that the reality is harsh.

The disease caused by the new coronavirus is called COVID-19, but the virus itself is called SARS-CoV-2. It seems the virus was given this name because it resembles the SARS virus.

SARS spread in 2002, almost 20 years ago. However, it seems that a SARS vaccine has still not been made.

Because these are difficult times, I would like to quietly do what I can.

I feel lonely now that drinking parties have disappeared because of the new coronavirus, but it has also been a good opportunity to promote rational work styles such as telecommuting.

Let's live calmly at times like this.

Last but not least, I would like to thank everyone behind the sites I referred to.
