[PYTHON] Predict the number of people infected with COVID-19 with Prophet

Overview

Google has begun publishing COVID-19 forecast data on its dashboard. This is something of a rehash, but I wanted to see what Prophet would predict. I used the dataset on the number of infected people in Japan that is available on Kaggle.

- Implementation date: November 24, 2020
- Package: Prophet

Predict the number of infected people with Prophet

Dataset

- Period: 2020/2/6 ~ 2020/11/20 (it seems to be updated every 3 days)
- Domestic: cases within Japan
- Airport: airport testing
- Returnee: returnees
- Positive: number of positives
- Tested: number of people tested

There are other columns, but they have scattered missing values, so this time I will use only Positive for Domestic and Airport.

import numpy as np 
import pandas as pd 
from fbprophet import Prophet
from fbprophet.plot import add_changepoints_to_plot

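# Load the Kaggle CSV and split it by location
df = pd.read_csv('covid_jpn_total_1124.csv')
df_dom = df[df['Location'] == 'Domestic'].copy()  # copy so new columns can be added safely
#print(df_dom.isnull().sum())
df_air = df[df['Location'] == 'Airport'].copy()
#print(df_air.isnull().sum())

# Drop rows with any missing value, then show summary statistics
df_air = df_air.dropna(how='any')
print(df_air.describe())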

image.png

The raw data is cumulative, so take the difference between consecutive rows to get daily values.

- pos_def: number of positives per day
- test_def: number of tests per day (I intend to use it to predict the positive rate, but I will not use it this time)

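# Daily positives: subtract the previous row's cumulative count (column 2); the first row is set to 0
arr3 = [0]
arr1 = np.array(df_dom.iloc[1:,2])
arr2 = np.array(df_dom.iloc[:-1,2])
arr3 = np.append(arr3, arr1 - arr2)
df_dom['pos_def'] = arr3

# Daily tests: the same differencing on the cumulative tested count (column 3)
arr3 = [0]
arr1 = np.array(df_dom.iloc[1:,3])
arr2 = np.array(df_dom.iloc[:-1,3])
arr3 = np.append(arr3, arr1 - arr2)
df_dom['test_def'] = arr3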
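
The manual shift-and-subtract above does the same job as pandas' built-in `Series.diff()`. A minimal sketch on made-up cumulative counts (the numbers, the column names, and the helper name `daily_from_cumulative` are illustrative assumptions, not values from the Kaggle file):

```python
import pandas as pd

# Toy cumulative counts; values are made up for illustration only.
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2020-11-01", "2020-11-02", "2020-11-03", "2020-11-04"]),
    "Positive": [100, 130, 130, 190],
})

def daily_from_cumulative(cumulative: pd.Series) -> pd.Series:
    """Return row-to-row increases of a cumulative series, with 0 for the first row."""
    return cumulative.diff().fillna(0).astype(int)

toy["pos_def"] = daily_from_cumulative(toy["Positive"])
print(toy)
```

Note that the difference is taken between consecutive rows, so it is a true per-day count only when the data has exactly one row per day.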

Prepare a Dataframe according to the Prophet specifications

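# Prophet requires a 'ds' (date) column and a 'y' (value) column
df_test = pd.DataFrame()
# 'DS' is the date column name used in the original code; adjust it if your CSV differs
df_test['ds'] = pd.to_datetime(df_dom['DS'])
# pos_def was computed on df_dom above; for the airport forecast, compute it on df_air the same way
df_test['y'] = df_dom['pos_def']
print(df_test)
df_test.iloc[:,1].plot()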

image.png
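
Prophet fits on a frame with two columns: `ds` (the date) and `y` (the value to forecast). As a sketch, a small helper can build that frame and drop rows whose date or value cannot be parsed before they reach `fit`. The helper name `to_prophet_frame`, the column names, and the sample rows are hypothetical; none of them is part of Prophet or the Kaggle file.

```python
import pandas as pd

def to_prophet_frame(df: pd.DataFrame, date_col: str, value_col: str) -> pd.DataFrame:
    """Build a two-column frame (ds, y) sorted by date.

    Hypothetical helper for illustration; not part of the Prophet API.
    """
    out = pd.DataFrame({
        "ds": pd.to_datetime(df[date_col], errors="coerce"),
        "y": pd.to_numeric(df[value_col], errors="coerce"),
    })
    # Drop rows whose date or value could not be parsed
    out = out.dropna(subset=["ds", "y"])
    return out.sort_values("ds").reset_index(drop=True)

# Made-up rows: out of order, with one unparseable date
raw = pd.DataFrame({
    "Date": ["2020-11-02", "2020-11-01", "not a date"],
    "pos_def": [30, 0, 5],
})
frame = to_prophet_frame(raw, "Date", "pos_def")
print(frame)
```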

Fit the Prophet model to the prepared data and execute the prediction including the next 30 days

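# Less than a year of data, so yearly seasonality is off; weekly seasonality captures day-of-week effects.
# daily_seasonality targets sub-daily data and should add little with one row per day.
m = Prophet(yearly_seasonality=False, weekly_seasonality=True, daily_seasonality=True)
m.fit(df_test)
# Extend the history by 30 daily steps and predict over history + future
future = m.make_future_dataframe(periods=30, freq='D', include_history=True)
#future.tail()
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()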

Draw result

fig = m.plot(forecast, figsize=(20, 10))
ax = fig.gca()
# Overlay the detected trend changepoints (the return value is a list of artists, not an axis)
add_changepoints_to_plot(ax, m, forecast)
ax.set_title("Positive", size=16)
ax.set_xlabel("date", size=16)
ax.set_ylabel("# Positives", size=16)
ax.tick_params(axis="x", labelsize=14)
ax.tick_params(axis="y", labelsize=14)

image.png

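Black dots are actual data (ground truth). The light blue region shows the upper and lower bounds of the uncertainty interval. Because the code above leaves `interval_width` at Prophet's default of 0.8, this is an 80% interval; a 95% interval would require `interval_width=0.95`. As the graph shows, the fitted curve follows the observed data closely. The model predicts that the third wave will peak at just over 3,000 infected people in early December and then subside.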
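
One way to put a number on how well the band traces the data is to check what share of the observed points fall inside `[yhat_lower, yhat_upper]`. The sketch below assumes a forecast frame with Prophet's `ds`, `yhat_lower`, and `yhat_upper` columns and an actuals frame with `ds` and `y`. The function name `interval_coverage` and the sample numbers are made up for illustration.

```python
import pandas as pd

def interval_coverage(actual: pd.DataFrame, forecast: pd.DataFrame) -> float:
    """Fraction of observed y values lying inside the forecast interval."""
    merged = actual.merge(
        forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds", how="inner"
    )
    # between() is inclusive on both ends and compares row by row
    inside = merged["y"].between(merged["yhat_lower"], merged["yhat_upper"])
    return float(inside.mean())

# Made-up values: the first two points fall inside the band, the last two do not
dates = pd.date_range("2020-11-01", periods=4, freq="D")
actual = pd.DataFrame({"ds": dates, "y": [10, 20, 35, 50]})
forecast = pd.DataFrame({
    "ds": dates,
    "yhat_lower": [5, 15, 20, 55],
    "yhat_upper": [15, 25, 30, 70],
})
coverage = interval_coverage(actual, forecast)
print(f"coverage: {coverage:.2f}")
```

A coverage far below the interval width the model was fitted with (Prophet's `interval_width` parameter) would suggest the band is too narrow.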

This is probably too optimistic, but it is what the model extrapolates from past changes in the number of infected people. Prophet captures seasonal fluctuations, but since the data covers less than a year, it cannot learn any tendency for infections to increase in winter. If the coronavirus stays widespread for three or four years, I think such a tendency would show up in the data, though I hope it does not come to that. Accuracy cannot be expected unless other explanatory variables are added and the model is made multivariate.
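
Prophet accepts extra explanatory variables through `add_regressor`. A rough sketch, assuming the daily test count (`test_def`) is the regressor: Prophet needs the regressor's value for every future date as well, so this sketch fills the forecast horizon with the mean of the last few observed values, which is a crude assumption. The helper names `extend_regressor` and `fit_with_regressor` are hypothetical, and the import falls back to `prophet`, the package's name in newer releases.

```python
import pandas as pd

def extend_regressor(history: pd.DataFrame, future: pd.DataFrame,
                     col: str, window: int = 7) -> pd.DataFrame:
    """Attach `col` to `future`: observed values where known, otherwise the
    mean of the last `window` observations (a naive assumption)."""
    out = future.merge(history[["ds", col]], on="ds", how="left")
    out[col] = out[col].fillna(history[col].tail(window).mean())
    return out

def fit_with_regressor(history: pd.DataFrame, col: str, periods: int = 30) -> pd.DataFrame:
    """Fit Prophet on ds/y plus one extra regressor and forecast `periods` days."""
    try:
        from fbprophet import Prophet  # package name used in this article
    except ImportError:
        from prophet import Prophet    # package name in newer releases
    m = Prophet(yearly_seasonality=False, weekly_seasonality=True)
    m.add_regressor(col)
    m.fit(history[["ds", "y", col]])
    future = m.make_future_dataframe(periods=periods, freq="D")
    future = extend_regressor(history, future, col)
    return m.predict(future)

# Toy check of the regressor fill only; the numbers are made up
hist = pd.DataFrame({
    "ds": pd.date_range("2020-11-01", periods=3, freq="D"),
    "y": [10, 12, 15],
    "test_def": [100.0, 200.0, 300.0],
})
fut = pd.DataFrame({"ds": pd.date_range("2020-11-01", periods=5, freq="D")})
filled = extend_regressor(hist, fut, "test_def", window=2)
print(filled)
```

Whether the number of tests actually improves the forecast is untested here. The point is only the mechanics of supplying a regressor for both the history and the future dates.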

For comparison, Google's forecast as of 11/24 is shown in the figure below. As in the Prophet result, the number exceeds 3,000 in early December, but unlike Prophet, Google's forecast keeps increasing steadily after that.

image.png

Incidentally, Prophet's forecast of the number of positives from airport testing is shown in the figure below.

image.png

In a few weeks, I will rerun the forecast with the same code and compare it with this result. My guess is that it will be around 4,000 people.
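
When the later figures arrive, the comparison can be made by joining the saved forecast to the new actuals by date and computing an error metric. A minimal sketch, assuming the forecast was saved with Prophet's `ds` and `yhat` columns; `forecast_mae` is a hypothetical helper, and the numbers are placeholders rather than real counts.

```python
import pandas as pd

def forecast_mae(saved_forecast: pd.DataFrame, new_actual: pd.DataFrame) -> float:
    """Mean absolute error between saved yhat and later-observed y, matched on ds."""
    merged = new_actual.merge(saved_forecast[["ds", "yhat"]], on="ds", how="inner")
    return float((merged["y"] - merged["yhat"]).abs().mean())

# Placeholder values for illustration only
dates = pd.date_range("2020-12-01", periods=3, freq="D")
saved_forecast = pd.DataFrame({"ds": dates, "yhat": [2000.0, 2100.0, 2200.0]})
new_actual = pd.DataFrame({"ds": dates, "y": [2100, 2000, 2500]})
mae = forecast_mae(saved_forecast, new_actual)
print(f"MAE: {mae:.1f}")
```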
