[PYTHON] I tried to predict the number of people infected with coronavirus in Japan using the method from a recent Chinese paper

Introduction

I am not an infectious disease expert, so please read this with that in mind.

The novel coronavirus pneumonia (COVID-19) that broke out in Wuhan, Hubei Province, China in December 2019 has spread to Japan, and the number of infected people is increasing. I wanted to know how the number of infected people in Japan will grow, so I searched for papers on infection prediction models. Infection models have already been published from all over the world, but parameters such as the infection rate and the quarantine rate strongly affect the accuracy of a prediction. If these parameters differ from the real values, the prediction will be far from the actual situation.

The other day, Chinese researchers published a model for predicting the number of people infected with the coronavirus: COVID-19 in Japan: What could happen in the future?

The paper shows that the model predicts the number of infected people accurately when applied to various parts of China (Wuhan, Beijing, Shanghai ...), and, applying the same model to Japan, it predicts that the number of infected people will keep increasing. The paper also gives the infection rate and quarantine rate needed for the prediction (although it would be more accurate to say that these parameters were adjusted to match the actual number of infected people).

Predictive model

The SIR and SEIR models are commonly used as infectious disease models, but this paper uses a slightly different one, called the statistical time delay dynamic model.

model_picture.png

[Quote COVID-19 in Japan: What could happen in the future?]

The calculation formula is as follows.

model.png

t: day
I(t): Cumulative number of infected people
J(t): Cumulative number of infected people (onset confirmed at a hospital)
G(t): Number of infected people at time t whose onset is not yet confirmed (an instantaneous value, not cumulative)
I0(t): Number of potential infected people (infected but not confirmed or quarantined)

      I0(t) = I(t) − J(t) − G(t)

[Quote COVID-19 in Japan: What could happen in the future?]

Parameters

Using this model requires several important parameters: the growth rate, the isolation rate, the incidence rate, and the hospitalization rate. The paper lists the following example values for China.

Area  Growth rate r  Isolation rate l1  Isolation rate l2  tl
Shanghai 0.3137 0.1713 0.6149 2020/1/16
Beijing 0.3125 0.1824 0.5880 2020/1/17
Wuhan 0.3019 0.1142 0.4567 2020/1/17

r Growth rate

The growth rate is the rate at which one infected person infects others. Based on the values for China in the paper, I set it as follows.
  r = 0.3

l Isolation rate

The isolation rate is the rate at which infected people are quarantined. Based on the values for China in the paper, I set it as follows.
  l = 0.1 (until 2020/2/28)
  l = 0.5 (from 2020/2/29)
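As a small sketch of how these calendar dates map onto simulation days, assume day 0 is 2020/2/14, the start date used for the forecast in this post. The names START, SWITCH, day_index and isolation_rate are made up for this illustration and do not appear in the program.

```python
from datetime import date

START = date(2020, 2, 14)    # assumed day 0 of the simulation
SWITCH = date(2020, 2, 29)   # first day on which l = 0.5 applies

def day_index(d):
    """Number of days between the simulation start and date d."""
    return (d - START).days

def isolation_rate(d, l1=0.1, l2=0.5):
    """Return l1 before the switch date and l2 from the switch date on."""
    return l1 if d < SWITCH else l2

print(day_index(SWITCH))
print(isolation_rate(date(2020, 2, 28)), isolation_rate(date(2020, 2, 29)))
```

Under that assumption the switch date falls on day 15, which is the value given to tl in the program.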

Incidence rate

f2(t): Probability of transition from infection to onset
The paper does not give this function. I have seen news reports that 70-80% of infected people do not develop the disease, and onset takes 14 days at the longest, so I set it as follows.
  f2(t) = 0.2/14 × t (t < 14)
  f2(t) = 0.2 (t >= 14)

Hospitalization rate

f4(t): Probability of transition from infection to hospitalization
The rate at which infected people are hospitalized is completely unknown. Assuming that 1/4 of the people who develop the disease are hospitalized, I set it as follows.
  f4(t) = 0.05/14 × t (t < 14)
  f4(t) = 0.05 (t >= 14)
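Both functions have the same shape: a linear ramp over 14 days followed by a constant. As a rough sketch, they can be written with one shared helper. The helper name ramp is hypothetical and is not used in the program in this post, which spells the two cases out with if/else.

```python
def ramp(t, cap, days=14):
    """Rise linearly from 0 at t = 0 to `cap` at t = days, then stay at `cap`."""
    return cap * min(t, days) / days

def f2(t):
    # Incidence rate: reaches 0.2 after 14 days
    return ramp(t, 0.2)

def f4(t):
    # Hospitalization rate: reaches 0.05 after 14 days
    return ramp(t, 0.05)

for t in (0, 7, 14, 21):
    print(t, f2(t), f4(t))
```

With these settings f4(t) is always one quarter of f2(t), which is the 1/4 assumption stated above.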

Program

predict.py


from __future__ import print_function
import numpy as np
import pandas as pd

default_output  = 'predict.csv'

class  Corona():
    def __init__(self, max_day):
        self.r  = 0.3 #growth rate
        self.tl = 15  #Quarantine start date
        self.l1 = 0.1 #Isolation rate before the quarantine start date
        self.l2 = 0.5 #Isolation rate after the quarantine start date
        self.i  = np.zeros(max_day + 1)
        self.i0 = np.zeros(max_day + 1)
        self.j  = np.zeros(max_day + 1)
        self.g  = np.zeros(max_day + 1)

    def set_start(self):
        # self.j[0]  = 11 # 2020/1/30
        self.j[0]  = 21 # 2020/2/14
        self.i[0]  = self.j[0] * 10
        self.g[0]  = 0
        self.i0[0] = self.func_i0(0)

    def f2(self, t):
        #Incidence rate
        if t < 14:
            a = 0.2/14.0
            b = 0.0
            y = a * t + b
        else:
            y = 0.2
        return y

    def f4(self, t):
        #Hospitalization rate
        if t < 14:
            a = 0.05/14.0
            b = 0.0
            y = a * t + b
        else:
            y = 0.05
        return y

    def func_l(self, t):
        #Isolation rate
        if t < self.tl:
            return self.l1
        else:
            return self.l2

    def func_i(self, t):
        #Cumulative number of infected people
        # I(t + 1) = I(t) + r I0(t),
        new_i = self.i[t] + self.r * self.i0[t]
        return new_i

    def func_j(self, t):
        #Cumulative number of infected people(Confirmed at the hospital)
        # J(t + 1) = J(t) + r Σs<t f4(t - s) I0(s)
        sum1 = 0
        for s in range(t):
            sum1 += self.f4(t - s) * self.i0[s]
        new_j = self.j[t] + self.r * sum1
        return new_j

    def func_g(self, t):
        #Number of infected people at time t (instantaneous, not yet confirmed at the hospital)
        # G(t + 1) = G(t) + l(t) Σs<t f2(t - s) I0(s) - l(t) Σs<t f4(t - s) I0(s)
        sum1 = 0
        sum2 = 0
        for s in range(t):
            sum1 += self.f2(t - s) * self.i0[s]
        for s in range(t):
            sum2 += self.f4(t - s) * self.i0[s]
        new_g = self.g[t] + self.func_l(t) * sum1 - self.func_l(t) * sum2
        return new_g

    def func_i0(self, t):
        #Number of potential infections(Infected but not confirmed or quarantined)
        # I0(t) := I(t) - J(t) - G(t)
        new_i0 = self.i[t] - self.j[t] - self.g[t]
        if new_i0 < 0.0:
            new_i0 = 0.0
        return new_i0

    def predict(self, day):
        #Initialization
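        period = day + 1
        # One row per forecast day (day 1..day); allocating `period` rows would leave a trailing all-zero row
        predict_data = np.zeros([period - 1, 5])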
        df_predict = pd.DataFrame(predict_data, columns=['day', 'I', 'J', 'G', 'I0'])
        self.set_start()

        #Forecast
        for i in range(period - 1):
            self.i[i+1]  = self.func_i(i)
            self.j[i+1]  = self.func_j(i)
            self.g[i+1]  = self.func_g(i)
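            # Note: func_i0(i) evaluates I(i) - J(i) - G(i), so i0[i+1] stores the day-i value.
            # Using func_i0(i+1) would follow I0(t) = I(t) - J(t) - G(t) exactly, but changes the forecast numbers.
            self.i0[i+1] = self.func_i0(i)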

            df_predict.loc[i, 'day'] = i+1
            df_predict.loc[i, 'I']   = self.i[i+1]
            df_predict.loc[i, 'J']   = self.j[i+1]
            df_predict.loc[i, 'G']   = self.g[i+1]
            df_predict.loc[i, 'I0']  = self.i0[i+1]

        return df_predict

def main():
    corona = Corona(25)
    predict = corona.predict(25)
    predict.to_csv(default_output, index=False)
 
if __name__ == "__main__":
    main()
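
For reference, the update rules that the program applies can be written out as follows. This is read off from the code of predict.py in this post, not copied from the paper, so the paper's own notation and summation limits may differ. Here r is the growth rate, l(t) is the isolation rate (l1 before day tl, l2 from day tl on), and the max reflects the clamp to zero in func_i0.

```latex
\begin{aligned}
I(t+1) &= I(t) + r\, I_0(t) \\
J(t+1) &= J(t) + r \sum_{s<t} f_4(t-s)\, I_0(s) \\
G(t+1) &= G(t) + l(t) \sum_{s<t} f_2(t-s)\, I_0(s) - l(t) \sum_{s<t} f_4(t-s)\, I_0(s) \\
I_0(t) &= \max\bigl(I(t) - J(t) - G(t),\, 0\bigr)
\end{aligned}
```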

Results

Number of infected people (those who have been confirmed to be infected at the hospital)

I took the number of infected people at the start of the simulation from the press releases published daily by the Ministry of Health, Labour and Welfare. Starting from 21 infected people on 2020/2/14, I forecast from 2/15 onward. The results are as follows.

predict_japan1_20200303.png

Date  Number of infected people (announced by the Ministry of Health, Labour and Welfare)  Number of infected people (forecast)
2020/2/14 21 21
2020/2/15 21 21
2020/2/16 21 21
2020/2/17 46 22
2020/2/18 53 23
2020/2/19 60 25
2020/2/20 70 29
2020/2/21 79 35
2020/2/22 90 43
2020/2/23 114 54
2020/2/24 126 69
2020/2/25 140 90
2020/2/26 149 118
2020/2/27 171 153
2020/2/28 195 200

The error is large on the intermediate days, but on 2/28 the forecast was off by only 5 people. (This may just be a coincidence ...)
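To make the size of the error on each day concrete, here is a short sketch that recomputes it from the table above. The two lists are copied from the table, and the variable names are only for this illustration.

```python
# Values copied from the table above (2/14 to 2/28)
reported = [21, 21, 21, 46, 53, 60, 70, 79, 90, 114, 126, 140, 149, 171, 195]
forecast = [21, 21, 21, 22, 23, 25, 29, 35, 43, 54, 69, 90, 118, 153, 200]
dates = ["2/%d" % d for d in range(14, 29)]

# Signed error: positive means the forecast is above the announced number
errors = [f - r for f, r in zip(forecast, reported)]

# Index of the day with the largest absolute error
worst = max(range(len(errors)), key=lambda k: abs(errors[k]))

print("error on", dates[-1], "=", errors[-1])
print("largest error:", errors[worst], "on", dates[worst])
```

The forecast stays below the announced numbers for most of the period and only catches up at the very end, which is why the good match on 2/28 should not be over-interpreted.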

Number of potential infections (those who have not been confirmed to be infected at the hospital)

The numbers above cover only people whose infection was confirmed at a hospital. The simulation also calculates the number of potential infections. The result is shown in the figure below.
Orange: People confirmed to be infected at a hospital (cumulative)
Navy blue: Hidden infected people whose infection has not been confirmed at a hospital (cumulative)
There are far more hidden infected people than confirmed ones, about 10 times as many.

predict_japan2_20200303.png

Impressions

Is the Ministry of Health, Labor and Welfare also conducting such a simulation?

References

https://www.medrxiv.org/content/10.1101/2020.02.21.20026070v2

2020/3/3 update

Program fix

Someone else wrote a Hatena Blog post using the program from this post.   https://kibashiri.hatenablog.com/entry/2020/03/02/171223 In that post, the author pointed out a mistake in the program. There was indeed a sign error in the calculation of func_g(). I have fixed the program and the results.

Prediction result

The model predicts that from 2/29 the number of infected people will increase by nearly 100 every day. The Ministry of Health, Labour and Welfare has announced new figures (domestic cases, excluding charter flight returnees), so I compared them with the forecast.

Date  Number of infected people (announced by the Ministry of Health, Labour and Welfare)  Number of infected people (forecast)  Number of PCR tests (per day)
2020/2/29 215 259 130
2020/3/1 224 334 178
2020/3/2 239 428 96
2020/3/3 253 546 71

The predicted values differ significantly from the actual numbers, so the forecast did not perform well after 2/29. I tried to predict the number of infected people in Japan with the same parameters as in China, but the situation in Japan differs from that in China, and there is a limit to what can be predicted with the same parameters.
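The gap is easiest to see in the day-to-day increases. The sketch below recomputes them from the two tables in this post, taking the 2/28 values from the first table. The helper name daily_increase is made up for this illustration.

```python
# 2/28, 2/29, 3/1, 3/2, 3/3 (values copied from the tables in this post)
reported = [195, 215, 224, 239, 253]
forecast = [200, 259, 334, 428, 546]

def daily_increase(values):
    """Difference between each day and the previous day."""
    return [b - a for a, b in zip(values, values[1:])]

print("forecast:", daily_increase(forecast))
print("reported:", daily_increase(reported))
```

The forecast increments keep growing from day to day, while the announced increments stay at 20 or fewer.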

According to media reports, Japan's PCR testing capacity is 3,800 people per day, but the actual numbers of people tested per day were 130, 178, 96, and 71, which is surprisingly small.
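As a quick check of how small that is, the following sketch expresses each daily test count as a share of the reported capacity. It uses only the numbers quoted above, and the variable names are for illustration only.

```python
capacity = 3800                      # reported PCR capacity, people per day
tests = {"2/29": 130, "3/1": 178, "3/2": 96, "3/3": 71}

# Share of the reported capacity actually used on each day
shares = {day: n / capacity for day, n in tests.items()}

for day, share in shares.items():
    print(day, "%.1f%%" % (100 * share))
```

On every one of these days the number tested is below 5% of the reported capacity.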
