[PYTHON] [SIR model analysis] Transform the formula to determine γ and the effective reproduction number R ♬

When the latest data is naturally extrapolated, the figure below is obtained. I think people who can read the graph will understand it, but I realized that it was a very bad situation. exterpolate_Japan_gamma_R_5.png

So, I would like to talk about the following without touching on this topic. The data handled this time is the data up to 3/4/2020 posted on the reference site. For the number of infections in Japan, the value read from Reference (2) was used. 【reference】 ①Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE (2) New coronavirus seen in data and graphs @ NTV NEWS

What i did

・ Coefficient transformation of SIR model ・ Meaning of SIR model ・ Result explanation ・ Code explanation

・ Coefficient transformation of SIR model

This time, I will try to find the daily change of $ \ gamma $ and the effective reproduction number by transforming the following SIR model as follows.

{\begin{align}
\frac{dS}{dt} &= -\beta \frac{SI}{N} \\
\frac{dI}{dt} &=  \beta \frac{SI}{N} -\gamma I \\
\frac{dr}{dt} &=  \gamma I \\
\end{align} 
}

First, the third equation can be transformed as follows, and since $ I $ and $ r $ are given observation values, $ \ gamma $ can be obtained.

{\begin{align}
\frac{1}{I}\frac{dr}{dt} &=  \gamma \\
\end{align} 
}

Next, if you substitute this $ \ gamma $ into the second equation

{\begin{align}
\frac{1}{\gamma I}\frac{dI}{dt} &= R-1\\
R &=  \frac{\beta}{\gamma} \frac{S}{N}\\
\end{align} 
}

Therefore, R is obtained.

・ Meaning of SIR model

Using these two quantities, the above differential equation can be rewritten as follows.

{\begin{align}
\frac{dS}{dt} &= -\gamma R I \\
\frac{dI}{dt} &=  \gamma (R - 1) I \\
\frac{dr}{dt} &=  \gamma I \\
\end{align} 
}

Here, $ \ gamma $ is a factor related to (recovery rate, quarantine rate, and mortality rate) and means the rate at which an infected person is removed from the infection. On the other hand, $ R $ is called the effective reproduction number, which is a quantity related to the number of people infected by one infected person. Therefore, the first equation shows that the larger $ R $ and $ \ gamma $, the more the infection progresses in proportion to the infected person. This is the reason why the spread of infection increases (exponentially) in so-called murine calculations. On the other hand, the third equation means that the larger $ \ gamma $ is, the more heals in proportion to the infected person. The second equation is proportional to $ (R-1) $, and it is shown that the positive or negative value of this value determines whether the number of infected persons increases or decreases. Moreover, since it is proportional to the number of infected people, it can be seen that the change is exponential. And this change is also proportional to $ \ gamma $.

・ Result explanation

removed_Japan_gamma_R_5.png The first thing that draws attention this time is that the number of infections (red) has changed significantly from the 64th day (April 3 to 9 days ago). And the main purpose of this time is to find $ \ gamma $ and $ R $ for this epidemic in Japan. However, the above and above all, the daily data fluctuates greatly, so the problem was how to deal with it. If you look at the data, you can see that the curve of the number of cures is terrible. I thought about this as so-called smoothing (3 points, 5 points, 7 points ...), but it's worse than that, and as a whole, it changes almost linearly, so I fit it with an exponential function. I decided. Similarly, the number of infections was also fitted, but this was done only on the straight line. And this time, it was only used for the extrapolation presented at the beginning, not for this coefficient calculation. The $ I / (R + D) $ curve plots the ratio of infected to non-contributors. This number is the number at which the healing rate increases, and when this number decreases and falls below 1, the end is finally visible. In Japan, it was expected that the number would once be 3 or less and then proceed to 1 as it is, but after the 64th day above, this number also started to rise significantly. What's more, it had been weakly decreasing until then, but it started to rise linearly, which is really surprising. The lower graph shows the $ \ gamma $ and $ R $ that we wanted to find this time, the number of daily cures (5-day average) used in the calculation, and their approximate curves. $ \ Gamma $ divides the daily number of cures (derivative of the cure curve) by the number of infections. However, since the healing number curve fluctuates greatly, the approximate curve was used here. Even with this, there was considerable fluctuation due to the fluctuation of the infection number curve. However, changes were seen every day, and although this number also increased steadily until the 64th day, it fell linearly from there. On the other hand, $ R $ is inversely proportional to this $ \ gamma $, and the result works that way. However, the lowest value is about 3 effective reproduction numbers, and the position is slightly shifted to the left. In any case, this area has completely increased, and it can be seen that the number of effective reproductions is increasing to around 10. The latest values are as follows. The effective reproduction number is a comparison with 1, which is a very large value (a value at which the infection spreads rapidly).

{\begin{align}
\gamma &= 1.6e^{-2} \\
R &= 10 \\
\end{align} 
}

The actual contribution depends on $ \ gamma (R-1) $, and the graph above shows that the contribution of the decrease of $ \ gamma $ and the contribution of the increase of $ R-1 $ are balanced. Therefore, when this amount is graphed, it becomes as follows. As a result, the contribution of $ R $ was large, and once it decreased to around 0.05, it increased after the 64th day, and now it can be seen that it has increased to around 0.14. It can be said that the infection probability has tripled in the last two weeks. removed_Japan_gammaR_5.png

・ Code explanation I put the whole code below collective_particles/fitting_gamma_R.py This time, I will explain only the main parts. First, data reading and processing are as follows.

city = "Japan"

skd=5 #2 #1 #4 #3 #2 #slopes average factor
#Process the data
t_cases = 0
t_recover = 0
t_deaths = 0
for i in range(0, len(data_r), 1):
    if (data_r.iloc[i][1] == city): #for country/region
    #if (data_r.iloc[i][0] == city):  #for province:/state  
        print(str(data_r.iloc[i][0]) + " of " + data_r.iloc[i][1])
        for day in range(4, len(data.columns), 1):            
            confirmed_r[day - 4] += data_r.iloc[i][day]
            if day < 4+skd:
                day_confirmed_r[day-4] += data_r.iloc[i][day]
            else:
                day_confirmed_r[day-4] += (data_r.iloc[i][day] - data_r.iloc[i][day-skd])/(skd)
        t_recover += data_r.iloc[i][day]        
for i in range(0, len(data_d), 1):
    if (data_d.iloc[i][1] == city): #for country/region
    #if (data_d.iloc[i][0] == city):  #for province:/state  
        print(str(data_d.iloc[i][0]) + " of " + data_d.iloc[i][1])
        for day in range(4, len(data.columns), 1):
            confirmed_d[day - 4] += data_d.iloc[i][day] #fro drawings
        t_deaths += data_d.iloc[i][day]        
for i in range(0, len(data), 1):
    if (data.iloc[i][1] == city): #for country/region
    #if (data.iloc[i][0] == city):  #for province:/state  
        print(str(data.iloc[i][0]) + " of " + data.iloc[i][1])
        for day in range(4, len(data.columns), 1):
            confirmed[day - 4] += data.iloc[i][day] -  confirmed_r[day - 4] -confirmed_d[day-4]
            diff_confirmed[day - 4] += confirmed[day-4] /  (confirmed_r[day - 4]+confirmed_d[day-4])
            if day == 4:
                day_confirmed[day-4] += data.iloc[i][day]
            else:
                day_confirmed[day-4] += data.iloc[i][day] - data.iloc[i][day-1]

Healing curve fitting is performed below.

t_max=len(confirmed)
dt=1
t=np.arange(0,t_max,dt)
t1=t

obs_i = confirmed_r[:]
#function which estimate i from seir model func 
def estimate_i(ini_state,r0,alpha):
    est = r0*np.exp(alpha*(t))
    return est

def y(params):
    est_i=estimate_i(ini_state,params[0],params[1])
    return np.sum((est_i-obs_i)*(est_i-obs_i))
r0=1
alpha = 1
ini_state=[5.70579672, 0.00755685]
#optimize logscale likelihood function
mnmz=minimize(y,ini_state,method="nelder-mead")
print(mnmz)
r0,alpha = mnmz.x[0],mnmz.x[1]
est=estimate_i(ini_state,r0,alpha)

The various quantities $ \ gamma $, $ R $, $ C = \ gamma * (R-1) $ are calculated below.

diff_est=[0] * (len(data.columns) - 4)
gamma_est=[0] * (len(data.columns) - 4)
R_est = [0] * (len(data_d.columns) - 4)
R_0 = [0] * (len(data_d.columns) - 4)
C = [0] * (len(data_d.columns) - 4)
for i in range(1,t_max):
    diff_est[i]=est[i]-est[i-1]
for i in range(0, len(confirmed), 1):        
    if confirmed[i] > 0:    
        gamma_est[i]=diff_est[i]/confirmed[i]
        R_est[i]= 1+day_confirmed[i]/diff_est[i] # diff_est=gamma*confirmed
        C[i]=gamma_est[i]*(R_est[i]-1)
    else:
        continue

After that, the above calculation result is drawn.

Summary

・ Gamma and R were obtained from real data ・ It can be said that the situation in Japan has just entered a dangerous zone.

・ I'm not looking for β, so I'll try it. ・ I would like to compare each country with the various parameters obtained, including those obtained by differential equations.

Recommended Posts

[SIR model analysis] Transform the formula to determine γ and the effective reproduction number R ♬
[SIR model analysis] Determination of γ * (R-1) and peak out of infection number ♬ World edition
Introduction to Time Series Analysis ~ Seasonal Adjustment Model ~ Implemented in R and Python
Experiment and leave evidence to determine the specifications.
Determine the number of classes using the Starges formula
[SIR model analysis] Peak out of infection numbers in Japan and around the world ♬
[SIR model analysis] Peak out of infection numbers in Japan and around the world ♬ Part 2