[PYTHON] Let's calculate the transition of the basic reproduction number of the new coronavirus by prefecture

Introduction

The center of infection for the new coronavirus infection (COVID-19) is already shifting from mainland China to Europe and the United States, and there are emergencies such as restricting free border movement within the EU and issuing a stay-at-home order. It's a situation. In Japan, the threat of the new coronavirus began to be recognized when the Diamond Princess arrived at Yokohama Port on February 3, 2020, the government announced its basic policy on February 25, and in Hokkaido on February 28. Although it was triggered by a state of emergency, concrete measures include a request to refrain from large-scale events from February 26, and temporary closure of elementary, junior high and high schools nationwide from March 2. About half a month has passed since then, and during that time, the Nikkei Stock Average has plummeted from the 22,000 yen level to the 17,000 yen level, and there is a dark cloud in the future. The effects of the new coronavirus infection have seriously affected not only health but also socio-economics, and when it will be resolved is of great concern. Therefore, in this article, I would like to present one method for calculating how the basic reproduction number, which is an index of the infectivity of the new coronavirus in Japan, has changed from the end of January to the beginning of March. I will. The calculation formula and code are a little long, so it may be better to look at the calculation result first.

What is the basic reproduction number?

The basic reproduction number seems to be very profound, [Research Institute for Mathematical Sciences, Kyoto University](https://repository.kulib.kyoto-u.ac.jp/dspace/bitstream/2433/140828/ If you look at 1 / 653-04.pdf) and Society for Mathematical Biology Letter, you can see the transition between generations. It doesn't seem simple, but in a nutshell,

Seems to be defined as. Assuming that this number is R0, ** the condition for spreading infection is $ R_0> 1 $, and the condition for converging infection is $ R_0 <1 $ **.

British Prime Minister Boris Johnson explained at a meeting on March 16 that about 60% of the British people would control the peak until they gained herd immunity (Newsweek article. //www.newsweekjapan.jp/kimura/2020/03/post-74.php)), assuming that $ R_0 = 2.6 $, the herd immunity rate $ H $ in vaccines and autoimmunity is ,

(1-H)R_0 < 1

It seems that $ H> 1-1 / R_0 = 0.615 $ was assumed from the condition that satisfies. In other words, herd immunity relaxes the condition of $ R_0 $. However, the development of vaccines is expected to take more than a year, including clinical trials, and it is a bold strategy to acquire herd immunity by not controlling infection excessively.

Calculation model of basic reproduction number

Now, in order to calculate the basic reproduction number, I would like to consider a simple model based on the above definition and SEIR model. See the following figure. R0_calculation.jpg

The upper row shows the date, and the lower bar shows the infected population one day. The meaning of each symbol is

Represents. That is, the production of I to E on one day t is an infection from a population of I who was infected and developed before t, and the number of reproductions of t on that day is $ R_0 (t) / ip $. It is a model called. Expressing this as an expression,

\frac{1}{ip} R_0(t) \times \sum_{s=t+1}^{t+ip} P(s) = P(t+lp+ip), \\
\therefore R_0(t) = \frac{ ip \times P(t+lp+ip)} {\sum_{s=t+1}^{t+ip} P(s)}

It will be. Below, let's use this formula to calculate the time transition of the basic reproduction number $ R_0 (t) $ from the published data of test positives.

Original data

In this article, we use the csv data published in Map of the number of new coronavirus infections by prefecture (provided by Jag Japan Co., Ltd.). I was allowed to do it. The data of prefectures are collected in one, which makes it very easy to use.

Try to calculate with Python

Now, let's use Python to calculate the basic reproduction number R0 (t) by prefecture.

Prerequisites

As a prerequisite, use the following.

There are various theories about lp and ip, but they are set as a guide. Also, due to the calculation formula, only $ R_0 (t) $ before lp + ip = 13 [day] can be calculated from the latest date of the data. Please note that on the graph, even if the value for the period from 13 days ago to the latest day is 0, it is meaningless.

code

First, import the library.

# coding: utf-8

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import datetime
import locale

Data by prefecture from csv provided in Map of the number of new coronavirus infections by prefecture (provided by Jag Japan Co., Ltd.) A function that reads and converts to a data frame.

def readCsvOfJapanPref(pref : None):
    #Download from the URL below
    # https://jag-japan.com/covid19map-readme/
    fcsv = u'COVID-19.csv'
    df = pd.read_csv(fcsv, header=0, encoding='utf8', parse_dates=[u'Confirmed date YYYYMMDD'])
    #Fixed date,Extract only the prefectures where you receive a medical examination
    df1 = df.loc[:,[u'Confirmed date YYYYMMDD',u'Consultation prefecture']]
    df1.columns = ['date','pref']
    #Extracted by consultation prefecture
    if pref is not None: #If None, then the whole of Japan
        df1 = df1[df1.pref == pref]
    df1 = df1.loc[:,'date']
    #Count on fixed date
    df2 = pd.DataFrame( df1.value_counts() ).reset_index()
    df2.columns = ['date','P']
    df2 = df2.sort_values('date')
    return df2

A function that defines a data frame for calculations. The column Ppre is for the denominator of the formula and Pat is for the numerator of the formula. Starting from January 24, 2020, we will create a calculation frame for days.

def makeCalcFrame(days):
    t_1 = pd.Timestamp(2020,1,24) #Calculation start date
    td = pd.Timedelta('1 days')
    #
    npd = [[t_1 + td * i, 0, 0, 0 ] for i in range(0,days)]
    df1 = pd.DataFrame(npd)
    df1.columns = ['date', 'Ppre','Pat', 'R0']
    #
    return df1

A function that combines prefectural data and calculation frames.

def mergeCalcFrame(df1, df2):
    return pd.merge(df1, df2, on='date', how='left').fillna(0)

A function that calculates $ R_0 $ according to the formula. To simplify the calculation, I created a lambda expression that returns NaN when accessed outside the index range.

def calcR0(df, keys):
    lp = keys['lp']
    ip = keys['ip']
    nrow = len(df)
    getP = lambda s: df.loc[s, 'P'] if s < nrow else np.NaN
    for t in range(nrow):
        df.loc[t, 'Ppre'] = sum([ getP(s) for s in range(t+1, t + ip + 1)])
        df.loc[t, 'Pat' ] = getP(t + lp + ip)
        if df.loc[t, 'Ppre'] > 0:
            df.loc[t, 'R0'  ] = ip * df.loc[t, 'Pat'] / df.loc[t, 'Ppre']
        else:
            df.loc[t, 'R0'  ] = np.NaN
    return df

A function that graphs the results.

def showResult(df, title):
    # R0=1 :Target for convergence
    ptgt = pd.DataFrame([[df.iloc[0,0],1],[df.iloc[len(df)-1,0],1]])
    ptgt.columns = ['date','target']
    # show R0
    plt.rcParams["font.size"] = 12
    ax = df.plot(title=title,x='date',y='R0', figsize=(10,7))
    ptgt.plot(x='date',y='target',style='r--',ax=ax)
    ax.grid(True)
    ax.set_ylim(0,)
    plt.show()

This is the part that calculates for each prefecture.

def R0inJapanPref(pref, label):
    keys = {'lp':5, 'ip':8 }
    df1 = makeCalcFrame(60) # 60 days
    df2 = readCsvOfJapanPref(pref)
    df = mergeCalcFrame(df1, df2)
    df = calcR0(df, keys)
    showResult(df, 'COVID-19 R0 ({})'.format(label))
    return df

preflist = [[None, 'Japan'], [u'Tokyo', 'Tokyo'],\
            [u'Osaka', 'Osaka'],  [u'Aichi prefecture', 'Aichi'],\
            [u'Hokkaido', 'Hokkaido']]
dflist = [[R0inJapanPref(pref, label), label] for pref, label in preflist]

This is the part that displays the above calculation results together.

def showResult2(ax, df, label):
    # show R0
    plt.rcParams["font.size"] = 12
    df1 = df.rename(columns={'R0':label})
    df1.plot(x='date',y=label, ax=ax)

# R0=1
dfs = dflist[0][0]
ptgt = pd.DataFrame([[dfs.iloc[0,0],1],[dfs.iloc[len(dfs)-1,0],1]])
ptgt.columns = ['date','target']
ax = ptgt.plot(title='COVID-19 R0', x='date',y='target',style='r--', figsize=(10,8))
#
for df, label in dflist:
    showResult2(ax, df, label)
#
ax.grid(True)
ax.set_ylim(0,10)
plt.show()

Calculation result

Now let's take a look at the calculation results.

Changes in the number of basic reproduction numbers in Japan as a whole

R0_COVID-19 R0 (Japan).png

Changes in the number of basic reproduction numbers in Tokyo

R0_COVID-19 R0 (Tokyo).png

Changes in the number of basic reproduction numbers in Osaka Prefecture

R0_COVID-19 R0 (Osaka).png

Changes in the basic reproduction number in Aichi Prefecture

R0_COVID-19 R0 (Aichi).png

Changes in the number of basic reproduction numbers in Hokkaido

R0_COVID-19 R0 (Hokkaido).png

Compare the whole

R0_all.png

Consideration

From the above, the following trends can be derived from the simulation regarding the estimation of the basic reproduction number R0 based on the infected person data of each prefecture in Japan.

Furthermore ...

Reference link

I referred to the following page. Map of the number of people infected with the new coronavirus by prefecture (provided by Jag Japan Co., Ltd.) Research Institute for Mathematical Sciences, Kyoto University Society for Mathematical Biology Letter Newsweek article SEIR model Takatsuki City Page

I haven't quoted it directly, but I'll add a very useful link. COVID-19 reports National Cluster Map

Recommended Posts

Let's calculate the transition of the basic reproduction number of the new coronavirus by prefecture
Let's put out a ranking of the number of effective reproductions of the new coronavirus by prefecture
Let's test the medical collapse hypothesis of the new coronavirus
Let's visualize the number of people infected with coronavirus with matplotlib
Plot the spread of the new coronavirus
I tried to tabulate the number of deaths per capita of COVID-19 (new coronavirus) by country
Did the number of store closures increase due to the impact of the new coronavirus?
Calculate the total number of combinations with python
Let's check the population transition of Matsue City, Shimane Prefecture with open data
Minimize the number of polishings by combinatorial optimization
Factfulness of the new coronavirus seen in Splunk
GUI simulation of the new coronavirus (SEIR model)
Let's visualize the rainfall data released by Shimane Prefecture
IDWR bulletin data scraping the number of reports per fixed point of influenza and by prefecture
Let's simulate the effect of introducing a contact tracking app as a countermeasure against the new coronavirus
Significance of narrowing down the test target of PCR test for new coronavirus understood by Bayes' theorem
Let's decide the lecture of PyCon JP 2016 by combinatorial optimization
Let's decide the position of the fire station by combinatorial optimization
Quantify the degree of self-restraint required to contain the new coronavirus
Let's summarize the basic functions of TensorFlow by creating a neural network that learns XOR gates
10. Counting the number of lines
Get the number of digits
Let's visualize the river water level data released by Shimane Prefecture
[Python] Let's reduce the number of elements in the result in set operations
Considering the situation in Japan by statistician Nate Silver, "The number of people infected with coronavirus is meaningless"
Create a bot that posts the number of people positive for the new coronavirus in Tokyo to Slack