[PYTHON] Let's examine the convergence time from the global trend of the effective reproduction number of the new coronavirus

Introduction

I have previously analyzed the effective reproduction number [^1] of the new coronavirus infection (COVID-19) by prefecture and published a [ranking](https://qiita.com/oki_mebarun/items/a21dd3ebf03c64066d29). This time I look at worldwide data and consider whether the convergence time can be predicted from the transition of the effective reproduction number. In particular, since the Tokyo 2020 Olympics are scheduled to be held in July, I hope the situation settles as soon as possible, not only in Japan but around the world.

Conclusion

To state the conclusions briefly first:

  1. In Europe, infection peaked around March 21st, and the effective reproduction number may already have reached $R \leq 1$; this should become observable in the case counts around the beginning of April.
  2. In the United States, Australia, Southeast Asia (Malaysia, Indonesia), and the Middle East (Turkey, Israel), $R$ has stayed at a high level of around 10 with no tendency to converge, so the situation is unpredictable.
  3. Even in countries where the spread is relatively restrained (Taiwan, Hong Kong, Singapore, Japan), $R$ occasionally jumps, so inflows from abroad need to be watched.

Premise

The basic calculation formula is the same as in the previous article, and I have not changed the parameters either. For data, I used the New Coronavirus Dataset; I am grateful to those who make such data publicly available. Because a positive test is only confirmed after the incubation and infectious periods have passed, there is a time lag, and no results are available for roughly the most recent two weeks.

Try to calculate with Python

This code is available on GitHub. It is saved in Jupyter Notebook format. (File name: 03_R0_estimation-WLD-02b.ipynb)

code

There are few changes, because the code follows the previous article. The main difference is that the cumulative counts are differenced to obtain daily counts.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def readCsvOfWorldArea(area=None):
    # Download the CSV from the URL below
    # https://hackmd.io/@covid19-kenmo/dataset/https%3A%2F%2Fhackmd.io%2F%40covid19-kenmo%2Fdataset
    fcsv = u'World-COVID-19.csv'
    df = pd.read_csv(fcsv, header=0, encoding='sjis', parse_dates=[u'date'])
    # Extract the date column and the target country (worldwide total if area is None)
    if area is not None:
        df1 = df.loc[:,[u'date',area]]
    else:
        df1 = df.loc[:,[u'date',u'Infected people throughout the world']]        
    df1.columns = ['date','Psum']
    ##Cumulative ⇒ daily conversion
    df2 = df1.copy()
    df2.columns = ['date','P']
    df2.iloc[0,1] = 0
    ##Character string ⇒ numerical value
    getFloat = lambda e: float('{}'.format(e).replace(',',''))
    ##Difference calculation
    for i in range(1,len(df1)):
        df2.iloc[i, 1] = getFloat(df1.iloc[i, 1]) - getFloat(df1.iloc[i-1, 1] )
    ##
    return df2
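
As a side note, the cumulative-to-daily conversion can also be written with vectorized pandas operations instead of an explicit loop. The following is a minimal sketch, not the notebook's code: the helper name `cumulative_to_daily` and the sample values are made up for illustration, and it assumes the cumulative column may contain comma-separated strings such as `1,025`.

```python
import pandas as pd

def cumulative_to_daily(cumulative):
    """Convert cumulative counts (numbers or strings like '1,025') to daily counts."""
    s = pd.Series(cumulative).astype(str)
    # Strip thousands separators, then convert to float
    s = s.str.replace(',', '', regex=False).astype(float)
    # Day-to-day difference; the first day has no predecessor, so set it to 0
    daily = s.diff()
    daily.iloc[0] = 0.0
    return daily

# Hypothetical cumulative counts for four consecutive days
daily = cumulative_to_daily(['10', '25', '1,025', '1,100'])
print(daily.tolist())
```

Setting the first day to 0 mirrors `df2.iloc[0,1] = 0` in the function above.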

A moving average has been added to the calculation of R. It is a 3-day centered average: the day itself plus one day before and one day after.

def calcR0(df, keys):
    lp = keys['lp']
    ip = keys['ip']
    nrow = len(df)
    # Daily count on day s; NaN beyond the end of the data
    getP = lambda s: df.loc[s, 'P'] if s < nrow else np.nan
    # 3-day centered moving average (days s-1, s, s+1)
    getP2 = lambda s: np.average([ getP(s + r) for r in range(-1,2)])
    for t in range(1, nrow):
        # Sum of smoothed daily counts over days t+1 .. t+ip
        df.loc[t, 'Ppre'] = sum([ getP2(s) for s in range(t+1, t + ip + 1)])
        # Smoothed daily count lp + ip days after t
        df.loc[t, 'Pat' ] = getP2(t + lp + ip)
        if df.loc[t, 'Ppre'] > 0:
            df.loc[t, 'R0'  ] = ip * df.loc[t, 'Pat'] / df.loc[t, 'Ppre']
        else:
            df.loc[t, 'R0'  ] = np.nan
    return df
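
To make the formula in `calcR0` easier to follow, here is a small standalone sketch of the same calculation on plain Python lists: $R(t) = ip \cdot \bar{P}(t + lp + ip) / \sum_{s=t+1}^{t+ip} \bar{P}(s)$, where $\bar{P}$ is the 3-day centered average. The function names and the synthetic series are hypothetical, and `lp = 5`, `ip = 8` are placeholder values assumed here for illustration only (the article reuses the parameters of the previous article without restating them).

```python
import math

def centered_mean(p, s):
    """Mean of p[s-1], p[s], p[s+1]; NaN if any index falls outside the series."""
    if s - 1 < 0 or s + 1 >= len(p):
        return math.nan
    return (p[s - 1] + p[s] + p[s + 1]) / 3.0

def estimate_r(p, lp, ip):
    """R(t) = ip * P(t + lp + ip) / sum of P(s) for s = t+1 .. t+ip, on smoothed P."""
    r = []
    for t in range(len(p)):
        ppre = sum(centered_mean(p, s) for s in range(t + 1, t + ip + 1))
        pat = centered_mean(p, t + lp + ip)
        if math.isnan(ppre) or math.isnan(pat) or ppre <= 0:
            r.append(math.nan)  # not enough data yet, or no cases in the window
        else:
            r.append(ip * pat / ppre)
    return r

LP, IP = 5, 8  # assumed placeholder values, in days

# Constant daily counts: each case is replaced by exactly one new case
r_flat = estimate_r([100.0] * 30, LP, IP)

# Daily counts growing 10% per day: R should come out above 1
r_growth = estimate_r([1.1 ** t for t in range(30)], LP, IP)
```

With constant daily counts the estimate is exactly 1, and it turns into NaN for the last `lp + ip + 1` days, which is the "most recent two weeks" gap mentioned in the premise.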

Also, to make the plot easier to read, the vertical axis uses a logarithmic scale.

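def showResult3(dflist, title):
    # Reference line at R0 = 1
    dfs = dflist[0][0]
    ptgt = pd.DataFrame([[dfs.iloc[0,0],1],[dfs.iloc[len(dfs)-1,0],1]])
    ptgt.columns = ['date','target']
    ax = ptgt.plot(title='COVID-19 R0', x='date',y='target',style='r--', figsize=(10,8))
    # Symmetric log scale: linear within +/-1, logarithmic beyond.
    # Matplotlib 3.3+ spells the keyword linthresh; older releases used linthreshy.
    ax.set_yscale("symlog", linthresh=1)
    # Overlay each region's R0 curve (showResult2 is not shown in this article)
    for df, label in dflist:
        showResult2(ax, df, label)
    #
    ax.grid(True)
    ax.set_ylim(0,)
    plt.show()
    fig = ax.get_figure()
    fig.savefig("R0_{}.png".format(title))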
It helped that all of this could be done without changing the original code much.

Calculation result

Now let's take a look at the calculation results. If $R_0 > 1$, the infection is spreading, and if $R_0 < 1$, the infection is converging.

Area where explosive infection was observed

Here are the results for mainland China, Italy, the United States, Spain, Iran and South Korea.

R0_爆発的感染が観測された地域.png

Europe

Here are the results for the European countries with many infected people, including Italy.

R0_ヨーロッパ.png

Areas where infection is relatively suppressed around Asia

Here are the results for Taiwan, Japan, Hong Kong and Singapore.

R0_アジア周辺で比較的感染が抑制されている地域.png

Areas where there is concern about the spread of infection in the future

Here are the results for countries, though not an exhaustive list, whose graphs show $R$ staying at a high level with no tendency to converge.

R0_今後感染拡大が懸念される地域.png

Let's derive an approximation formula for the effective reproduction number from the European results

Looking at how the effective reproduction number changes, we can see that after a sharp increase it tends to decrease exponentially. The European results in particular show a similar convergence trend regardless of country. I therefore fitted the following approximation formula.

$$
R(t) = R(t_0) \cdot 2^{-\frac{t-t_0}{T}}
$$

In other words, the half-life of $R(t)$ is $T$. Setting $T = 7.5$ days and overlaying the formula on the graph for the European region gives the following (the dotted lines in the figure are the estimates).

R0_ヨーロッパ+推定.png
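
The approximation formula is easy to evaluate directly. Solving $R(t_1) = 1$ for $t_1$ gives $t_1 = t_0 + T \log_2 R(t_0)$, so the time needed to fall to 1 depends only on the starting value and the half-life. The sketch below uses $T = 7.5$ days from the text; the starting value `R_START = 8.0` is a hypothetical number chosen only to give round results, not a value taken from the data.

```python
import math

def r_at(t, r0, t0, half_life):
    """R(t) = R(t0) * 2 ** (-(t - t0) / T)."""
    return r0 * 2.0 ** (-(t - t0) / half_life)

def days_until_one(r0, half_life):
    """Days after t0 until R falls to 1: T * log2(R(t0))."""
    return half_life * math.log2(r0)

T = 7.5          # half-life in days, as in the article
R_START = 8.0    # hypothetical R(t0), for illustration only

print(r_at(7.5, R_START, 0.0, T))   # one half-life later: 4.0
print(days_until_one(R_START, T))   # 22.5
```

Starting from a higher $R(t_0)$ only adds $T$ days per doubling, which is why regions with $R$ around 10 would still need several weeks under the same half-life.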

From here, substituting specific dates into $R(t)$ gives the estimated value of $R$ on each date.

Of course, this is only an approximation, so things may not turn out this way. However, if $R < 1$ was reached on March 21st, then a leveling off in new infections should be observed around April 4th, about 13 days later. If so, the number of inpatients should decrease steadily and convergence should come into view.

Also, here is the result of applying the above approximation formula to other regions.

Area where explosive infection was observed

R0_爆発的感染が観測された地域+推定.png

Areas where there is concern about the spread of infection in the future

R0_今後感染拡大が懸念される地域+推定.png
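
In this article the half-life is set by matching the curve to the graph. As an alternative that the article does not use, $T$ could be estimated by a least-squares fit of a straight line to $\log_2 R$ against time, since the formula implies $\log_2 R(t) = \log_2 R(t_0) - (t - t_0)/T$. The following is a minimal sketch under that assumption; `fit_half_life` is a hypothetical helper, and the input series is generated from the formula itself rather than taken from real data.

```python
import math

def fit_half_life(days, r_values):
    """Least-squares line through (day, log2 R); returns (R at day 0, half-life T)."""
    ys = [math.log2(r) for r in r_values]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    slope = sxy / sxx                 # equals -1 / T for a decaying series
    intercept = mean_y - slope * mean_x
    return 2.0 ** intercept, -1.0 / slope

# Synthetic series generated with R(0) = 8 and T = 7.5 days
sample_days = [0, 5, 10, 15]
sample_r = [8.0 * 2.0 ** (-d / 7.5) for d in sample_days]

r0_fit, t_fit = fit_half_life(sample_days, sample_r)
print(r0_fit, t_fit)
```

The fit only makes sense on the decaying part of the curve, after the initial sharp rise, and all $R$ values must be positive for the logarithm to be defined.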

Furthermore ...

Reference link

I referred to the following page.

  1. New Coronavirus Dataset
  2. "Situation analysis and recommendations for measures against new coronavirus infection" (March 19, 2020)
  3. Calculate the transition of the basic reproduction number of the new coronavirus by prefecture
  4. Ranking the number of effective reproductions of the new coronavirus by prefecture

[^1]: In this article, the effective reproduction number is defined as the number of secondary infections caused by one infected person (at a given time t, under the measures in place at that time).
