[Python] Let's rank the effective reproduction number of the novel coronavirus by prefecture

Introduction

In connection with the novel coronavirus infection (COVID-19), the expert meeting issued its "Situation analysis and recommendations" on March 19, 2020. That document includes an analysis of the effective reproduction number (the average number of secondary infections produced by one infected person at a given time in a population with an ongoing epidemic). For Hokkaido, the value has been below 1 since mid-February, and the view was that the outbreak there is heading toward convergence. In the article I wrote the other day, I showed results calculated with a simplified model. Those results also roughly agree in that the value is below 1, so I think I was able to check my answer. The analysis by Professor Nishiura of Hokkaido University, however, appears to be more precise and to use maximum likelihood estimation.

As far as I can tell from this paper, the effective reproduction number ($R_t$) seems to mean "the number of secondary infections caused by one infected person (at a certain time t, under certain measures)". Although the basic reproduction number is $R_0$, it seems difficult to evaluate it separately from society and policy, so I deliberately used "effective reproduction number" in the title of this article.

In this article, I would like to rank the prefectures by the average reproduction number over a given period. The recent expert meeting recommended dividing regions into three categories and responding in a well-balanced manner: (1) areas where the infection is expanding, (2) areas where the infection is beginning to converge or has been contained to a certain extent, and (3) areas where no infection has been confirmed. However, it did not clearly state which regions fall under category 1.
Therefore, the main motivation for this article is to **calculate the reproduction number for each prefecture and rank the prefectures**.

Premise

The basic calculation formula is the same as in the previous article, and I have not changed the parameters. As before, I used the CSV published with the "Map of the number of new coronavirus infections by prefecture" (provided by Jag Japan Co., Ltd.). Because a positive test is only confirmed after the incubation period and the infectious period have passed, results for the most recent two weeks cannot be obtained. To use Japanese text in matplotlib, I installed the IPAex Gothic font, using [this page](https://datumstudio.jp/blog/matplotlib%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E3%82%92%E8%A7%A3%E6%B6%88%E3%81%99%E3%82%8Bwindows%E7%B7%A8) as a reference.
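
The two-week gap can be illustrated with a small sketch. It assumes that an estimate for day t needs case counts up to t + lp + ip days, where lp = 5 and ip = 8 are the values passed to the calculation later in this article. Reading them as a latent period and an infectious period, and the helper name `last_estimable_date`, are assumptions made for illustration and are not taken from the notebook.

```python
from datetime import date, timedelta

def last_estimable_date(last_data_date, lp=5, ip=8):
    """Return the latest date for which an estimate could be formed.

    Assumption of this sketch: an estimate for day t needs case counts
    up to t + lp + ip, so the most recent lp + ip days stay empty.
    """
    return last_data_date - timedelta(days=lp + ip)

# With data through 2020-03-20, the last usable date is 13 days earlier.
print(last_estimable_date(date(2020, 3, 20)))  # 2020-03-07
```

With these values the lag is 13 days, which is roughly the two weeks mentioned above.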

Try to calculate with Python

This code is available on GitHub. It is saved in Jupyter Notebook format. (File name: 03_R0_estimation-JPN-02a.ipynb)

Code

I won't cover all of the code here because it would be too long, but I will explain the key points for creating the ranking. First, this is the incantation that lets you use a Japanese font in figures (the font must be installed beforehand).

import matplotlib.pyplot as plt

font = {'family': 'IPAexGothic'}
plt.rc('font', **font)
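
Since the setting only works when the font is installed, it may help to check for it first. The following is a minimal sketch that looks the family name up in matplotlib's font manager. The helper name `font_available` is made up for this example.

```python
from matplotlib import font_manager

def font_available(name):
    """Return True if matplotlib's font manager lists a font with this family name."""
    return any(f.name == name for f in font_manager.fontManager.ttflist)

if not font_available('IPAexGothic'):
    print('IPAexGothic not found; Japanese labels may not render correctly.')
```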

This function extracts the list of prefecture names. It uses the duplicated() method to remove duplicate values.

import pandas as pd

def getJapanPrefList():
    # Download the CSV from the URL below
    # https://jag-japan.com/covid19map-readme/
    fcsv = u'COVID-19.csv'
    # Column names are shown here in English; use the header names that appear in the downloaded CSV
    df = pd.read_csv(fcsv, header=0, encoding='utf8', parse_dates=[u'Confirmed date YYYYMMDD'])
    # Extract only the consultation-prefecture column
    df1 = df.loc[:,[u'Consultation prefecture']]
    df1.columns = ['pref']
    # Keep only the first occurrence of each prefecture name
    df1 = df1[~df1.duplicated()]
    preflist = [e[0] for e in df1.values.tolist()]
    return preflist

preflist = getJapanPrefList()
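
As a side note on the deduplication step, the following sketch compares the duplicated() mask with two pandas shortcuts that should give the same list in first-occurrence order. The toy rows are made up and only stand in for the prefecture column of the CSV.

```python
import pandas as pd

# Toy data standing in for the prefecture column of the CSV (made-up rows).
df = pd.DataFrame({'pref': ['Hokkaido', 'Tokyo', 'Hokkaido', 'Aichi', 'Tokyo']})

# Approach used in the article: mask out rows flagged as duplicates.
via_duplicated = df[~df.duplicated()]['pref'].tolist()

# Shortcuts that also keep the first occurrence of each value.
via_drop = df['pref'].drop_duplicates().tolist()
via_unique = list(df['pref'].unique())

print(via_duplicated)  # ['Hokkaido', 'Tokyo', 'Aichi']
```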

This is the part that creates graphs and data frames for the prefecture list.

def R0inJapanPref2(pref):
    keys = {'lp':5, 'ip':8 }
    df1 = makeCalcFrame(60) # 60 days
    df2 = readCsvOfJapanPref(pref)
    df = mergeCalcFrame(df1, df2)
    df = calcR0(df, keys)
    showResult(df, u'COVID-19 R0 ({})'.format(pref))
    return df

dflist = [ [R0inJapanPref2(pref), pref] for pref in preflist]

This function calculates the average of the reproduction number R0 over a specified period. Missing values are excluded, and 0 is returned when no valid value remains in the period.

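import numpy as np

def calcR0Average(df, st, ed):
    # Keep only the rows whose date falls within [st, ed]
    df1 = df[(st <= df.date) & (df.date <= ed)]
    # Drop missing (NaN) values of R0 before averaging
    df3 = df1['R0'].dropna()
    # Return 0 when no valid value remains in the period
    ave = np.average(df3) if len(df3) > 0 else 0
    return ave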
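
To make the handling of missing values concrete, here is a small self-contained sketch on made-up data. The function name `mean_in_period` and the toy values are assumptions for illustration. It follows the same idea of filtering by date, dropping NaN, and falling back to 0.

```python
import numpy as np
import pandas as pd

def mean_in_period(df, st, ed, col='R0'):
    """Average df[col] over st <= date <= ed, ignoring NaN; 0.0 if nothing is left."""
    in_period = df[(st <= df['date']) & (df['date'] <= ed)]
    values = in_period[col].dropna()
    return float(values.mean()) if len(values) > 0 else 0.0

toy = pd.DataFrame({
    'date': pd.date_range('2020-03-01', periods=4),
    'R0': [2.0, np.nan, 1.0, 3.0],  # made-up values
})

a = mean_in_period(toy, pd.Timestamp(2020, 3, 1), pd.Timestamp(2020, 3, 3))
b = mean_in_period(toy, pd.Timestamp(2020, 3, 2), pd.Timestamp(2020, 3, 2))
print(a)  # 1.5 (mean of 2.0 and 1.0; the NaN is skipped)
print(b)  # 0.0 (only a NaN in the period)
```

One consequence of the fallback is that a prefecture with no valid data in the period is indistinguishable from one whose average is truly 0, which is worth keeping in mind when reading the lower end of a ranking.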

This function sorts the averages in descending order to produce the ranking.

def calcR0AveRank(dflist, st, ed):
    R0AveRank = [ [pref, calcR0Average(df, st, ed)] for df, pref in dflist]
    R0AveRank.sort(key = lambda x: x[1], reverse=True)
    df = pd.DataFrame(R0AveRank)
    df.columns = ['pref','R0ave']
    return df
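
The sorting step can also be written with pandas alone. The following sketch uses made-up averages for three prefectures and `sort_values` instead of `list.sort`. It is an alternative for illustration, not the code from the notebook.

```python
import pandas as pd

# Made-up averages for three prefectures, only to show the sorting step.
averages = [['Aichi', 1.2], ['Hokkaido', 0.8], ['Tokyo', 1.5]]

rank = (pd.DataFrame(averages, columns=['pref', 'R0ave'])
          .sort_values('R0ave', ascending=False)
          .reset_index(drop=True))

print(rank['pref'].tolist())  # ['Tokyo', 'Aichi', 'Hokkaido']
```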

This function displays the ranking as a horizontal bar graph. It uses set_position to move the X axis to the top.

def showRank(dflist, maxn, title):
    ax = dflist.iloc[0:maxn,:].plot.barh(y='R0ave',x='pref',figsize=(8,10))
    ax.invert_yaxis()
    ax.grid(True, axis='x')
    ax.spines['bottom'].set_position(('axes',1.05))
    plt.title(title, y=1.05)
    plt.show()
    return ax
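
As a possible alternative to moving the bottom spine, matplotlib can draw the X ticks and labels along the top edge with `tick_top()`. The sketch below uses made-up data and the non-interactive Agg backend so that it runs without a display. The resulting layout is similar to, but not identical to, the figures in this article.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up ranking data, already sorted in descending order.
rank = pd.DataFrame({'pref': ['A', 'B', 'C'], 'R0ave': [1.5, 1.2, 0.8]})

ax = rank.plot.barh(x='pref', y='R0ave', figsize=(4, 3))
ax.invert_yaxis()         # put the largest value at the top
ax.xaxis.tick_top()       # draw X ticks and tick labels along the top edge
ax.grid(True, axis='x')
n_bars = len(ax.patches)  # one rectangle per row of the data frame
plt.close(ax.get_figure())
```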

Finally, the part that calculates the ranking.

# 2020/2/7 to 2020/3/7
st = pd.Timestamp(2020,2,7)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_2020207_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/2/23 to 2020/3/7
st = pd.Timestamp(2020,2,23)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_2020223_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/3/1 to 2020/3/7
st = pd.Timestamp(2020,3,1)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_2020301_20200307"
dfR0_2 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_2, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
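
The three near-identical blocks could be driven by a list of periods, with the title built from the dates instead of typed by hand. The sketch below shows only the title generation. The helper name `make_title` is made up, and because the dates are zero-padded, the generated names differ slightly from the hand-written titles above.

```python
from datetime import date

# (start, end) pairs for the three periods used in the article.
periods = [
    (date(2020, 2, 7), date(2020, 3, 7)),
    (date(2020, 2, 23), date(2020, 3, 7)),
    (date(2020, 3, 1), date(2020, 3, 7)),
]

def make_title(st, ed):
    """Build a file-name-friendly title with zero-padded YYYYMMDD dates."""
    return "R0ave_ranking_{:%Y%m%d}_{:%Y%m%d}".format(st, ed)

titles = [make_title(st, ed) for st, ed in periods]
print(titles[0])  # R0ave_ranking_20200207_20200307
```

Inside such a loop, each period would then be passed to calcR0AveRank and showRank in the same way as in the blocks above.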

Ranking result

Now let's look at the calculation results. If $R_0 > 1$, the infection is spreading, and if $R_0 < 1$, the infection is converging. We will look at the most recent month, two weeks, and one week.
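
Why the threshold sits at 1 can be seen from a deliberately crude sketch: if every case causes r secondary cases, the number of new cases per generation is multiplied by r each time. The function name `cases_after` and the starting value of 100 are assumptions for illustration and are unrelated to the model used for the rankings.

```python
def cases_after(generations, r, initial=100.0):
    """New cases per generation after the given number of generations,
    assuming every case causes r secondary cases (a deliberately crude model)."""
    return initial * r ** generations

print(cases_after(5, 1.5))  # 759.375, growth when r > 1
print(cases_after(5, 0.8))  # about 32.8, decline when r < 1
```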

Ranking from 2020/2/7 to 2020/3/7 (for the last month)

R0ave_ranking_2020207_20200307.jpg

Ranking from 2020/2/23 to 2020/3/7 (for the last 2 weeks)

R0ave_ranking_2020223_20200307.jpg

Ranking from 2020/3/1 to 2020/3/7 (for the last week)

R0ave_ranking_2020301_20200307.jpg

Consideration

Furthermore ...

Reference link

I referred to the following pages.

  1. "Situation analysis and recommendations for measures against new coronavirus infection" (March 19, 2020)
  2. Calculate the transition of the basic reproduction number of the new coronavirus by prefecture
  3. Prediction of infectious disease epidemics: Quantitative issues in infectious disease mathematical models
  4. Map of the number of people infected with the new coronavirus by prefecture (provided by Jag Japan Co., Ltd.)
  5. [Eliminate garbled Japanese characters in matplotlib (Windows)](https://datumstudio.jp/blog/matplotlib%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E3%82%92%E8%A7%A3%E6%B6%88%E3%81%99%E3%82%8Bwindows%E7%B7%A8)
  6. Saitama Prefecture Page
