In connection with the novel coronavirus infection (COVID-19), the expert meeting's "Situation analysis and recommendations" (PDF) was issued on March 19, 2020. It includes an analysis of the effective reproduction number (the average number of secondary infections produced by one infected person at a given time in a population with an ongoing epidemic). For Hokkaido, the value has been below 1 since mid-February, and the view was that the outbreak there is heading toward convergence. In the article I wrote the other day, I showed results calculated with a simplified model, and those results broadly agree in that the value is below 1, so I think the answers match. However, the analysis by Professor Nishiura of Hokkaido University appears to be more precise, using maximum likelihood estimation.

As described in this paper, the effective reproduction number ($R_t$) probably means "the number of secondary infections produced by one infected person (at a certain time $t$, under certain measures)". The basic reproduction number is denoted $R_0$, but it seems difficult to evaluate it separately from society and policy, so I deliberately used "effective reproduction number" in the title of this article.

In this article, I would like to rank prefectures by the average reproduction number over a certain period. The recent expert meeting recommended taking well-balanced measures by dividing regions into three categories: **1. areas where infection is spreading, 2. areas where infection is beginning to converge or has settled to a certain extent, and 3. areas where no infection has been confirmed**. However, it did not clearly state which areas fall under category 1.

Therefore, the main motivation for this article is to **calculate the reproduction number for each prefecture and rank the prefectures by it**.
The basic calculation formula is the same as in the previous article, and I have not changed the parameters. As before, I used the CSV data published in the "Map of the number of new coronavirus infections by prefecture" (provided by Jag Japan Co., Ltd.). Because of the time lag between infection and a positive test (the incubation period plus the infectious period), results for the most recent two weeks cannot be obtained. To use Japanese text in matplotlib, I installed the IPAex Gothic font, using [this page](https://datumstudio.jp/blog/matplotlib%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E3%82%92%E8%A7%A3%E6%B6%88%E3%81%99%E3%82%8Bwindows%E7%B7%A8) as a reference.
This code is available on GitHub. It is saved in Jupyter Notebook format. (File name: 03_R0_estimation-JPN-02a.ipynb)
I won't cover all of the code in this article because it would be long, but I will explain the key points for creating the ranking. The following setting lets you use Japanese fonts in figures (the font must be installed beforehand).
font = {'family' : 'IPAexGothic'}
plt.rc('font', **font)
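If the font is not installed, the setting above does not fail loudly, so it can help to check first. The following is a minimal sketch of such a check; the helper names has_font and use_font_if_available are my own and are not part of the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the check also runs without a display
from matplotlib import font_manager
import matplotlib.pyplot as plt


def has_font(name):
    """Return True if matplotlib's font manager lists a font with this family name."""
    return any(f.name == name for f in font_manager.fontManager.ttflist)


def use_font_if_available(name):
    """Apply the font via plt.rc only when it is installed (hypothetical helper)."""
    if has_font(name):
        plt.rc('font', family=name)
        return True
    return False


applied = use_font_if_available('IPAexGothic')
print('IPAexGothic applied:', applied)
```

matplotlib caches its font list, so a font installed just now may not show up until that cache is rebuilt.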
This function extracts the list of prefecture names. It uses the duplicated() method to remove duplicate values.
def getJapanPrefList():
    # Download the CSV from the URL below
    # https://jag-japan.com/covid19map-readme/
    fcsv = u'COVID-19.csv'
    df = pd.read_csv(fcsv, header=0, encoding='utf8', parse_dates=[u'Confirmed date YYYYMMDD'])
    # Extract only the consultation prefecture column
    df1 = df.loc[:,[u'Consultation prefecture']]
    df1.columns = ['pref']
    # Keep only the first occurrence of each prefecture
    df1 = df1[~df1.duplicated()]
    preflist = [e[0] for e in df1.values.tolist()]
    return preflist
preflist = getJapanPrefList()
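To see what the duplicated() mask does, here is a small sketch on toy data that stands in for the real CSV (the prefecture values are made up for illustration). It also shows two pandas shortcuts that should give the same list for a single column.

```python
import pandas as pd

# Toy data standing in for the real CSV; the column name follows the article's code.
df = pd.DataFrame({'Consultation prefecture': ['Hokkaido', 'Tokyo', 'Hokkaido', 'Aichi', 'Tokyo']})
col = df['Consultation prefecture']

# Approach used in the article: keep rows that are not marked as duplicates.
via_mask = col[~col.duplicated()].tolist()

# Equivalent shortcuts for a single column.
via_drop = col.drop_duplicates().tolist()
via_unique = col.unique().tolist()

print(via_mask)  # first occurrences, in order of appearance
```

All three keep the first occurrence of each name, so the order of the resulting list follows the order in which prefectures first appear in the file.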
This is the part that creates a graph and a data frame for each prefecture in the list.
def R0inJapanPref2(pref):
    # Same parameters as the previous article: lp and ip are in days
    keys = {'lp':5, 'ip':8 }
    df1 = makeCalcFrame(60) # 60 days
    df2 = readCsvOfJapanPref(pref)
    df = mergeCalcFrame(df1, df2)
    df = calcR0(df, keys)
    showResult(df, u'COVID-19 R0 ({})'.format(pref))
    return df
dflist = [ [R0inJapanPref2(pref), pref] for pref in preflist]
This function calculates the average of the basic reproduction number R0 over a specified period. It also handles the case where no valid values remain in the period by returning 0.
def calcR0Average(df, st, ed):
    # Rows whose date falls within [st, ed]
    df1 = df[(st <= df.date) & (df.date <= ed)]
    # Drop rows where R0 could not be calculated (NaN)
    df2 = df1[~np.isnan(df1.R0)]
    df3 = df2['R0']
    # Return 0 when no valid value is left
    ave = np.average(df3) if len(df3) > 0 else 0
    return ave
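The behavior of the averaging step can be checked on a toy frame. This is a minimal sketch that mirrors the logic of calcR0Average under a different name (average_r0 is hypothetical, and the R0 values are invented for illustration).

```python
import numpy as np
import pandas as pd


def average_r0(df, st, ed):
    """Mean of the R0 column for st <= date <= ed, ignoring NaN; 0 if nothing remains."""
    window = df[(st <= df.date) & (df.date <= ed)]
    values = window['R0'].dropna()
    return values.mean() if len(values) > 0 else 0


df = pd.DataFrame({
    'date': pd.date_range('2020-03-01', periods=5),
    'R0': [2.0, np.nan, 1.0, np.nan, 3.0],
})

a = average_r0(df, pd.Timestamp(2020, 3, 1), pd.Timestamp(2020, 3, 3))  # mean of 2.0 and 1.0
b = average_r0(df, pd.Timestamp(2020, 3, 4), pd.Timestamp(2020, 3, 4))  # only NaN in the window
c = average_r0(df, pd.Timestamp(2020, 4, 1), pd.Timestamp(2020, 4, 7))  # no rows in the window
print(a, b, c)
```

Note that the fallback value 0 makes a prefecture with no estimate look the same as one with an average of 0, so such prefectures end up at the bottom of the ranking.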
This function sorts the averages in descending order to produce the ranking.
def calcR0AveRank(dflist, st, ed):
    R0AveRank = [[pref, calcR0Average(df, st, ed)] for df, pref in dflist]
    # Sort in descending order of the average R0
    R0AveRank.sort(key=lambda x: x[1], reverse=True)
    df = pd.DataFrame(R0AveRank)
    df.columns = ['pref', 'R0ave']
    return df
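As an alternative sketch, the same ranking can be built by letting pandas do the sort. The pairs below are hypothetical values used only to show the shape of the result.

```python
import pandas as pd

# Hypothetical (prefecture, average R0) pairs, for illustration only.
pairs = [['A', 0.8], ['B', 2.5], ['C', 0.0], ['D', 1.2]]

# Same idea as calcR0AveRank, but sorting with DataFrame.sort_values.
rank = (pd.DataFrame(pairs, columns=['pref', 'R0ave'])
          .sort_values('R0ave', ascending=False)
          .reset_index(drop=True))
print(rank)
```

Resetting the index makes the row label equal to the rank position minus one, which is convenient when slicing the top entries with iloc.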
This function displays the ranking as a horizontal bar chart. It uses the set_position() method of the bottom spine to move the x axis to the top.
def showRank(dflist, maxn, title):
    # Plot only the top maxn rows
    ax = dflist.iloc[0:maxn,:].plot.barh(y='R0ave',x='pref',figsize=(8,10))
    # Put the highest value at the top
    ax.invert_yaxis()
    ax.grid(True, axis='x')
    ax.spines['bottom'].set_position(('axes',1.05))
    plt.title(title, y=1.05)
    plt.show()
    return ax
Finally, the part that calculates the ranking.
# 2020/2/7 to 2020/3/7
st = pd.Timestamp(2020,2,7)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200207_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/2/23 to 2020/3/7
st = pd.Timestamp(2020,2,23)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200223_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/3/1 to 2020/3/7
st = pd.Timestamp(2020,3,1)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200301_20200307"
dfR0_2 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_2, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
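Since the three blocks differ only in the start date, the titles could also be generated from the dates, which avoids typing the date stamps by hand. This is a sketch under that assumption; make_title is a hypothetical helper, and the calls to calcR0AveRank and showRank are shown only as a comment because they need the real data.

```python
import pandas as pd

ed = pd.Timestamp(2020, 3, 7)
starts = [pd.Timestamp(2020, 2, 7), pd.Timestamp(2020, 2, 23), pd.Timestamp(2020, 3, 1)]


def make_title(st, ed):
    """Build a title from zero-padded YYYYMMDD stamps of the two dates."""
    return "R0ave_ranking_{}_{}".format(st.strftime('%Y%m%d'), ed.strftime('%Y%m%d'))


titles = [make_title(st, ed) for st in starts]
for t in titles:
    print(t)

# With the article's functions available, each period could then be processed as:
# for st, title in zip(starts, titles):
#     ax = showRank(calcR0AveRank(dflist, st, ed), 13, title)
#     ax.get_figure().savefig("{}.jpg".format(title))
```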
Now let's take a look at the calculation results. If $R_0 > 1$, the infection is spreading, and if $R_0 < 1$, the infection is converging. We will look at three periods ending on March 7: the most recent one month, two weeks, and one week.
I referred to the following page.