Introduction

In connection with the novel coronavirus infection (COVID-19), the expert meeting issued its "Situation analysis and recommendations for measures against new coronavirus infection" (PDF) on March 19, 2020. It includes an analysis of the effective reproduction number (the average number of secondary infections produced by one infected person at a given time in a population with an ongoing epidemic). In Hokkaido the value has been below 1 since mid-February, and the meeting's view was that the outbreak there is heading toward convergence.

In the article I wrote the other day, I showed results calculated with a simplified model. That analysis gave a similar result in that the value fell below 1, so I think the answers matched. However, the analysis by Professor Nishiura of Hokkaido University appears to be more precise, using maximum likelihood estimation. Probably, as shown in this paper, the effective reproduction number (Rt) means "the number of secondary infections caused by one infected person (at a certain time t, under certain measures)". The basic reproduction number is R0, but it seems difficult to evaluate it separately from society and policy, so I deliberately used "effective reproduction number" in the title of this article.

In this article, I would like to rank the prefectures by their average reproduction number over a given period. The expert meeting recommended dividing regions into three categories and responding in a well-balanced manner: **1. areas where the infection is expanding, 2. areas where the infection is beginning to converge or has been contained to a certain extent, and 3. areas where no infection has been confirmed**. However, it was not clearly stated which areas fall under category 1.
Therefore, the main motivation for this article is to **calculate the reproduction number for each prefecture and rank them**.

Premise

The basic calculation formula is the same as in the previous article, and I have not changed the parameters. As before, I used the CSV published with the Map of the number of new coronavirus infections by prefecture (provided by Jag Japan Co., Ltd.). Because of the time lag between infection and a positive test (incubation period plus infectious period), results are available only up to about two weeks before the latest data. To use Japanese text in matplotlib, I installed the IPAex Gothic font, using [this page](https://datumstudio.jp/blog/matplotlib%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E3%82%92%E8%A7%A3%E6%B6%88%E3%81%99%E3%82%8Bwindows%E7%B7%A8) as a reference.

Try to calculate with Python

This code is available on GitHub. It is saved in Jupyter Notebook format. (File name: 03_R0_estimation-JPN-02a.ipynb)

GitHub: okimebarun / 01_COVID19_analysis

Code

I won't cover all of the code here because it would be too long, but I will explain the key points in creating the ranking. First, this setting enables Japanese fonts in figures (the font must be installed beforehand).

font = {'family' : 'IPAexGothic'}
plt.rc('font', **font)

This function extracts the list of prefecture names. It uses the duplicated() method to remove duplicate values.

def getJapanPrefList():
    #Download from the URL below
    # https://jag-japan.com/covid19map-readme/
    fcsv = u'COVID-19.csv'
    df = pd.read_csv(fcsv, header=0, encoding='utf8', parse_dates=[u'Confirmed date YYYYMMDD'])
    # Extract only the consultation prefecture column
    df1 = df.loc[:,[u'Consultation prefecture']]
    df1.columns = ['pref']
    df1 = df1[~df1.duplicated()]
    preflist = [e[0] for e in df1.values.tolist()]
    return preflist

preflist = getJapanPrefList()
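For readers less familiar with `duplicated()`: the mask `~df1.duplicated()` keeps the first occurrence of each row and drops later repeats, so the prefecture list comes out in order of first appearance in the CSV. The same idea can be sketched without pandas. The function name `unique_in_order` is hypothetical, and the sample rows below are made up for illustration and are not taken from the Jag Japan data.

```python
def unique_in_order(values):
    """Return the distinct values, keeping the order of first appearance."""
    # dict preserves insertion order (Python 3.7+), so fromkeys() keeps the
    # first occurrence of each value and ignores later repeats.
    return list(dict.fromkeys(values))


# Hypothetical per-case prefecture column (one entry per confirmed case).
sample_rows = ["Hokkaido", "Tokyo", "Hokkaido", "Osaka", "Tokyo", "Hyogo"]
sample_preflist = unique_in_order(sample_rows)
print(sample_preflist)
```

In pandas, calling `drop_duplicates()` on the column and then `tolist()` should give the same kind of list in one step.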

This part creates a graph and a data frame for each prefecture in the list. The helper functions it calls are not reproduced here; they are in the notebook.

def R0inJapanPref2(pref):
    keys = {'lp':5, 'ip':8 }
    df1 = makeCalcFrame(60) # 60 days
    df2 = readCsvOfJapanPref(pref)
    df = mergeCalcFrame(df1, df2)
    df = calcR0(df, keys)
    showResult(df, u'COVID-19 R0 ({})'.format(pref))
    return df

dflist = [ [R0inJapanPref2(pref), pref] for pref in preflist]
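The two-week gap mentioned in the Premise section is consistent with the parameters in `keys`, if `lp` and `ip` are read as the latent (incubation) period and the infectious period in days. That reading is an assumption, as is the idea that the last usable date is simply the latest data date minus `lp + ip`. The latest data date of March 20, 2020 is also assumed here for illustration only.

```python
from datetime import date, timedelta

# Values taken from the `keys` dict above; their meaning is assumed here.
lp = 5  # assumed: latent (incubation) period in days
ip = 8  # assumed: infectious period in days

# Hypothetical date of the newest record in the CSV.
latest_data_date = date(2020, 3, 20)

# Cases infected later than this may not have been detected yet.
last_usable_date = latest_data_date - timedelta(days=lp + ip)
print(last_usable_date)
```

Under these assumptions the last usable date is March 7, 2020, which is the end date used for all three ranking windows in this article.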

This function calculates the average of the basic reproduction number R0 over a specified period. If there are no valid values in the period (the result would be blank), it returns 0.

def calcR0Average(df, st, ed):
    df1 = df[(st <= df.date) & (df.date <= ed) ]
    df2 = df1[np.isnan(df1.R0) == False]
    df3 = df2['R0']
    ave = np.average(df3) if len(df3) > 0 else 0
    return ave
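To see how the blank case behaves, here is a small self-contained sketch of the same windowed average written with pandas methods only. The function name `mean_r0_in_window`, its `empty_value` parameter, and the sample values are hypothetical. It assumes, like the code above, a frame with `date` and `R0` columns.

```python
import pandas as pd


def mean_r0_in_window(df, st, ed, empty_value=0.0):
    """Average R0 over st <= date <= ed, ignoring missing values."""
    # between() is inclusive at both ends by default.
    in_window = df.loc[df["date"].between(st, ed), "R0"].dropna()
    # mean() of an empty Series is NaN, so the blank case is handled explicitly.
    return in_window.mean() if len(in_window) > 0 else empty_value


# Made-up values for illustration only.
sample = pd.DataFrame({
    "date": pd.date_range("2020-03-01", periods=5, freq="D"),
    "R0": [2.0, float("nan"), 1.0, float("nan"), 3.0],
})

# Window with two valid values (2.0 and 1.0).
ave = mean_r0_in_window(sample, pd.Timestamp(2020, 3, 1), pd.Timestamp(2020, 3, 3))
# Window whose only value is missing, so the fallback is used.
blank = mean_r0_in_window(sample, pd.Timestamp(2020, 3, 4), pd.Timestamp(2020, 3, 4))
print(ave, blank)
```

One consequence of falling back to 0 is that a prefecture with no data in the window sorts to the bottom of the ranking, where it cannot be told apart from a prefecture whose value really is 0.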

This function sorts the averages in descending order to produce the ranking.

def calcR0AveRank(dflist, st, ed):
    R0AveRank = [ [pref, calcR0Average(df, st, ed)] for df, pref in dflist]
    R0AveRank.sort(key = lambda x: x[1], reverse=True)
    df = pd.DataFrame(R0AveRank)
    df.columns = ['pref','R0ave']
    return df
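Here is a small standalone illustration of the sorting step, using made-up averages. Python's sort is stable even with `reverse=True`, so prefectures that tie (for example, those whose average fell back to 0) stay in their original list order.

```python
# Hypothetical [prefecture, average R0] pairs; the numbers are invented.
sample_rank = [
    ["Aomori", 0.0],
    ["Tokyo", 1.5],
    ["Hyogo", 4.2],
    ["Akita", 0.0],
    ["Osaka", 2.8],
]

# Same call as in calcR0AveRank: sort by the average, largest first.
sample_rank.sort(key=lambda x: x[1], reverse=True)

for position, (pref, ave) in enumerate(sample_rank, start=1):
    print(position, pref, ave)
```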

This function displays the ranking as a horizontal bar graph. It uses set_position to move the X axis to the top.

def showRank(dfrank, maxn, title):
    # dfrank is the ranking DataFrame returned by calcR0AveRank
    ax = dfrank.iloc[0:maxn,:].plot.barh(y='R0ave',x='pref',figsize=(8,10))
    ax.invert_yaxis()
    ax.grid(True, axis='x')
    ax.spines['bottom'].set_position(('axes',1.05))
    plt.title(title, y=1.05)
    plt.show()
    return ax

Finally, this part calculates the rankings and saves each figure.

# 2020/2/7 to 2020/3/7
st = pd.Timestamp(2020,2,7)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200207_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/2/23 to 2020/3/7
st = pd.Timestamp(2020,2,23)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200223_20200307"
dfR0_1 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_1, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
# 2020/3/1 to 2020/3/7
st = pd.Timestamp(2020,3,1)
ed = pd.Timestamp(2020,3,7)
title = "R0ave_ranking_20200301_20200307"
dfR0_2 = calcR0AveRank(dflist, st, ed)
ax = showRank(dfR0_2, 13, title)
fig = ax.get_figure()
fig.savefig("{}.jpg".format(title))
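The three blocks differ only in the start date, and each title repeats the dates as text. One way to keep the two in sync is to build the title from the dates themselves. This is only a sketch: `make_title` is a hypothetical helper, and it uses `datetime.date` so that it runs without pandas, although `pd.Timestamp` has the same `strftime` method.

```python
from datetime import date


def make_title(st, ed, prefix="R0ave_ranking"):
    """Build a file-name-friendly title such as R0ave_ranking_20200207_20200307."""
    return "{}_{}_{}".format(prefix, st.strftime("%Y%m%d"), ed.strftime("%Y%m%d"))


ed = date(2020, 3, 7)
start_dates = (date(2020, 2, 7), date(2020, 2, 23), date(2020, 3, 1))
titles = [make_title(st, ed) for st in start_dates]
for title in titles:
    print(title)
```

With a helper like this, the three ranking blocks could be folded into a single loop over the start dates.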

Ranking result

Now let's take a look at the calculation results. If $R_0 > 1$, the infection is spreading, and if $R_0 < 1$, the infection is converging. We will look at the most recent month, two weeks, and one week.

Ranking from 2020/2/7 to 2020/3/7 (for the last month)

Hyogo Prefecture is at the top. As reported, clusters seem to have occurred at the same time in multiple places such as day care facilities, certified children's centers, and hospitals.
Hokkaido is second, apparently because the period includes the Sapporo Snow Festival (February 4 to February 11).
Osaka Prefecture is third. This is probably because the period includes the live house cluster (February 15 to February 24), as reported. Since Hyogo Prefecture (1st) and Osaka Prefecture (3rd) are adjacent, there is concern that travel between them could set off a chain of clusters.

Ranking from 2020/2/23 to 2020/3/7 (for the last 2 weeks)

Hyogo Prefecture is at the top, presumably for the same reason as above.
Surprisingly, Gunma Prefecture came in second. This seems to be because the value tends to come out higher when the number of infected people has only recently started to increase. As of March 20, the number of infected people in Gunma Prefecture is 11, and there are reports that a cluster is occurring at a clinic in the prefecture.
Saitama Prefecture is third. Judging from the Saitama Prefecture Page, infection routes are for the most part being traced. As of March 20, the number of infected people in Saitama Prefecture is 44.

Ranking from 2020/3/1 to 2020/3/7 (for the last week)

Gunma Prefecture is at the top, presumably for the same reason as above.
Shiga Prefecture is second. As of March 20, the number of infected people in Shiga Prefecture is 4, so this seems to reflect a very recent outbreak.
Tokyo is third. Over the last month its value was about 1.5, but over the last week it was about 2.2, and I am worried that it is rising gradually. The increase in returnees following the tightening of travel restrictions by the Ministry of Foreign Affairs (China, South Korea, the EU, and then the whole world) may also be a factor.

Consideration

The number of prefectures where the number of infected people is increasing ($R_0 > 1$) is 13 in the last month, 11 in the last two weeks, and 12 in the last week. There seems to be a tendency for the top group to have $R_0 > 4$ and the middle group to have $1 < R_0 < 3$.
Prefectures with $R_0 > 1$ tend to be found in adjacent metropolitan areas, notably the chains Hyogo-Osaka-Shiga and Kanagawa-Tokyo-Saitama-Chiba-Gunma.
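Counts and groupings of this kind can be read directly off a ranking frame with the `pref` and `R0ave` columns produced by `calcR0AveRank`. The sketch below uses invented numbers and placeholder prefecture names, and the cut-offs of 1, 3, and 4 simply mirror the rough grouping described above. It is not the calculation behind the figures quoted in this section.

```python
import pandas as pd

# Invented ranking frame in the same shape as the output of calcR0AveRank.
sample = pd.DataFrame({
    "pref": ["A", "B", "C", "D", "E"],
    "R0ave": [4.5, 2.2, 1.3, 0.8, 0.0],
})

r0 = sample["R0ave"]

# Number of prefectures where infection is increasing (R0 > 1).
n_increasing = int((r0 > 1).sum())

# Rough groups: top (R0 > 4) and middle (1 < R0 < 3).
top_group = sample.loc[r0 > 4, "pref"].tolist()
middle_group = sample.loc[(r0 > 1) & (r0 < 3), "pref"].tolist()

print(n_increasing, top_group, middle_group)
```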

Furthermore ...

As Professor Nishiura does, it may be better to analyze returnees and residents separately. I am also a little skeptical about taking the arithmetic mean of $R_0$, so the calculation method may need some more work. I settled for the arithmetic mean because the geometric mean can become 0.
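The point about the geometric mean can be checked on a few numbers. The values below are made up, and the sketch computes both means by hand (the geometric mean as the n-th root of the product) rather than relying on a library routine.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product; a single 0 makes the whole product 0."""
    return math.prod(values) ** (1.0 / len(values))


with_zero = [4.0, 2.0, 0.0, 1.0]     # one day with R0 = 0
without_zero = [4.0, 2.0, 1.0, 0.5]  # no zeros

print(arithmetic_mean(with_zero), geometric_mean(with_zero))
print(arithmetic_mean(without_zero), geometric_mean(without_zero))
```

A single zero in the window pulls the geometric mean to 0 regardless of the other values, while the arithmetic mean still reflects them.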
COVID-19 (SARS-CoV-2) is characterized by a long incubation period and infectious period, and there is a time lag before a positive test. I am concerned that whack-a-mole countermeasures will drag on and wear everyone out. It would be more efficient to use the ranking to identify areas with high $R_0$ and to concentrate resources there, strengthening cluster countermeasures and requesting self-restraint.

Reference link

I referred to the following pages.

"Situation analysis and recommendations for measures against new coronavirus infection" (March 19, 2020)
Calculate the transition of the basic reproduction number of the new coronavirus by prefecture
Prediction of infectious disease epidemics: Quantitative issues in infectious disease mathematical models
Map of the number of people infected with the new coronavirus by prefecture (provided by Jag Japan Co., Ltd.)
[Eliminate garbled Japanese characters in matplotlib (Windows)](https://datumstudio.jp/blog/matplotlib%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E3%82%92%E8%A7%A3%E6%B6%88%E3%81%99%E3%82%8Bwindows%E7%B7%A8)
Saitama Prefecture Page

[PYTHON] Let's put out a ranking of the number of effective reproductions of the new coronavirus by prefecture