[PYTHON] [SIR model analysis] Infection peaks in various parts of Japan ♬

I had been looking for raw data and finally found it on the Ministry of Health, Labour and Welfare website, so I analyzed it right away. As before, the calculation is based on the SIR model, and both the calculation and the interpretation are the work of an amateur. Please use your own judgment, at your own risk. The data is provided as PDF, so extracting it took some effort. I refer to yesterday's material, and I used the linked PDF listed as reference ②.

【reference】
① Current status of new coronavirus infection (April 19, 2nd year of Reiwa) @ Ministry of Health, Labour and Welfare
② Number of patient reports by prefecture in domestic cases (posted on April 19, 2020)

What I did

・ Data processing
・ Code explanation
・ Situation in Japan and Tokyo
・ Other areas of concern

・ Data processing

The procedure is as follows.
・ Copy and paste the table from the above PDF into Notepad.
・ Change the delimiter to "," (comma).
・ Change the extension from txt to csv.
・ Sort by city in ascending order and check for missing entries. Depending on the date, the prefecture name may or may not be included.
・ Clean up the result into one data file per day.
・ Read these daily files in a batch with the program below and output three csv files: confirmed, recovered, and deaths. Each file has the city names as rows and the dates as columns.
・ The three files are placed in the following directory.
COVID-19_Japan/data/
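The delimiter conversion in the steps above could also be scripted rather than done by hand. This is a minimal sketch, assuming the pasted text is whitespace-delimited with one region per line. The function name `text_to_csv` and the sample rows are hypothetical and not part of the original workflow.

```python
import csv
import io


def text_to_csv(text):
    """Convert whitespace-delimited lines to CSV text (assumed input layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        fields = line.split()  # split on any run of spaces or tabs
        if fields:  # skip blank lines left over from the copy and paste
            writer.writerow(fields)
    return out.getvalue()


# Hypothetical sample in the assumed layout: region, cases
sample = "Region cases\nTokyo 100\n\nOsaka 50\n"
print(text_to_csv(sample))
```

Region names that themselves contain spaces would need a different splitting rule, so the result should still be checked by eye as described in the procedure.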

・ Code explanation

The program that reads the data one day at a time, appends it to the three files, and writes them out is available below.
COVID-19_Japan/test_pd.py
The part that produces one of the output files is explained below. It is essentially an exercise in applying pandas.

import pandas as pd

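# Existing output file; add encoding="cp932" only when reading a cp932 file
test0 = pd.read_csv('COVID-19/csse_covid_19_data/japan/test_confirmed.csv')
# Dates of the daily files to read; a list (not a set) keeps the columns in date order
day_list = [326,327,328,329,331,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418]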

First, the existing output file is read into test0, and new columns are appended to it. Specifying encoding="cp932" is necessary for the first read, but once the file has been written by pandas it is no longer needed (and causes an error), because to_csv writes UTF-8 by default. day_list specifies which daily files to read; 401.csv and so on are read in the loop below.

for day in day_list:
    # Read one day's file (cp932-encoded)
    data = pd.read_csv('COVID-19/csse_covid_19_data/japan/{}.csv'.format(day), encoding="cp932")
    # Write only the needed columns to an intermediate file
    data.to_csv('COVID-19/csse_covid_19_data/japan/tokyo_confirmed.csv', columns=['Region','cases'], index=False)

This code reads the daily files one by one and writes only the relevant columns, ['Region','cases'], to tokyo_confirmed.csv.

    test0_ = pd.read_csv('COVID-19/csse_covid_19_data/japan/tokyo_confirmed.csv')

Then the intermediate file is read back in as test0_. At this point, test0_ holds only the relevant columns taken from data.

    # Add this day's cases as a new column named after the date
    test0[str(day)] = test0_['cases']

    # Overwrite the output file with the updated table
    test0.to_csv('COVID-19/csse_covid_19_data/japan/test_confirmed.csv', index=False)
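
The column assignment above matches rows by position, so it relies on every daily file listing the regions in the same order. As a hedged alternative, the sketch below selects the columns in memory and aligns them on the Region column with `merge`. The helper name `add_day` and the sample tables are hypothetical, and this is not the code in test_pd.py.

```python
import io

import pandas as pd


def add_day(table, daily, day):
    """Return table with one more column, aligned on 'Region' (hypothetical helper)."""
    # Keep only the needed columns and name the new column after the date
    column = daily[['Region', 'cases']].rename(columns={'cases': str(day)})
    # A left merge keeps the row order of table; missing regions become NaN
    return table.merge(column, on='Region', how='left')


# Hypothetical sample data standing in for test_confirmed.csv and 401.csv
table = pd.read_csv(io.StringIO("Region,331\nTokyo,10\nOsaka,5\n"))
daily = pd.read_csv(io.StringIO("Region,cases\nOsaka,7\nTokyo,12\n"))

table = add_day(table, daily, 401)
print(table)
```

This also avoids the round trip through the intermediate csv file, at the cost of requiring region names to be spelled identically in every daily file.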

・ Situation in Japan and Tokyo

The analysis program has also been changed slightly this time, as in the linked code below. Specifically, the bar graph now shows the number of newly infected people per day.
・ COVID-19_Japan/fitting_japan.py
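
Daily new cases can be obtained from a cumulative table by differencing adjacent date columns. This is a minimal sketch, assuming the wide layout described earlier (one row per region, one column per date) and cumulative counts. The sample numbers are made up, and fitting_japan.py may compute this differently.

```python
import pandas as pd

# Hypothetical cumulative confirmed counts: rows are regions, columns are dates
confirmed = pd.DataFrame(
    {'Region': ['Tokyo', 'Osaka'], '401': [10, 5], '402': [14, 5], '403': [20, 9]}
).set_index('Region')

# Difference along the date axis; the first date has no previous day,
# so its all-NaN column is dropped before converting back to integers
daily_new = confirmed.diff(axis=1).iloc[:, 1:].astype(int)
print(daily_new)
```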

・ Situation in Japan

This is the same picture as in the earlier look at the world situation, but let's look at it again. The total is still likely to reach around 20,000 in the next two weeks.
exterpolate_総計_gamma_R_2.png
The number of infections is likely to peak in about a week, but the infection rate dips every Saturday and Sunday, and it is unpredictable whether it will continue down to 0.
removed_総計_gammaR_2_II.png
In the graph below, the blue plot is the so-called effective reproduction number, which stays at around 10 and is not decreasing at all. Normally this value falls to 1 and the epidemic heads toward its end. For some reason, the number of cures in Japan has not increased. This appears to have pushed $\gamma$ down, which in turn lowers the infection rates above. In the end, I don't think the epidemic will end unless the number of cures increases. Beds are not unlimited, so otherwise this may lead to medical collapse somewhere. In that sense, the high I / (R + D) in the above figure is dangerous.
removed_総計_gamma_R_2.png
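
The link between few cures, a small $\gamma$, and a large effective reproduction number follows from the standard SIR relations $d(R + D)/dt = \gamma I$ and $dI/dt = \gamma (R_e - 1) I$, where I is the number currently infected, R + D is the cumulative number of cures plus deaths, and $R_e$ is the effective reproduction number. The sketch below estimates both quantities from daily series by finite differences. It is a textbook-style illustration under these assumptions, not the fitting code used for the figures; the function name `estimate_gamma_rt` and the input numbers are hypothetical.

```python
def estimate_gamma_rt(confirmed, recovered, deaths):
    """Finite-difference estimates of gamma and the effective reproduction number.

    Inputs are cumulative daily counts of equal length (assumed layout).
    """
    removed = [r + d for r, d in zip(recovered, deaths)]
    infected = [c - x for c, x in zip(confirmed, removed)]  # currently infected
    gammas, rts = [], []
    for t in range(len(infected) - 1):
        if infected[t] <= 0:
            continue  # nothing can be estimated without active cases
        gamma = (removed[t + 1] - removed[t]) / infected[t]
        if gamma <= 0:
            continue  # no removals that day: R_e is undefined in this sketch
        growth = (infected[t + 1] - infected[t]) / infected[t]
        gammas.append(gamma)
        rts.append(1 + growth / gamma)
    return gammas, rts


# Made-up numbers: 10 new confirmed cases and 1 removal per day
confirmed = [100, 110, 120]
recovered = [0, 1, 2]
deaths = [0, 0, 0]
gammas, rts = estimate_gamma_rt(confirmed, recovered, deaths)
print(gammas, rts)
```

With these made-up inputs the removal rate is about 1% per day and the estimate comes out near 10, which illustrates how a small $\gamma$ inflates the value even when the daily growth is modest.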

・ Situation in Tokyo

I think this is the first figure of this kind in Japan. The trend in Tokyo is almost the same as the nationwide trend.
exterpolate_東京_gamma_R_2.png
As with the whole country, the number of infections seems about to peak, but since the cause is the fall in $\gamma$, it will probably not go to 0.
removed_東京_gammaR_2_II.png
Tokyo is worse than the whole country: the effective reproduction number is close to 100. I wonder why the number of cures (about 5%) does not increase at all. Without cures, medical care will certainly collapse. The number of infections looks slightly saturated, but judging from the graph above, it is likely to approach 10,000 in two weeks.

・ Other areas of concern

・ Osaka

The number of infected people in Osaka has exceeded 1,000 and is still increasing. It looks set to reach about 2,000 in two weeks.
exterpolate_大阪_gamma_R_2.png
However, the number of infections is likely to peak in about a week, so it is on the verge of saturating.
removed_大阪_gammaR_2_II.png
Unlike Tokyo, the number of cures has started to increase a little, and I / (R + D) is likely to decrease. However, the effective reproduction number is still about 7, so the situation remains unpredictable.
removed_大阪_gamma_R_2.png

・ Kanagawa

The rate of increase has slowed, but 1,000 people is just around the corner. The number of cures is likely to increase, and I / (R + D) is likely to decrease as well.
exterpolate_神奈川_gamma_R_2.png
A peak seems to be coming, but the situation is unpredictable.
removed_神奈川_gammaR_2_II.png
The effective reproduction number is still over 10, and the infection rate is falling only because $\gamma$ is falling, so it is unpredictable whether the epidemic can be brought to an end.
removed_神奈川_gamma_R_2.png

・ Chiba

The numbers keep creeping up and no peak of infection is visible. It looks like a very dangerous situation, with about 1,000 people in a week. There are signs that the number of cures is increasing, and if that trend proves real the situation is likely to improve.
exterpolate_千葉_gamma_R_2.png
removed_千葉_gammaR_2_II.png
removed_千葉_gamma_R_2.png

・ Saitama

Although the rate of increase has slowed, the slope is steeper than in Chiba, and the total is likely to reach 1,000 in a week. The number of new infections is increasing every day, so the situation is unpredictable.
exterpolate_埼玉_gamma_R_2.png
removed_埼玉_gammaR_2_II.png
removed_埼玉_gamma_R_2.png

・ Okinawa

The situation is dangerous because the number of infections began increasing rapidly two weeks ago and the slope is steep.
exterpolate_沖縄_gamma_R_2.png
removed_沖縄_gammaR_2_II.png
removed_沖縄_gamma_R_2.png

・ Hokkaido

In Hokkaido, the number of infections seems to have risen sharply starting two weeks ago. Naturally, it has not yet reached its peak. As with Okinawa, it needs to be watched closely.
exterpolate_北海道_gamma_R_2.png
removed_北海道_gammaR_2_II.png
removed_北海道_gamma_R_2.png

Summary

・ Japan suddenly entered the second stage in late March. Only about three weeks have passed, so the end is not yet in sight.
・ In Tokyo, the number of infections looks likely to peak. However, the number of cures is extremely small, and it is worrying that medical collapse is likely if this continues.
・ The infection appears to be just starting to spread in Hokkaido and Okinawa, and the situation is unpredictable.
・ No peak in the number of infections is visible yet in the other areas.

・ Next, I would like to solve the differential equation again and look at the overall course of the infection.
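
As a pointer toward that next step, the standard SIR equations can be integrated numerically. The sketch below uses a simple forward Euler step. The function name `simulate_sir` and the parameter values (`beta`, `gamma`, population, initial cases) are arbitrary illustrations, not values fitted to the Japanese data.

```python
def simulate_sir(beta, gamma, population, infected0, days, dt=0.1):
    """Integrate dS/dt=-beta*S*I/N, dI/dt=beta*S*I/N-gamma*I, dR/dt=gamma*I.

    Forward Euler with step dt; returns one (S, I, R) tuple per day.
    """
    s, i, r = float(population - infected0), float(infected0), 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Arbitrary illustrative parameters (not fitted values)
history = simulate_sir(beta=0.3, gamma=0.1, population=1_000_000, infected0=10, days=200)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("peak day:", peak_day, "infected at peak:", round(history[peak_day][1]))
```

In this model the number of infected people peaks when S falls to N·gamma/beta, which is why a smaller `gamma` postpones and enlarges the peak.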
