[PYTHON] While studying pandas, I did an independent study on when corona will end and whether lockdowns were meaningful.

Corona will end.

It has been a month since the state of emergency over the coronavirus was declared. I had been working in Europe as a researcher, so I have been affected by this virus since mid-March. With all this turmoil there are no longer any jobs in Europe, and although I am looking for a job in Japan, I am having a hard time getting replies. (It is Golden Week, so that is only natural. Thank you all the same.) Just sitting at home felt like a waste of time, so I did an independent study to see whether corona will end, which doubled as practice with pandas.

What to use

Our World in Data coronavirus dataset (CSV)

This was the only one that was easy to use; I couldn't find another good one. Please let me know if you know of a good database. https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.csv

Wikipedia lockdown page

https://en.wikipedia.org/wiki/Curfews_and_lockdowns_related_to_the_2019%E2%80%9320_coronavirus_pandemic#cite_note-51

Jupyter and pandas

Since this is Qiita, I will skip the introduction to these tools.

Data preparation

First, convert the Wikipedia page into a CSV file.

It's really hard.

Screen Shot 2020-05-03 at 15.46.51.png

Read the CSV and organize the data by country

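import pandas as pd

# Load the OWID data and parse the date column
df = pd.read_csv('/home/username/COVID19/owid-covid-data.csv')
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

# Load the lockdown table made from Wikipedia and parse its dates
countries = pd.read_csv('/home/username/COVID19/countries.csv')
countries['lockdown_begins'] = pd.to_datetime(countries['lockdown_begins'], format='%Y-%m-%d')
countries['lockdown_ends'] = pd.to_datetime(countries['lockdown_ends'], format='%Y-%m-%d')

# Drop countries with missing dates and index by ISO code
countries = countries.dropna()
countries.index = countries.iso_code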

Loaded into pandas, the data looks like this (I adjusted the index a little).

Wikipedia data (lockdown start date and actual or planned end date): Screen Shot 2020-05-03 at 15.54.02.png

Raw data: Screen Shot 2020-05-03 at 15.59.33.png
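As a side note, the date parsing and cleanup can also be done while reading the file. The following is only a minimal sketch on a made-up stand-in for countries.csv (the rows and the codes AAA, BBB, CCC are invented; only the column names follow the code above), assuming the dates are ISO formatted.

```python
import io
import pandas as pd

# Made-up stand-in for countries.csv; only the column names come from the article.
csv_text = """iso_code,lockdown_begins,lockdown_ends
AAA,2020-03-17,2020-05-11
BBB,2020-03-23,
CCC,,
"""

# parse_dates converts the listed columns to datetimes while reading;
# empty cells become NaT (missing).
countries = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["lockdown_begins", "lockdown_ends"],
)

# Keep only countries that have both dates, then index by ISO code.
countries = countries.dropna().set_index("iso_code", drop=False)
print(countries)
```

Here only the row with both dates survives dropna(), which is the same filtering the code above applies.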

Take a subset of the data by country and index the date

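def country(df, country_name):
    # country_name is an ISO code such as 'JPN' (it is matched against iso_code)
    newdata = df.loc[df.iso_code == country_name]
    # set_index returns a new frame, which avoids modifying a slice in place
    return newdata.set_index('date')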

Now let's play with the data.

Anything will do, so let's plot something

Let's start with the plot we see so often: daily confirmed infections, daily deaths, and their cumulative totals. A 7-day moving average seems to be a good way to smooth out the dips in weekend reporting. Let's calculate the 7-day moving average and the positive rate.

def rolling_average(data):
    data = data.copy()  # work on a copy so the caller's frame is not modified
    columns = data.columns

    # Daily tests from the cumulative total (added after `columns` is captured, so it is not smoothed below)
    data['new_tests2'] = data['total_tests'].diff()

    # 7-day moving average of the columns at positions 2-12 (assumed to be the numeric case, death, and test columns)
    for column in columns[2:13]:
        data[column] = data[column].rolling(7).mean()

    # Cumulative and daily positive rates from the smoothed case counts and the test counts
    data['positive_rate_total'] = data['total_cases'] / data['total_tests']
    data['positive_rate_diff'] = data['new_cases'] / data['new_tests2']

    return data
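To see why a 7-day window helps with weekend data, here is a small sketch on synthetic counts (the numbers are made up, not taken from the OWID file): 100 reports on weekdays and 40 on weekends.

```python
import pandas as pd

# Synthetic daily counts (made up): 100 on weekdays, 40 on weekends.
dates = pd.date_range("2020-04-06", periods=21, freq="D")  # 2020-04-06 is a Monday
counts = [40 if d.dayofweek >= 5 else 100 for d in dates]
daily = pd.Series(counts, index=dates, name="new_cases")

# Every 7-day window contains five weekdays and two weekend days,
# so the weekly reporting pattern cancels out of the average.
smoothed = daily.rolling(7).mean()

# The first six days have no full window yet and come out as NaN.
print(smoothed)
```

The raw series jumps between 40 and 100, while the smoothed one stays flat once a full week of data is available.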

test_jpn.png

The moving average of new infections is declining, and deaths look likely to pass their peak soon. That is exactly what the news says (as of May 3). As reported, the daily PCR positive rate appears to be around 10% nationwide. Data for Tokyo alone would look quite different.

Main topic 1: What is the effect of lockdown?

Now that we have reproduced the news in Jupyter, let's look at the relationship with lockdown. Following the free Financial Times article that I have been watching all along (I checked it every day during my first three weeks in Europe), let's plot everything relative to the day on which the moving average of daily deaths reached 3. This is the reference date for the spread of the coronavirus in this independent study. Every curve is divided by its maximum value (the peak number of infections or deaths).

japan.png

April 7, the day after the state of emergency was announced (April 6), was the day the number reached three. On that day Japan, which had not been on the Financial Times death chart until then, appeared on it, and that was extremely painful to see. I imagined an explosion of infections killing tens of thousands of people in Japan, and I trembled. Fortunately, new infections passed their peak in less than two weeks, and I think it was around April 17 that I felt a little relieved. That is how it felt, right? I can't say anything about the peak of deaths, but since it lags by about two weeks in the same graph for other countries, let's be optimistic and hope that it passed last week. (The number of deaths seems to vary with each country's medical situation.) May 4 will be day 25. The planned lifting of the state of emergency at the end of the month is at the far right of the graph. At first glance, new confirmed infections look like they will have settled down by then.

Of course, you can make this figure for any country. That said, the content is the same as the Financial Times chart, so let's just look at the UK and France, to which I am personally indebted, and Australia, which is doing quite well among developed countries.

gbr.png

In the UK, the lockdown came eight days after the moving average of deaths passed three, and infections and deaths kept rising for the following two weeks. As someone who was looked after there, I find this very painful. At this stage there is, of course, no prospect of lifting the lockdown. By the way, the rightmost line is today. Sad.

fra.png

France was late to lock down, but by imposing a strict ban on going out, with fines for unnecessary outings, it forced a significant restriction on people's movement. The effect is now very clear. In absolute numbers about 1,000 people are still confirmed infected every day, but new infections have fallen to about 20% of their peak, and deaths are decreasing too. At the time I was bewildered by the strict ban on going out, but it is now proving very effective. The rightmost line is today's date.

aus.png

I am not very familiar with the situation in Australia (did it not lock down?), but it seems to have stopped international flights at a very early stage. (That day is used in place of the lockdown start date.) As you can see, the effect was extremely high in Australia, where measures were taken before deaths occurred: the peak in infections was over before the moving average of deaths reached three. In other words, by the reference day things were already settling down. The line on the right is May 3. In this case, if people go out carefully and avoid clusters (the "three Cs"), they should be able to return to their original lives by the end of this year.
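The alignment used in these figures can be sketched roughly as follows. This is not the code that produced the plots above; it is a minimal illustration on made-up numbers, and the function name align_to_reference and its threshold parameter are hypothetical.

```python
import pandas as pd

def align_to_reference(deaths_ma, threshold=3.0):
    """Re-index a smoothed daily-death series by days since it first
    reached `threshold`, and divide it by its own maximum."""
    reached = deaths_ma[deaths_ma >= threshold]
    if reached.empty:
        raise ValueError("the series never reaches the threshold")
    reference_date = reached.index[0]

    # Days relative to the reference date (negative = before it).
    days = (deaths_ma.index - reference_date).days

    normalized = deaths_ma / deaths_ma.max()
    normalized.index = days
    return reference_date, normalized

# Made-up moving-average values, for illustration only.
dates = pd.date_range("2020-04-01", periods=8, freq="D")
deaths_ma = pd.Series([0.5, 1.0, 2.0, 3.5, 5.0, 10.0, 8.0, 6.0], index=dates)

reference_date, normalized = align_to_reference(deaths_ma)
print(reference_date.date())
print(normalized)
```

Dividing by the maximum puts countries of very different sizes on the same 0 to 1 scale, and the shared day-0 axis makes the timing of each peak comparable.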

What do you think? These are simple graphs, but I feel they show the importance of lockdown and border measures.

Main topic 2: An unpleasant graph

Apparently, the speed of lockdown and border measures was the key to fighting corona. Giving this the cool name "lockdown delta", I tried various plots of deaths and infections for the 50 countries with the highest cumulative number of infections, and only two figures turned out to be interesting. Here is the first one. It is a very unpleasant graph. After all, the vertical axis is the people we lost.

Daily_max_death_vs_lockdown_speed.png

(The label says lockdown efficiency, but it is simply the difference between the day the lockdown started and the reference day of 3 deaths. It is the lockdown delta, lol.) The further left a country is, the earlier it locked down; the further right, the later. The vertical axis is the peak of the moving average of daily deaths. As you can see, there is a clear correlation, with no need for machine learning or even drawing a line.

Japan has a lockdown delta of -1, so I think it did fairly well compared with other countries. The cluster countermeasures since the end of February and the effect of avoiding the three Cs may be showing here. There is some debate about whether the state of emergency should have been declared a week earlier, but what is done is done.
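For reference, the lockdown delta can be computed along these lines. This is a minimal sketch with made-up rows: the codes AAA, BBB, CCC and the columns reference_date and peak_deaths_ma are hypothetical names, and the correlation coefficient at the end is an extra that is not part of the original analysis.

```python
import pandas as pd

# Made-up example rows; real values would come from countries.csv and the OWID data.
summary = pd.DataFrame(
    {
        "lockdown_begins": pd.to_datetime(["2020-03-10", "2020-03-25", "2020-04-05"]),
        "reference_date": pd.to_datetime(["2020-03-15", "2020-03-20", "2020-04-06"]),
        "peak_deaths_ma": [20.0, 600.0, 15.0],
    },
    index=["AAA", "BBB", "CCC"],
)

# Lockdown delta in days: negative means the lockdown began before the reference day.
summary["lockdown_delta"] = (
    summary["lockdown_begins"] - summary["reference_date"]
).dt.days

# Pearson correlation between the delay and the peak of smoothed daily deaths.
corr = summary["lockdown_delta"].corr(summary["peak_deaths_ma"])

print(summary[["lockdown_delta", "peak_deaths_ma"]])
print(corr)
```

With this sign convention, a country that locked down the day before its reference day gets a delta of -1, which matches the value quoted for Japan.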