# Let's analyze Covid-19 (Corona) data using Python [For beginners]

I found a lecture video on an overseas YouTube channel that uses Python to analyze Covid-19 (Corona) data, such as the number of infected people, so I tried it myself.

## Target

Especially recommended for Python beginners who want to get used to Pandas and try data analysis. The lecture is in English, but the explanation is very simple and careful, so please take a look.

• I worked through the video in a Jupyter notebook, adding comments in Japanese. I recommend practicing while actually watching the video, but for those who are not comfortable with English or who just want to understand the flow of a data analysis, I wrote this article so that you can get the picture just by reading it.
• We use real data (the source is described later), but the goal is not a particularly sharp analysis result. The focus is on practice (with the Pandas library, for example).
• We use data up to May 1, 2020. (The lecture video uses data up to 3/22, when it was recorded.)

# Exercise

Analyzing Coronavirus with Python (COVID-19) by NeuralNine on YouTube

Use the following files from the linked dataset:

• time_series_covid19_confirmed_global.csv
• time_series_covid19_deaths_global.csv
• time_series_covid19_recovered_global.csv

These files contain the number of confirmed infections, deaths, and recoveries, respectively.

The file names are long, so rename them as follows.

• time_series_covid19_confirmed_global.csv → covid_confirmed.csv
• time_series_covid19_deaths_global.csv → covid_deaths.csv
• time_series_covid19_recovered_global.csv → covid_recovered.csv

Import the libraries and read the data

``````import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
``````

``````confirmed = pd.read_csv("covid_confirmed.csv")
deaths = pd.read_csv("covid_deaths.csv")
recovered = pd.read_csv("covid_recovered.csv")
``````

Let's display the confirmed data as a quick check. (The original video uses data up to 3/22, when it was recorded, but the data below runs to 5/1.)

``````confirmed.head()
``````
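| | Province/State | Country/Region | Lat | Long | 1/22/20 | 1/23/20 | 1/24/20 | 1/25/20 | 1/26/20 | 1/27/20 | ... | 4/22/20 | 4/23/20 | 4/24/20 | 4/25/20 | 4/26/20 | 4/27/20 | 4/28/20 | 4/29/20 | 4/30/20 | 5/1/20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | NaN | Afghanistan | 33.0000 | 65.0000 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1176 | 1279 | 1351 | 1463 | 1531 | 1703 | 1828 | 1939 | 2171 | 2335 |
| 1 | NaN | Albania | 41.1533 | 20.1683 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 634 | 663 | 678 | 712 | 726 | 736 | 750 | 766 | 773 | 782 |
| 2 | NaN | Algeria | 28.0339 | 1.6596 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 2910 | 3007 | 3127 | 3256 | 3382 | 3517 | 3649 | 3848 | 4006 | 4154 |
| 3 | NaN | Andorra | 42.5063 | 1.5218 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 723 | 723 | 731 | 738 | 738 | 743 | 743 | 743 | 745 | 745 |
| 4 | NaN | Angola | -11.2027 | 17.8739 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 25 | 25 | 25 | 25 | 26 | 27 | 27 | 27 | 27 | 30 |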

5 rows × 105 columns

• Lat is the latitude and Long is the longitude.

This time we don't really need Province/State, Lat, or Long, so let's drop those columns.

``````confirmed = confirmed.drop(['Province/State','Lat','Long'],axis=1)
deaths = deaths.drop(['Province/State','Lat','Long'],axis=1)
recovered = recovered.drop(['Province/State','Lat','Long'],axis=1)
``````

Some countries are split across several rows by province or state, so let's aggregate the data by Country/Region.

``````confirmed = confirmed.groupby(confirmed["Country/Region"]).aggregate("sum")
deaths = deaths.groupby(deaths["Country/Region"]).aggregate("sum")
recovered = recovered.groupby(recovered["Country/Region"]).aggregate("sum")
``````
``````confirmed.head()
``````
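| Country/Region | 1/22/20 | 1/23/20 | 1/24/20 | 1/25/20 | 1/26/20 | 1/27/20 | 1/28/20 | 1/29/20 | 1/30/20 | 1/31/20 | ... | 4/22/20 | 4/23/20 | 4/24/20 | 4/25/20 | 4/26/20 | 4/27/20 | 4/28/20 | 4/29/20 | 4/30/20 | 5/1/20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Afghanistan | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1176 | 1279 | 1351 | 1463 | 1531 | 1703 | 1828 | 1939 | 2171 | 2335 |
| Albania | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 634 | 663 | 678 | 712 | 726 | 736 | 750 | 766 | 773 | 782 |
| Algeria | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 2910 | 3007 | 3127 | 3256 | 3382 | 3517 | 3649 | 3848 | 4006 | 4154 |
| Andorra | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 723 | 723 | 731 | 738 | 738 | 743 | 743 | 743 | 745 | 745 |
| Angola | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 25 | 25 | 25 | 25 | 26 | 27 | 27 | 27 | 27 | 30 |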

5 rows × 101 columns
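To make the aggregation step concrete, here is a minimal sketch on made-up numbers (the toy table and its values are my own assumption, not the real dataset). It shows how `groupby(...).sum()` collapses a country that is split into several province rows into a single row, with Country/Region becoming the index.

```python
import pandas as pd

# Toy data (made up): one country split into two provinces, one unsplit.
toy = pd.DataFrame({
    "Country/Region": ["Australia", "Australia", "Japan"],
    "1/22/20": [0, 1, 2],
    "1/23/20": [3, 4, 2],
})

# Summing per country collapses the two Australian rows into one.
# "Country/Region" becomes the index, so it no longer counts as a column.
by_country = toy.groupby("Country/Region").sum()
print(by_country)
```

This is also why the column count above drops from 105 to 101: three columns were deleted, and Country/Region moved from a column to the index.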

Right now the dates are the columns (features), but this time we want the countries as columns, so we transpose the data (swap rows and columns).

``````confirmed = confirmed.T
deaths = deaths.T
recovered = recovered.T
``````
``````confirmed.head()
``````
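| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1/22/20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1/23/20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| 1/24/20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| 1/25/20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| 1/26/20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | ... | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |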

5 rows × 187 columns

At this point, the data is ready. Let's move on to the calculations.

First, let's look at how the number of infected people changes. What we need here is the difference between each day's count and the previous day's count.

``````new_cases = confirmed.copy()
``````
``````for day in range(1,len(confirmed)):
    new_cases.iloc[day] = confirmed.iloc[day] - confirmed.iloc[day - 1]
``````
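As a side note, the same day-over-day difference can be computed without a loop. The sketch below uses made-up numbers (`confirmed_toy` is my own toy data, not the real dataset) and compares the loop from the video with pandas' `diff()`. Note that `diff()` leaves NaN in the first row, so I fill it with the first day's count to match what the loop produces.

```python
import pandas as pd

# Made-up cumulative counts for two countries over four days.
confirmed_toy = pd.DataFrame(
    {"A": [0, 2, 5, 9], "B": [1, 1, 4, 4]},
    index=["1/22/20", "1/23/20", "1/24/20", "1/25/20"],
)

# Loop version, as in the article: the first row keeps the copied values.
new_cases_loop = confirmed_toy.copy()
for day in range(1, len(confirmed_toy)):
    new_cases_loop.iloc[day] = confirmed_toy.iloc[day] - confirmed_toy.iloc[day - 1]

# Vectorized version: diff() gives NaN for the first row, so fill that row
# from the original frame and convert back to integers.
new_cases_diff = confirmed_toy.diff().fillna(confirmed_toy).astype(int)

print(new_cases_diff)
```

For a table of this size the loop is perfectly fine; the vectorized form is mainly shorter and is the more common pandas style.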

View the data for the last 10 days

``````new_cases.tail(10)
``````
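| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4/22/20 | 84 | 25 | 99 | 6 | 1 | 1 | 113 | 72 | 7 | 52 | ... | 4466 | 8 | 38 | 3 | 0 | 8 | 0 | 0 | 4 | 0 |
| 4/23/20 | 103 | 29 | 97 | 0 | 0 | 0 | 291 | 50 | 10 | 77 | ... | 4608 | 14 | 42 | 23 | 0 | 6 | 0 | 0 | 2 | 0 |
| 4/24/20 | 72 | 15 | 120 | 8 | 0 | 0 | 172 | 73 | 15 | 69 | ... | 5394 | 6 | 46 | 7 | 2 | 4 | 0 | 0 | 8 | 1 |
| 4/25/20 | 112 | 34 | 129 | 7 | 0 | 0 | 173 | 81 | 17 | 77 | ... | 4929 | 33 | 58 | 5 | 0 | -142 | 0 | 0 | 0 | 2 |
| 4/26/20 | 68 | 14 | 126 | 0 | 1 | 0 | 112 | 69 | 20 | 77 | ... | 4468 | 10 | 7 | 2 | 0 | 0 | 0 | 0 | 4 | 0 |
| 4/27/20 | 172 | 10 | 135 | 5 | 1 | 0 | 111 | 62 | 7 | 49 | ... | 4311 | 14 | 35 | 4 | 0 | 0 | 0 | 0 | 0 | 1 |
| 4/28/20 | 125 | 14 | 132 | 0 | 0 | 0 | 124 | 59 | 23 | 83 | ... | 4002 | 5 | 35 | 0 | 0 | 1 | 0 | 0 | 7 | 0 |
| 4/29/20 | 111 | 16 | 199 | 0 | 0 | 0 | 158 | 65 | 8 | 45 | ... | 4091 | 5 | 63 | 2 | 0 | 1 | 0 | 5 | 2 | 0 |
| 4/30/20 | 232 | 7 | 158 | 2 | 0 | 0 | 143 | 134 | 14 | 50 | ... | 6040 | 13 | 37 | 2 | 0 | 0 | 0 | 0 | 9 | 8 |
| 5/1/20 | 164 | 9 | 148 | 0 | 3 | 1 | 104 | 82 | 12 | 79 | ... | 6204 | 5 | 47 | 2 | 0 | 9 | 0 | 1 | 3 | 0 |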

10 rows × 187 columns

Let's compare this with the cumulative confirmed data.

``````confirmed.tail(10)
``````
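| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4/22/20 | 1176 | 634 | 2910 | 723 | 25 | 24 | 3144 | 1473 | 6652 | 14925 | ... | 134638 | 543 | 1716 | 288 | 268 | 474 | 6 | 1 | 74 | 28 |
| 4/23/20 | 1279 | 663 | 3007 | 723 | 25 | 24 | 3435 | 1523 | 6662 | 15002 | ... | 139246 | 557 | 1758 | 311 | 268 | 480 | 6 | 1 | 76 | 28 |
| 4/24/20 | 1351 | 678 | 3127 | 731 | 25 | 24 | 3607 | 1596 | 6677 | 15071 | ... | 144640 | 563 | 1804 | 318 | 270 | 484 | 6 | 1 | 84 | 29 |
| 4/25/20 | 1463 | 712 | 3256 | 738 | 25 | 24 | 3780 | 1677 | 6694 | 15148 | ... | 149569 | 596 | 1862 | 323 | 270 | 342 | 6 | 1 | 84 | 31 |
| 4/26/20 | 1531 | 726 | 3382 | 738 | 26 | 24 | 3892 | 1746 | 6714 | 15225 | ... | 154037 | 606 | 1869 | 325 | 270 | 342 | 6 | 1 | 88 | 31 |
| 4/27/20 | 1703 | 736 | 3517 | 743 | 27 | 24 | 4003 | 1808 | 6721 | 15274 | ... | 158348 | 620 | 1904 | 329 | 270 | 342 | 6 | 1 | 88 | 32 |
| 4/28/20 | 1828 | 750 | 3649 | 743 | 27 | 24 | 4127 | 1867 | 6744 | 15357 | ... | 162350 | 625 | 1939 | 329 | 270 | 343 | 6 | 1 | 95 | 32 |
| 4/29/20 | 1939 | 766 | 3848 | 743 | 27 | 24 | 4285 | 1932 | 6752 | 15402 | ... | 166441 | 630 | 2002 | 331 | 270 | 344 | 6 | 6 | 97 | 32 |
| 4/30/20 | 2171 | 773 | 4006 | 745 | 27 | 24 | 4428 | 2066 | 6766 | 15452 | ... | 172481 | 643 | 2039 | 333 | 270 | 344 | 6 | 6 | 106 | 40 |
| 5/1/20 | 2335 | 782 | 4154 | 745 | 30 | 25 | 4532 | 2148 | 6778 | 15531 | ... | 178685 | 648 | 2086 | 335 | 270 | 353 | 6 | 7 | 109 | 40 |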

10 rows × 187 columns

For example, Afghanistan, Algeria, Argentina, and the United Kingdom already have a large number of infected people, and their number of new cases is still high.

In new_cases we looked at the daily increase in the number of infected people. Next, let's look at the growth rate, which is (increase on the day / number of infected people on the previous day) * 100.

``````growth_rate = confirmed.copy()

for day in range(1,len(growth_rate)):
    growth_rate.iloc[day] = ( new_cases.iloc[day] / confirmed.iloc[day-1] ) * 100
``````
``````growth_rate.tail(10)
``````
| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4/22/20 | 7.692308 | 4.105090 | 3.521878 | 0.836820 | 4.166667 | 4.347826 | 3.728143 | 5.139186 | 0.105342 | 0.349627 | ... | 3.430845 | 1.495327 | 2.264601 | 1.052632 | 0.000000 | 1.716738 | 0.0 | 0.000000 | 5.714286 | 0.000000 |
| 4/23/20 | 8.758503 | 4.574132 | 3.333333 | 0.000000 | 0.000000 | 0.000000 | 9.255725 | 3.394433 | 0.150331 | 0.515913 | ... | 3.422511 | 2.578269 | 2.447552 | 7.986111 | 0.000000 | 1.265823 | 0.0 | 0.000000 | 2.702703 | 0.000000 |
| 4/24/20 | 5.629398 | 2.262443 | 3.990688 | 1.106501 | 0.000000 | 0.000000 | 5.007278 | 4.793171 | 0.225158 | 0.459939 | ... | 3.873720 | 1.077199 | 2.616610 | 2.250804 | 0.746269 | 0.833333 | 0.0 | 0.000000 | 10.526316 | 3.571429 |
| 4/25/20 | 8.290155 | 5.014749 | 4.125360 | 0.957592 | 0.000000 | 0.000000 | 4.796230 | 5.075188 | 0.254605 | 0.510915 | ... | 3.407771 | 5.861456 | 3.215078 | 1.572327 | 0.000000 | -29.338843 | 0.0 | 0.000000 | 0.000000 | 6.896552 |
| 4/26/20 | 4.647984 | 1.966292 | 3.869779 | 0.000000 | 4.000000 | 0.000000 | 2.962963 | 4.114490 | 0.298775 | 0.508318 | ... | 2.987250 | 1.677852 | 0.375940 | 0.619195 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 4.761905 | 0.000000 |
| 4/27/20 | 11.234487 | 1.377410 | 3.991721 | 0.677507 | 3.846154 | 0.000000 | 2.852004 | 3.550974 | 0.104260 | 0.321839 | ... | 2.798678 | 2.310231 | 1.872659 | 1.230769 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 3.225806 |
| 4/28/20 | 7.339988 | 1.902174 | 3.753199 | 0.000000 | 0.000000 | 0.000000 | 3.097677 | 3.263274 | 0.342211 | 0.543407 | ... | 2.527345 | 0.806452 | 1.838235 | 0.000000 | 0.000000 | 0.292398 | 0.0 | 0.000000 | 7.954545 | 0.000000 |
| 4/29/20 | 6.072210 | 2.133333 | 5.453549 | 0.000000 | 0.000000 | 0.000000 | 3.828447 | 3.481521 | 0.118624 | 0.293026 | ... | 2.519864 | 0.800000 | 3.249097 | 0.607903 | 0.000000 | 0.291545 | 0.0 | 500.000000 | 2.105263 | 0.000000 |
| 4/30/20 | 11.964930 | 0.913838 | 4.106029 | 0.269179 | 0.000000 | 0.000000 | 3.337223 | 6.935818 | 0.207346 | 0.324633 | ... | 3.628914 | 2.063492 | 1.848152 | 0.604230 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 9.278351 | 25.000000 |
| 5/1/20 | 7.554123 | 1.164295 | 3.694458 | 0.000000 | 11.111111 | 4.166667 | 2.348690 | 3.969022 | 0.177357 | 0.511261 | ... | 3.596918 | 0.777605 | 2.305051 | 0.600601 | 0.000000 | 2.616279 | 0.0 | 16.666667 | 2.830189 | 0.000000 |

10 rows × 187 columns
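The growth-rate formula above, (increase on the day / previous day's count) * 100, can also be written directly with `diff()` and `shift()`. This is only a sketch on made-up numbers (`confirmed_toy` is my own toy data), meant to show that the formula and the code line up.

```python
import pandas as pd

# Made-up cumulative counts.
confirmed_toy = pd.DataFrame(
    {"A": [10, 15, 30], "B": [4, 4, 5]},
    index=["4/29/20", "4/30/20", "5/1/20"],
)

# diff()   -> increase on the day
# shift(1) -> cumulative count on the previous day
growth_rate_toy = confirmed_toy.diff() / confirmed_toy.shift(1) * 100

print(growth_rate_toy)
```

Unlike the loop version, the first row here is NaN because there is no previous day to compare with. Small counts also produce large percentages, which is how a value like Yemen's 500% on 4/29 (1 to 6 cases) comes about.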

By the way, confirmed is a cumulative count, so let's work out the number of currently active cases (Active). Subtracting deaths and recovered from confirmed should give the number of people who are currently infected.

``````active_cases = confirmed.copy()

for day in range(0,len(confirmed)):
    active_cases.iloc[day] = confirmed.iloc[day] - deaths.iloc[day] - recovered.iloc[day]
``````
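Since the three tables share the same dates and countries, the subtraction can also be done on whole DataFrames at once. Here is a minimal sketch with made-up numbers (all three toy tables are my own assumption).

```python
import pandas as pd

dates = ["4/29/20", "4/30/20", "5/1/20"]

# Made-up cumulative counts with identical index and columns.
confirmed_toy = pd.DataFrame({"A": [100, 120, 150], "B": [10, 10, 12]}, index=dates)
deaths_toy = pd.DataFrame({"A": [5, 6, 8], "B": [0, 1, 1]}, index=dates)
recovered_toy = pd.DataFrame({"A": [20, 30, 50], "B": [2, 4, 6]}, index=dates)

# pandas aligns on row and column labels, so this subtracts cell by cell.
active_toy = confirmed_toy - deaths_toy - recovered_toy

print(active_toy)
```

One thing to keep in mind: because pandas aligns on labels, a country or date that is missing from one of the tables would show up as NaN in the result rather than raising an error.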

Next, let's use active_cases to look at the growth rate again, this time for active cases. This should show whether the outbreak is likely to settle down.

``````overall_growth_rate = confirmed.copy()

# Start from day 1: at day 0, iloc[day-1] would wrap around to the last row.
for day in range(1,len(confirmed)):
    overall_growth_rate.iloc[day] = ((active_cases.iloc[day] - active_cases.iloc[day-1]) / active_cases.iloc[day-1]) * 100
``````
``````overall_growth_rate.tail(10)
``````
| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4/22/20 | 7.064018 | 5.462185 | 2.920284 | -5.276382 | 6.250000 | -15.384615 | 3.718200 | 6.250000 | -12.214551 | -9.498681 | ... | 3.270797 | -1.895735 | -4.258555 | -1.265823 | -13.461538 | 2.046036 | 0.000000 | 0.000000 | 12.500000 | -4.347826 |
| 4/23/20 | 9.072165 | 0.000000 | -4.524540 | -6.366048 | 0.000000 | 0.000000 | 10.896226 | 2.941176 | -6.836056 | -9.750567 | ... | 3.411790 | -7.729469 | -5.480540 | 12.179487 | -2.222222 | -3.759398 | -83.333333 | 0.000000 | 0.000000 | 0.000000 |
| 4/24/20 | 5.860113 | 2.390438 | 4.738956 | -1.699717 | 0.000000 | -9.090909 | 4.423650 | 0.119048 | -5.064935 | -4.199569 | ... | 3.743980 | -4.712042 | -1.260504 | 0.571429 | 13.636364 | 1.041667 | 0.000000 | -100.000000 | 22.222222 | 4.545455 |
| 4/25/20 | 9.642857 | 9.727626 | 4.141104 | 2.017291 | 0.000000 | 0.000000 | 4.480652 | 0.594530 | -15.321477 | -5.994755 | ... | 3.332975 | 16.483516 | -2.382979 | 2.840909 | -10.000000 | -36.082474 | 0.000000 | NaN | 0.000000 | 8.695652 |
| 4/26/20 | 3.745928 | 2.127660 | 6.701031 | 0.000000 | 5.882353 | 0.000000 | 1.091618 | 4.609929 | -11.954766 | -4.304504 | ... | 3.232666 | 1.886792 | -6.538797 | -1.657459 | 0.000000 | 3.629032 | 0.000000 | NaN | -2.272727 | 0.000000 |
| 4/27/20 | 11.930926 | -0.694444 | 5.383023 | -10.169492 | 5.555556 | 0.000000 | 2.815272 | 5.197740 | -3.669725 | -1.582674 | ... | 3.051680 | 1.388889 | -6.343284 | -0.561798 | 0.000000 | 0.000000 | 0.000000 | NaN | 0.000000 | -8.000000 |
| 4/28/20 | 8.134642 | 1.048951 | 2.226588 | -4.402516 | 0.000000 | 0.000000 | 3.450863 | 4.296455 | -5.714286 | -6.559458 | ... | 2.318102 | -1.369863 | -6.474104 | 0.000000 | 6.666667 | 5.058366 | 0.000000 | NaN | 16.279070 | 0.000000 |
| 4/29/20 | 5.512322 | -2.768166 | 9.032671 | -8.552632 | -5.263158 | 0.000000 | 4.387237 | 3.192585 | -4.444444 | -7.472826 | ... | 2.386758 | -6.018519 | -4.472843 | 1.129944 | 0.000000 | 0.370370 | 0.000000 | inf | -20.000000 | 0.000000 |
| 4/30/20 | 13.521819 | -3.202847 | 4.406580 | -15.467626 | 0.000000 | 0.000000 | 2.605071 | 10.279441 | -1.585624 | -4.013705 | ... | 3.845988 | 2.955665 | 0.000000 | -2.234637 | 6.250000 | -1.845018 | 0.000000 | -40.000000 | 20.000000 | 34.782609 |
| 5/1/20 | 5.955604 | -3.308824 | 5.796286 | -0.425532 | -5.555556 | -30.000000 | 2.064997 | 2.986425 | -2.255639 | -6.578276 | ... | 3.750518 | -6.220096 | -3.567447 | 1.142857 | 0.000000 | 3.383459 | 0.000000 | 33.333333 | -33.333333 | 0.000000 |

10 rows × 187 columns
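The Yemen column above contains NaN and inf. These appear when the previous day's active count is zero: 0/0 gives NaN and a positive number divided by 0 gives inf. The sketch below reproduces that pattern on a made-up series (`active_toy` is my own toy data) and shows one possible way, not the video's method, to clean the values before averaging.

```python
import numpy as np
import pandas as pd

# Made-up active-case counts that drop to zero and come back.
active_toy = pd.Series(
    [1, 0, 0, 5, 3],
    index=["4/23/20", "4/24/20", "4/25/20", "4/26/20", "4/27/20"],
)

rate = active_toy.diff() / active_toy.shift(1) * 100
# rate is: NaN (no previous day), -100, NaN (0/0), inf (5/0), -40

# Treat inf as missing so that it does not dominate a mean.
clean = rate.replace([np.inf, -np.inf], np.nan)

# mean() skips NaN by default, so only -100 and -40 are averaged.
print(clean.mean())
```

Whether dropping these days is appropriate depends on what you want to measure, so treat this as one option among several.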

## Growth rate of active cases over the last 10 days (2020/04/22 to 05/01) (China, Italy, US, Japan)

• This part goes slightly beyond the original video and adds some analysis of my own.

First, China, where the outbreak was first reported, seems to have settled down recently, so let's look at China's data for the last 10 days.

``````overall_growth_rate['China'].tail(10)
``````
``````4/22/20   -3.314528
4/23/20   -7.731583
4/24/20   -8.774704
4/25/20   -4.852686
4/26/20   -9.107468
4/27/20   -9.118236
4/28/20   -2.866593
4/29/20   -5.448354
4/30/20   -4.441777
5/1/20    -5.904523
Name: China, dtype: float64
``````

The growth rate is negative, which means the number of active cases is declining.

``````overall_growth_rate['Italy'].tail(10)
``````
``````4/22/20   -0.009284
4/23/20   -0.790165
4/24/20   -0.300427
4/25/20   -0.638336
4/26/20    0.241859
4/27/20   -0.273319
4/28/20   -0.574599
4/29/20   -0.520888
4/30/20   -2.967790
5/1/20    -0.598714
Name: Italy, dtype: float64
``````

On most days the decrease is less than 1%, so the decline is slight, but apart from a small uptick on 4/26 the number of active cases is not growing.

``````overall_growth_rate['US'].tail(10)
``````
``````4/22/20    3.470050
4/23/20    3.307839
4/24/20    2.102556
4/25/20    3.874078
4/26/20    2.536775
4/27/20    2.064644
4/28/20    2.166569
4/29/20    2.377575
4/30/20   -0.668941
5/1/20     2.583283
Name: US, dtype: float64
``````

The US is still increasing by a few percent per day.

``````overall_growth_rate['Japan'].tail(10)
``````
``````4/22/20    2.512198
4/23/20    6.794937
4/24/20    3.868765
4/25/20    2.382691
4/26/20    0.401248
4/27/20    5.408526
4/28/20   -3.589182
4/29/20   -2.875120
4/30/20    0.755803
5/1/20    -2.884444
Name: Japan, dtype: float64
``````

In Japan, the rate has turned negative on some recent days, but overall active cases are still increasing.

While we are at it, let's look at Japan's average growth rate over the last 10 days.

``````overall_growth_rate['Japan'].tail(10).mean()
``````
``````1.277542288600591
``````

It's about 1.3%, so active cases are still increasing. That said, the number of infected people in Japan is still relatively small, so the spread can be said to be fairly well contained.

From here, we work toward visualization.

Let's look at the mortality rate first. The mortality rate matters because it indicates how severe the outbreak is in each region.

We build the mortality DataFrame in the same way as before. The mortality rate is deaths / confirmed cases, expressed as a percentage.

``````death_rate = confirmed.copy()
``````
``````for day in range(0,len(confirmed)):
    death_rate.iloc[day] = (deaths.iloc[day] / confirmed.iloc[day]) * 100
``````
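As a small illustration of the mortality formula, here is the same calculation on made-up numbers (both toy tables are my own assumption). It also shows what happens on days before a country has any confirmed cases: 0 / 0 comes out as NaN rather than 0.

```python
import pandas as pd

dates = ["d1", "d2", "d3"]  # placeholder labels, not real dates

# Made-up cumulative counts; country A has no cases on the first day.
confirmed_toy = pd.DataFrame({"A": [0, 50, 200], "B": [10, 10, 40]}, index=dates)
deaths_toy = pd.DataFrame({"A": [0, 1, 10], "B": [0, 1, 1]}, index=dates)

# deaths / confirmed, as a percentage.
death_rate_toy = deaths_toy / confirmed_toy * 100

print(death_rate_toy)
```

Those NaN values are simply skipped when the rates are plotted, so no special handling is needed for the charts.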

Next, let's estimate the number of hospital beds needed (and, conversely, how likely hospitals are to run out). We use a hospitalization rate, hospitalization_rate_estimate, which is the share of infected people who need hospital care. I don't know the correct figure, so I use a placeholder value (0.05 here). You can change it to any number you like. This lecture focuses on analysis and calculation methods, so we leave accuracy aside.

• In other words, 5% of people who test positive are assumed to need hospitalization, and the remaining 95% are assumed not to need a hospital bed even though they are positive.
``````hospitalization_rate_estimate = 0.05
``````
``````hospitalization_needed = confirmed.copy()
``````
``````for day in range(0,len(confirmed)):
    hospitalization_needed.iloc[day] = active_cases.iloc[day] * hospitalization_rate_estimate
``````
``````hospitalization_needed.tail()
``````
| Country/Region | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | United Kingdom | Uruguay | Uzbekistan | Venezuela | Vietnam | West Bank and Gaza | Western Sahara | Yemen | Zambia | Zimbabwe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4/27/20 | 71.30 | 14.30 | 76.35 | 15.90 | 0.95 | 0.50 | 133.30 | 46.55 | 52.50 | 118.15 | ... | 6654.15 | 10.95 | 50.20 | 8.85 | 2.25 | 12.85 | 0.05 | 0.00 | 2.15 | 1.15 |
| 4/28/20 | 77.10 | 14.45 | 78.05 | 15.20 | 0.95 | 0.50 | 137.90 | 48.55 | 49.50 | 110.40 | ... | 6808.40 | 10.80 | 46.95 | 8.85 | 2.40 | 13.50 | 0.05 | 0.00 | 2.50 | 1.15 |
| 4/29/20 | 81.35 | 14.05 | 85.10 | 13.90 | 0.90 | 0.50 | 143.95 | 50.10 | 47.30 | 102.15 | ... | 6970.90 | 10.15 | 44.85 | 8.95 | 2.40 | 13.55 | 0.05 | 0.25 | 2.00 | 1.15 |
| 4/30/20 | 92.35 | 13.60 | 88.85 | 11.75 | 0.90 | 0.50 | 147.70 | 55.25 | 46.55 | 98.05 | ... | 7239.00 | 10.45 | 44.85 | 8.75 | 2.55 | 13.30 | 0.05 | 0.15 | 2.40 | 1.55 |
| 5/1/20 | 97.85 | 13.15 | 94.00 | 11.70 | 0.85 | 0.35 | 150.75 | 56.90 | 45.50 | 91.60 | ... | 7510.50 | 9.80 | 43.25 | 8.85 | 2.55 | 13.75 | 0.05 | 0.20 | 1.60 | 1.55 |

5 rows × 187 columns
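The text mentions checking whether beds might run out, which the notebook itself does not compute. As a rough sketch of how that could look, the example below compares estimated bed demand with a bed capacity per country. Everything here is an assumption for illustration: `active_toy` is made-up data and `bed_capacity` holds hypothetical numbers, not real hospital figures.

```python
import pandas as pd

hospitalization_rate_estimate = 0.05  # same placeholder value as in the article

# Made-up active-case counts.
active_toy = pd.DataFrame(
    {"A": [1000, 4000, 9000], "B": [200, 300, 250]},
    index=["4/29/20", "4/30/20", "5/1/20"],
)

# Hypothetical bed capacity per country (NOT real figures).
bed_capacity = pd.Series({"A": 300, "B": 50})

beds_needed = active_toy * hospitalization_rate_estimate

# Subtract each country's capacity column by column; negative values mean
# spare capacity, so clip them to 0 to keep only the shortfall.
shortfall = beds_needed.sub(bed_capacity, axis="columns").clip(lower=0)

print(shortfall)
```

In this toy case only country A runs short, and only on the last day (450 beds needed against 300 available).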

Looking at a single country on its own, it is hard to tell how serious the situation is, so let's first look at the average over the last 5 days.

``````hospitalization_needed.tail().mean().mean()
``````
``````532.5691978609626
``````

This is the average number of beds needed per country over the last 5 days. Of course there is a lot of variation between countries, so it is not an ideal reference value, but I will use it for now. Let's take a look at Italy's average over the last 5 days.

``````hospitalization_needed['Italy'].tail().mean()
``````
``````5181.6900000000005
``````

In other words, Italy's figure is roughly 10 times the world average. That is pretty serious.

Now let's visualize the data. There are too many countries to plot at once, so let's pick a few. Here I selected Italy, the US, China, Japan, Russia, and Spain.

``````countries = ['Italy','US',"China","Japan","Russia","Spain"]
``````
``````ax = plt.subplot()
ax.set_facecolor("black")
ax.figure.set_facecolor("#121212")
ax.tick_params(axis="x",colors="white")
ax.tick_params(axis="y",colors="white")
ax.set_title("covid-19 confirmed by countries",color="white")

for country in countries:
    confirmed[country].plot(label=country)
plt.legend(loc="upper left")
plt.show()
``````

The number of infected people in the US has increased sharply since the end of March. Let's look at the number of deaths as well.

The shape of the graph is much the same, so the number of deaths appears to track the number of infected people.

Next, let's plot the growth rate of infected people.

This time we use bar charts. Bar charts become hard to read when too many series overlap in one figure, so each country is displayed separately.

``````for country in countries:
    ax = plt.subplot()
    ax.set_facecolor("black")
    ax.figure.set_facecolor("#121212")
    ax.tick_params(axis="x",colors="white")
    ax.tick_params(axis="y",colors="white")
    ax.set_title(f"covid-19 confirmed growth rate {country}",color="white")
    growth_rate[country].plot.bar()
    plt.show()
``````

In the same way, let's look at the number of deaths and the mortality rate. (Note: the bar charts above show the growth rate of infections, while the ones here show the mortality rate.)

``````ax = plt.subplot()
ax.set_facecolor("black")
ax.figure.set_facecolor("#121212")
ax.tick_params(axis="x",colors="white")
ax.tick_params(axis="y",colors="white")
ax.set_title("covid-19 deaths by countries",color="white")

for country in countries:
    deaths[country].plot(label=country)
plt.legend(loc="upper left")
plt.show()
``````

``````for country in countries:
    ax = plt.subplot()
    ax.set_facecolor("black")
    ax.figure.set_facecolor("#121212")
    ax.tick_params(axis="x",colors="white")
    ax.tick_params(axis="y",colors="white")
    ax.set_title(f"covid-19 death rate {country}",color="white")
    death_rate[country].plot.bar()
    plt.show()
``````

You can see that the mortality rate varies by country.
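The dark-theme styling lines are repeated in every plotting cell above. One way to cut down the repetition is to move them into a small helper. This is just a sketch: `dark_axes` is a hypothetical helper name of my own, the data is made up, and I select the non-interactive "Agg" backend only so the snippet also runs outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd


def dark_axes(title):
    """Create an Axes with the dark styling used in this article (hypothetical helper)."""
    fig, ax = plt.subplots()
    ax.set_facecolor("black")
    fig.set_facecolor("#121212")
    ax.tick_params(axis="both", colors="white")
    ax.set_title(title, color="white")
    return ax


# Made-up cumulative counts for two countries.
toy = pd.DataFrame(
    {"Italy": [1, 2, 4], "Japan": [0, 1, 1]},
    index=["4/29/20", "4/30/20", "5/1/20"],
)

ax = dark_axes("covid-19 confirmed by countries")
for country in toy.columns:
    toy[country].plot(ax=ax, label=country)
ax.legend(loc="upper left")
```

In a notebook you would finish with `plt.show()` as in the cells above. Passing `ax=ax` to `plot` makes it explicit which figure each line is drawn on.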

Finally, let's move on to simulating the effects of the coronavirus in the future. As a tentative value, let's assume that the number of infected people increases by 1% on a daily basis.

``````simulated_growth_rate = 0.01
``````

Now add the future dates needed for the forecast. The pd.date_range function generates a sequence of dates from a start date and a number of periods. The last date in our data is 5/1/20, so we generate 40 days starting from the next day.

``````dates = pd.date_range(start="05/02/2020",periods=40,freq='D')
dates = pd.Series(dates)
dates = dates.dt.strftime("%m/%d/%Y")
``````
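One small thing I noticed: the format "%m/%d/%Y" produces labels like 05/02/2020, while the existing index uses labels like 5/1/20. This does not affect the calculation, but if you want the new labels to match the existing ones, something like the following sketch could be used (the `labels` variable is my own addition, not from the video).

```python
import pandas as pd

future = pd.date_range(start="2020-05-02", periods=40, freq="D")

# Build "M/D/YY" labels without zero padding, matching labels such as "5/1/20".
labels = [f"{d.month}/{d.day}/{d.strftime('%y')}" for d in future]

print(labels[0], labels[-1])
```

I build the string by hand because the strftime codes for non-padded month and day differ between platforms.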
``````simulated = confirmed.copy()
# DataFrame.append was removed in pandas 2.0, so extend the index with
# reindex instead; the 40 new rows start out as NaN.
simulated = simulated.reindex(simulated.index.tolist() + dates.tolist())

# Fill each future day from the previous day's value.
for day in range(len(confirmed),len(confirmed)+40):
    simulated.iloc[day] = simulated.iloc[day-1] * (1 + simulated_growth_rate)

ax = plt.subplot()
ax.set_facecolor("black")
ax.figure.set_facecolor("#121212")
ax.tick_params(axis="x",colors="white")
ax.tick_params(axis="y",colors="white")
ax.set_title("covid-19 future for Japan",color="white")
simulated['Japan'].plot()
plt.show()
``````
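Because the assumed growth rate is constant, the day-by-day loop is just compound growth: after n days the value is the last observed value times (1 + rate) to the power n. The sketch below checks this on a made-up starting value (`last_value` is an arbitrary number I chose, not taken from the data).

```python
simulated_growth_rate = 0.01
days = 40
last_value = 1000.0  # made-up starting value

# Day-by-day version, as in the loop above.
loop_value = last_value
for _ in range(days):
    loop_value *= 1 + simulated_growth_rate

# Closed-form version: compound growth over `days` days.
closed_form = last_value * (1 + simulated_growth_rate) ** days

print(loop_value, closed_form)
```

Both give roughly 1489, so a constant 1% daily increase adds up to about 49% over 40 days.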

• As a reminder, these numbers are based on a tentative growth rate (a constant 1% daily increase). This is not a rigorous simulation, so don't take the result at face value.

That's all. There are some differences from the original video, but I think I got a rough idea of the flow of a data analysis. Please take a look at the original video. At the end of the video, its creator, NeuralNine, emphasized that this analysis uses tentative numbers, so you shouldn't take it too seriously. He said that what matters is what each of us should do.