[PYTHON] Released a graph of COVID-19 cases by municipality with streamlit and heroku

About me

I work at an accounting firm. It has been three years since I started Python as a hobby at the age of 35. Recently I have gradually become able to build web applications, which is a lot of fun and something I am grateful for. I am still inexperienced, but I wrote this article in the hope that it will be useful to someone.

Introduction

Until recently, the number of new COVID-19 cases in Nagano Prefecture was low, but it has recently increased considerably. When I go out, I am a little worried about where it is safe to go. I then found that CSV data on COVID-19 cases is published on the Nagano Prefecture website, so I made a site that shows the data as graphs. Click here for the finished product → covid 19 in nagano

Overall image

Screen Shot 2021-01-20 at 9.13.14.png

Procedure

Table of contents

  1. Data confirmation
  2. Create an application with streamlit
  3. Deploy to heroku
  4. Settings to wake up heroku with GAS

Extra: Continuous management

1. Data confirmation

The CSV at the link below lists the COVID-19 cases in Nagano Prefecture. A quick way to inspect its contents is Google Colaboratory.

https://www.pref.nagano.lg.jp/hoken-shippei/kenko/kenko/kansensho/joho/documents/200000_nagano_covid19_patients.csv

Since the end goal is a streamlit app, the Python code written here for data analysis is reused later in the streamlit file.

python


import pandas as pd

python


df = pd.read_csv('https://www.pref.nagano.lg.jp/hoken-shippei/kenko/kenko/kansensho/joho/documents/200000_nagano_covid19_patients.csv', encoding='cp932', header=1)
df.head()

Screen Shot 2021-01-20 at 9.01.27.png

The CSV has many columns, so keep only the ones we need, rename them to something more descriptive, and convert the date column to datetime.

python


df.columns

python


df = df[['No', 'Case confirmation_date', 'patient_residence', 'Remarks']]
df.rename(columns={'Case confirmation_date': 'date', 'patient_residence': 'Municipalities'}, inplace=True)
df['date'] = pd.to_datetime(df['date'])
df.head(10)

Screen Shot 2021-01-20 at 9.25.05.png

For someone who lives in, say, Tokyo and whose case was confirmed while visiting family in Nagano, the place of residence is listed as Tokyo and the name of the municipality in the prefecture is given in the remarks. For those rows, replace the municipality with the string that follows "Homecoming:" in the remarks.

python


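def change_location(x):
    # If the remarks start with the homecoming prefix, use the text after it as the municipality.
    # The prefix must match the text actually used in the Remarks column.
    remarks = str(x['Remarks'])
    if remarks.startswith('Homecoming:'):
        x['Municipalities'] = remarks.replace('Homecoming:', '')
    return x
df = df.apply(change_location, axis=1)
df.head(10)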
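
Row-wise apply is fine for a CSV of this size, but the same conversion can also be written without a Python-level loop. The following is only a minimal sketch on made-up sample rows (the values are illustrative and not taken from the real CSV), and it assumes the prefix in Remarks is exactly 'Homecoming:'.

```python
import pandas as pd

PREFIX = 'Homecoming:'  # assumed prefix; adjust to the text actually used in the CSV

# Hypothetical sample rows, for illustration only
sample = pd.DataFrame({
    'Municipalities': ['Tokyo', 'Nagano city', 'Osaka'],
    'Remarks': ['Homecoming:Ueda City', None, 'Other note'],
})

# Treat missing remarks as empty strings so the string methods are safe
remarks = sample['Remarks'].fillna('').astype(str)
is_homecoming = remarks.str.startswith(PREFIX)

# Replace the municipality only on rows whose remarks start with the prefix
sample.loc[is_homecoming, 'Municipalities'] = remarks[is_homecoming].str[len(PREFIX):]

print(sample['Municipalities'].tolist())
```

Only the first row changes here, because it is the only one whose remarks begin with the prefix.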

Screen Shot 2021-01-20 at 9.29.00.png

Looking at the converted municipalities, some values are not municipality names at all. There is no clean way around this, so I make a list of municipalities by hand and keep only the values that appear in it. Everything else is converted to "Other". Some municipalities also appear under several spellings, such as 'Minamiminowa Village', 'Minamiminowa Village, Kamiina District', and 'Kamiina-gun\r\n Minamiminowa Village', so these are unified into one. Municipalities that have not yet appeared in the data are not in this list, so it is necessary to check occasionally for new ones and update the list. This part is a little tedious. If you have a better idea, please let me know.

python


df['Municipalities'].unique()

python


towns = ['Nagano city', 'Yamanouchi Town',
       'Ueda City', 'Matsumoto', 'Chikuhoku Village', 'Azumino City', 'Sakuho Town', 'Suwa City', 'Suzaka City', 'Minamiminowa Village',
       'Komoro City', 'Iida City', 'Nakano City', 'Karuizawa Town', 'Miyota Town', 'Sakaki Town', 'Omachi City', 'Okaya City',
       'Ikusaka Village', 'Saku City', 'Tomi City', 'Chikuma City', 'Nagawa Town', 'Chino City', 'Aoki Village',
       'Haramura', 'Iiyama City', 'Shinano Town', 'Fujimi Town', 'Shimosuwa Town', 'Ina City', 'Sakae Village',
       'Kijimadaira Village', 'Obuse Town', 'Tateshina Town', 'Miyada village', 'Shiojiri City', 'Minamiminowa Village, Kamiina District',
       'Kawakami Village, Minamisaku District', 'Komagane City', 'Nozawa Onsen Village', 'Kiso Town', 'Iizuna Town', 'Iijima Town', 'Tatsuno Town',
       'Nagiso Town', 'Hakuba Village', 'Takayama Village', 'Minowa Town', 'Otari Village', 'Agematsu Town',
       'Tenryu Village', 'Takamori Town', 'Nakagawa Village', 'Asahi Village', 'Yamagata Village', 'Ikeda Town', 'Shimojo Village',
       'Miyota Town, Kitasaku District', 'Kamiina-gun\r\n Minamiminowa Village', 'Shimotakai District\r\n Nozawa Onsen Village', 'Anan Town',
       'Komagane City', 'Ogawa Village', 'Takagi Village', 'Matsukawa Town']

def change_towns(x):
    if x['Municipalities'] not in towns:
        x['Municipalities'] = 'Other'
    return x
df = df.apply(change_towns, axis=1)

df['Municipalities'] = df['Municipalities'].str.replace('Minamiminowa Village, Kamiina District', 'Minamiminowa Village')
df['Municipalities'] = df['Municipalities'].str.replace('Kamiina-gun\r\n Minamiminowa Village', 'Minamiminowa Village')
df['Municipalities'] = df['Municipalities'].str.replace('Shimotakai District\r\n Nozawa Onsen Village', 'Nozawa Onsen Village')
df['Municipalities'] = df['Municipalities'].str.replace('Kawakami Village, Minamisaku District', 'Kawakami Village')
df['Municipalities'] = df['Municipalities'].str.replace('Miyota Town, Kitasaku District', 'Miyota Town')
df['Municipalities'].unique()
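
One possible way to keep the variant spellings and the "Other" rule in a single place is a lookup table. This is only a sketch, not the method used in the app: ALIASES, CANONICAL, and canonical_town are names invented for this illustration, and CANONICAL is a shortened stand-in for the full towns list.

```python
# Variant spellings seen in the data, mapped to one canonical name
ALIASES = {
    'Minamiminowa Village, Kamiina District': 'Minamiminowa Village',
    'Kamiina-gun\r\n Minamiminowa Village': 'Minamiminowa Village',
    'Shimotakai District\r\n Nozawa Onsen Village': 'Nozawa Onsen Village',
    'Kawakami Village, Minamisaku District': 'Kawakami Village',
    'Miyota Town, Kitasaku District': 'Miyota Town',
}

# Shortened canonical list for this example only (hypothetical subset)
CANONICAL = {'Nagano city', 'Minamiminowa Village', 'Nozawa Onsen Village',
             'Kawakami Village', 'Miyota Town'}


def canonical_town(name):
    """Return the canonical municipality name, or 'Other' if it is not listed."""
    name = ALIASES.get(name, name)  # unify variant spellings first
    return name if name in CANONICAL else 'Other'


print(canonical_town('Kamiina-gun\r\n Minamiminowa Village'))
print(canonical_town('Tokyo'))
```

With a DataFrame, the function could be applied as df['Municipalities'].map(canonical_town). With this layout, a new spelling needs only one new entry in ALIASES.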

Screen Shot 2021-01-20 at 9.34.00.png Screen Shot 2021-01-20 at 9.34.25.png

With pivot_table, the number of cases per day can be totaled for each municipality.

python


pivot_daily = df.pivot_table(index='date', columns='Municipalities', values='No', aggfunc=len, dropna=False).fillna(0)
pivot_daily.tail()

Screen Shot 2021-01-20 at 9.39.41.png

The cumulative total is obtained with cumsum.

python


pivot_daily_cum = df.pivot_table(index='date', columns='Municipalities', values='No', aggfunc=len, dropna=False).fillna(0).cumsum()
pivot_daily_cum.tail()

Screen Shot 2021-01-20 at 9.42.36.png
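
To make the two aggregation steps above concrete, here is a small self-contained sketch on hypothetical rows (one row per confirmed case). The sample values are invented, and aggfunc='count' is used here in place of len; on this data both simply count rows.

```python
import pandas as pd

# Hypothetical rows: one row per confirmed case
cases = pd.DataFrame({
    'No': [1, 2, 3, 4],
    'date': pd.to_datetime(['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-03']),
    'Municipalities': ['Nagano city', 'Ueda City', 'Nagano city', 'Nagano city'],
})

# Count cases per day and municipality; missing combinations become 0
daily = cases.pivot_table(index='date', columns='Municipalities',
                          values='No', aggfunc='count').fillna(0)

# Running total down the date axis
cumulative = daily.cumsum()

print(daily)
print(cumulative)
```

Each column of daily holds one municipality's count per day, and cumulative accumulates those counts from the first date onward.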

2. Create an application with streamlit

Now that the data is ready, use streamlit to create the actual application. Install streamlit and create a Python file. Many people online explain how to use streamlit in an easy-to-understand way, so I will leave the details to them. The official streamlit tutorial is also very thorough, and working through it is enough to get a feel for the library.

view.py


import streamlit as st
import plotly.express as px
import pandas as pd
import datetime

df = pd.read_csv('https://www.pref.nagano.lg.jp/hoken-shippei/kenko/kenko/kansensho/joho/documents/200000_nagano_covid19_patients.csv', encoding='cp932', header=1)
df = df[['No', 'Case confirmation_date', 'patient_residence', 'Remarks']]

# Rename columns and convert the date column to datetime
df.rename(columns={'Case confirmation_date':'date', 'patient_residence':'Municipalities'}, inplace=True)
df['date'] = pd.to_datetime(df['date'])

# For rows whose remarks start with "Homecoming:", use the text after it as the municipality
def change_location(x):
    remarks = str(x['Remarks'])
    if remarks.startswith('Homecoming:'):
        x['Municipalities'] = remarks.replace('Homecoming:', '')
    return x

df = df.apply(change_location, axis=1)

# Convert values in the municipality column that are not municipality names to 'Other'
towns = ['Nagano city', 'Yamanouchi Town',
       'Ueda City', 'Matsumoto', 'Chikuhoku Village', 'Azumino City', 'Sakuho Town', 'Suwa City', 'Suzaka City', 'Minamiminowa Village',
       'Komoro City', 'Iida City', 'Nakano City', 'Karuizawa Town', 'Miyota Town', 'Sakaki Town', 'Omachi City', 'Okaya City',
       'Ikusaka Village', 'Saku City', 'Tomi City', 'Chikuma City', 'Nagawa Town', 'Chino City', 'Aoki Village',
       'Haramura', 'Iiyama City', 'Shinano Town', 'Fujimi Town', 'Shimosuwa Town', 'Ina City', 'Sakae Village',
       'Kijimadaira Village', 'Obuse Town', 'Tateshina Town', 'Miyada village', 'Shiojiri City', 'Minamiminowa Village, Kamiina District',
       'Kawakami Village, Minamisaku District', 'Komagane City', 'Nozawa Onsen Village', 'Kiso Town', 'Iizuna Town', 'Iijima Town', 'Tatsuno Town',
       'Nagiso Town', 'Hakuba Village', 'Takayama Village', 'Minowa Town', 'Otari Village', 'Agematsu Town',
       'Tenryu Village', 'Takamori Town', 'Nakagawa Village', 'Asahi Village', 'Yamagata Village', 'Ikeda Town', 'Shimojo Village',
       'Miyota Town, Kitasaku District', 'Kamiina-gun\r\n Minamiminowa Village', 'Shimotakai District\r\n Nozawa Onsen Village', 'Anan Town',
       'Komagane City', 'Ogawa Village', 'Takagi Village', 'Matsukawa Town']

def change_towns(x):
    if x['Municipalities'] not in towns:
        x['Municipalities'] = 'Other'
    return x
df = df.apply(change_towns, axis=1)

df['Municipalities'] = df['Municipalities'].str.replace('Minamiminowa Village, Kamiina District', 'Minamiminowa Village')
df['Municipalities'] = df['Municipalities'].str.replace('Kamiina-gun\r\n Minamiminowa Village', 'Minamiminowa Village')
df['Municipalities'] = df['Municipalities'].str.replace('Shimotakai District\r\n Nozawa Onsen Village', 'Nozawa Onsen Village')
df['Municipalities'] = df['Municipalities'].str.replace('Kawakami Village, Minamisaku District', 'Kawakami Village')
df['Municipalities'] = df['Municipalities'].str.replace('Miyota Town, Kitasaku District', 'Miyota Town')

# sidemenu
st.sidebar.markdown(
    '# Covid-19 in Nagano'
)
town_selected = st.sidebar.selectbox(
    "Municipalities", list(df['Municipalities'].unique()), 1  # index 1 ('Nagano city') is selected by default
)
st.sidebar.markdown(
    '"Other" includes values that are not municipality names, such as "Matsumoto Health Center jurisdiction" and "Tokyo".'
)

today = datetime.date.today()
start_date = st.sidebar.date_input('Start date', df['date'].min())
end_date = st.sidebar.date_input('End date', today)
if start_date < end_date:
    st.sidebar.success('OK')
else:
    st.sidebar.error('Error: The end date should be after the start date.')

df = df[df['date'].between(pd.to_datetime(start_date), pd.to_datetime(end_date))]

# body
# Bar chart of the number of occurrences per day
st.markdown(
    '# Number of occurrences by municipality (by date)'
)
st.markdown(
    'Drag on the chart to zoom in.'
)
pivot_daily = df.pivot_table(index='date', columns='Municipalities', values='No', aggfunc=len, dropna=False).fillna(0)
st.write(
    px.bar(pivot_daily, x=pivot_daily.index, y=town_selected)
)
# Area chart of the cumulative number of occurrences
st.markdown(
    '# Number of occurrences by municipality (cumulative)'
)
pivot_daily_cum = df.pivot_table(index='date', columns='Municipalities', values='No', aggfunc=len, dropna=False).fillna(0).cumsum()
st.write(
    px.area(pivot_daily_cum, x=pivot_daily_cum.index, y=town_selected)
)

# Cumulative number of occurrences, compared across all municipalities
st.markdown(
    '# Number of occurrences by municipality (cumulative, comparison of all municipalities)'
)
data_span = st.radio(
    "Aggregation period",
    ('Last 30 days', 'Whole period')
)
if data_span == 'Last 30 days':
    df = df[df['date'] >= str(today - datetime.timedelta(days=30))]

town_cum = pd.DataFrame(df.groupby('Municipalities')['No'].count().sort_values())
town_cum.rename(columns={'No':'Cumulative number of occurrences'}, inplace=True)

st.write(
    px.bar(town_cum, x='Cumulative number of occurrences', y=town_cum.index, orientation='h', height=1500, hover_data=['Cumulative number of occurrences', town_cum.index])
)

In Terminal, cd to the directory that contains view.py and run the following command to start the application.

shell


$ streamlit run view.py
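
A streamlit script runs from top to bottom, so it can be convenient to keep the data logic in plain functions that can be checked without launching the app. The sketch below does this for the date filter used in view.py. The function name filter_by_date and the sample data are assumptions made for illustration.

```python
import datetime

import pandas as pd


def filter_by_date(df, start_date, end_date):
    """Keep rows whose 'date' falls within [start_date, end_date], inclusive."""
    start = pd.to_datetime(start_date)
    end = pd.to_datetime(end_date)
    return df[df['date'].between(start, end)]


# Hypothetical sample data
sample = pd.DataFrame({
    'No': [1, 2, 3],
    'date': pd.to_datetime(['2021-01-01', '2021-01-10', '2021-01-20']),
})

subset = filter_by_date(sample, datetime.date(2021, 1, 5), datetime.date(2021, 1, 20))
print(subset['No'].tolist())
```

In view.py, the values returned by st.sidebar.date_input could be passed to such a function directly, since they are datetime.date objects.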

3. Deploy to heroku

I didn't know how to deploy to heroku at all, so I referred to the following article. → [Easy detonation velocity 2nd] Deploy Streamlit to heroku I also consulted various other sites, but I no longer remember which ones. My apologies.

I don't fully understand every step, but it worked when I followed the procedure below.

・ Prepare the necessary directories and files on your PC

Screen Shot 2021-01-20 at 10.25.48.png

Procfile


web: sh setup.sh && streamlit run view.py

requirements.txt


streamlit==0.74.1
plotly==4.14.3
pandas

setup.sh


mkdir -p ~/.streamlit/

echo "\
[general]\n\
email = \"[email protected]\"\n\
" > ~/.streamlit/credentials.toml

echo "\
[server]\n\
headless = true\n\
enableCORS=false\n\
port = $PORT\n\
" > ~/.streamlit/config.toml

・ Create an account on heroku

・ Launch Terminal and execute the following

$ heroku login
$ heroku create appname
$ git init
$ heroku git:remote -a appname
$ heroku buildpacks:set heroku/python
$ git add .
$ git commit -m "1st commit"
$ git push heroku master
$ heroku open

If all goes well, you should be able to see the graph page built from the CSV. I apologize if I have made a mistake somewhere.

4. Settings to wake up heroku with GAS

Heroku automatically puts the app to sleep after 30 minutes without access, and the next startup then takes a while. To prevent this, I access the app at regular intervals. There seem to be many ways to do this, but I chose GAS (Google Apps Script) because I decided not to register a credit card with heroku. Searching for "GAS heroku" turned up many sites that explain the setup very carefully. Thank you.
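
The GAS script itself is not shown here. For readers who would rather see the idea in Python, the following sketch does the same thing in spirit: it requests the app's page once, and it would have to be run by some scheduler at an interval shorter than 30 minutes. This is not the GAS setup used for the site. The URL is a placeholder, and the function name ping and its opener parameter are assumptions made so the function can be tried without network access.

```python
import urllib.request

# Hypothetical URL; replace it with the address of your own app
APP_URL = 'https://appname.herokuapp.com/'


def ping(url, opener=urllib.request.urlopen, timeout=30):
    """Request the page once and return the HTTP status code."""
    with opener(url, timeout=timeout) as response:
        return response.status


# Example call, to be run from a scheduler of your choice:
# print(ping(APP_URL))
```

The opener argument exists only so that a stand-in can be substituted when trying the function offline.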

Continuous management

As mentioned briefly in the data confirmation step, this project uses a hand-made list of municipalities (towns). If a municipality name that is not in towns appears in the CSV, it is lumped into "Other" and drops out of the per-municipality graphs. It is therefore necessary to check by hand, regularly, whether any municipality names are missing from the list. It's tedious. On 1/20, for example, Matsukawa Town appeared for the first time.

python


import pandas as pd
towns = ['Nagano city', 'Yamanouchi Town',
       'Ueda City', 'Matsumoto', 'Chikuhoku Village', 'Azumino City', 'Sakuho Town', 'Suwa City', 'Suzaka City', 'Minamiminowa Village',
       'Komoro City', 'Iida City', 'Nakano City', 'Karuizawa Town', 'Miyota Town', 'Sakaki Town', 'Omachi City', 'Okaya City',
       'Ikusaka Village', 'Saku City', 'Tomi City', 'Chikuma City', 'Nagawa Town', 'Chino City', 'Aoki Village',
       'Haramura', 'Iiyama City', 'Shinano Town', 'Fujimi Town', 'Shimosuwa Town', 'Ina City', 'Sakae Village',
       'Kijimadaira Village', 'Obuse Town', 'Tateshina Town', 'Miyada village', 'Shiojiri City', 'Minamiminowa Village, Kamiina District',
       'Kawakami Village, Minamisaku District', 'Komagane City', 'Nozawa Onsen Village', 'Kiso Town', 'Iizuna Town', 'Iijima Town', 'Tatsuno Town',
       'Nagiso Town', 'Hakuba Village', 'Takayama Village', 'Minowa Town', 'Otari Village', 'Agematsu Town',
       'Tenryu Village', 'Takamori Town', 'Nakagawa Village', 'Asahi Village', 'Yamagata Village', 'Ikeda Town', 'Shimojo Village',
       'Miyota Town, Kitasaku District', 'Kamiina-gun\r\n Minamiminowa Village', 'Shimotakai District\r\n Nozawa Onsen Village', 'Anan Town',
       'Komagane City', 'Ogawa Village', 'Takagi Village']

df = pd.read_csv('https://www.pref.nagano.lg.jp/hoken-shippei/kenko/kenko/kansensho/joho/documents/200000_nagano_covid19_patients.csv', encoding='cp932', header=1)
df[~df['patient_residence'].isin(towns)]['patient_residence'].unique()

Screen Shot 2021-01-20 at 20.43.51.png

Visually check for missing names
  ↓
Edit view.py locally and push to heroku

Or is it possible to edit directly on heroku? I am doing the former.
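
The check can also be wrapped in a small helper that returns only the names needing attention. This is a sketch with invented sample values, and the function name find_unlisted is hypothetical.

```python
def find_unlisted(residences, known_towns):
    """Return the sorted names that appear in the data but not in the list."""
    return sorted(set(residences) - set(known_towns))


# Hypothetical example values
known = ['Nagano city', 'Ueda City']
in_csv = ['Nagano city', 'Matsukawa Town', 'Ueda City', 'Matsukawa Town', 'Tokyo']

print(find_unlisted(in_csv, known))
```

With the real data this could be called as find_unlisted(df['patient_residence'].dropna(), towns). The result still has to be read by eye, because values such as "Tokyo" are expected to stay out of the list.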

In closing

This time I used the CSV data of Nagano Prefecture, but if other prefectures publish similar data (updated automatically every day), I think this procedure can be applied right away. It isn't much code, but if any part is useful as a reference, I hope you will make graphs for other prefectures. And I hope the world becomes a little more convenient than it is now.

Also, since I am self-taught, I would appreciate it if you could tell me about better approaches or how the code should be written. Thank you. That's all.
