[PYTHON] Plotly Dash on Google Colab

Dash (https://dash.plotly.com/) is a framework for data visualization in Python.

I wondered whether it could be used on Google Colab, and found a good article that already explains how. https://qiita.com/OgawaHideyuki/items/725f4ffd93ffb0d30b6c

So this article is a record of trying it out hands-on, using that article as a reference. The theme is a simple visualization of the number of people infected with COVID-19 in each country, displayed as a map and a time series graph.

**2020-12-27 addendum:** Added the map display.

Environment and resources to use

Use Google Colaboratory.

Various resources
This notebook

Since the map scatter plot uses a Mapbox token and Google Drive, the notebook is divided into two parts: time series and map.

Time series graph display

Preparation and data

First, install the package for using Dash from your Google Colab/Jupyter notebook.

! pip install jupyter_dash
! pip install --upgrade plotly

Import the packages associated with Dash.

import dash 
from jupyter_dash import JupyterDash 
import dash_core_components as dcc 
import dash_html_components as html 
import plotly.express as px
from dash.dependencies import Input, Output

Get the COVID-19 infection data from GitHub. For details on the data, please refer to the following page. https://dodotechno.com/covd-19-visualization/

! wget https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

Load the downloaded CSV into a data frame.

import pandas as pd
df = pd.read_csv("time_series_covid19_confirmed_global.csv")

Aggregate by country, drop the latitude and longitude, transpose so that each country becomes a column, and add a date column.

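# Sum the provinces of each country (numeric_only skips the Province/State text column)
df = df.groupby(['Country/Region'], as_index=False).sum(numeric_only=True)
# Latitude and longitude are not needed for the time series
df.drop(["Lat", "Long"], axis=1, inplace=True)
# Transpose so that each country becomes a column and each date a row
df = df.T
df.columns = df.iloc[0]  # the first row holds the country names
df = df[1:]
# Turn the date index into a regular "date" column
df.reset_index(inplace=True)
df.rename(columns={'index': 'date'}, inplace=True)
df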
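
To make the reshaping easier to follow, here is a minimal sketch on a tiny made-up frame with the same column layout as the CSSE CSV (the numbers are invented for illustration, not real data). It applies the same idea in a slightly different order: drop the columns that are not needed, sum per country, and transpose so that each country becomes a column.

```python
import pandas as pd

# Toy data with the same column layout as the CSSE file (values are made up).
raw = pd.DataFrame({
    "Province/State": ["A", "B", None],
    "Country/Region": ["X", "X", "Y"],
    "Lat": [10.0, 20.0, 30.0],
    "Long": [100.0, 110.0, 120.0],
    "1/22/20": [1, 2, 3],
    "1/23/20": [4, 5, 6],
})

# Drop the text and coordinate columns, then sum the provinces of each country.
by_country = (
    raw.drop(columns=["Province/State", "Lat", "Long"])
       .groupby("Country/Region")
       .sum()
)

# Transpose: dates become rows, countries become columns.
wide = by_country.T.rename_axis("date").reset_index()
wide.columns.name = None  # remove the leftover "Country/Region" axis label
print(wide)
```

The result has one "date" column followed by one column per country, which is the shape that px.line expects in the next step.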

The resulting table looks like this:

image.png

Try to display it in a notebook

First, let's graph the number of infected people in Japan. The horizontal axis is the date and the vertical axis is the cumulative number of infected people. You can see increases that appear to correspond to the first wave (late April), the second wave (early August), and the third wave (November).

px.line(df, x="date", y="Japan")

image.png

Next, let's make any country selectable from a dropdown. You can try the selection yourself in the notebook. If the runtime has stopped, select "Runtime" -> "Run All Cells" from the menu. https://colab.research.google.com/drive/1fUP4818fSsFFFlUHlLGNoTxq8uoL2VAu#scrollTo=Kr-FsvLIpCoN&line=1&uniqifier=1

app = JupyterDash(__name__)

app.layout = html.Div([
  dcc.Dropdown(id="my_dropdown",
    options=[{"value": country, "label": country} for country in df.columns if country != "date"],
    value=["Japan"],
    multi=True
    ),
  dcc.Graph(id="my_graph")
])

@app.callback(Output("my_graph", "figure"), Input("my_dropdown", "value"))
def update_graph(selected_country):
  return px.line(df, x="date", y=selected_country)

app.run_server(mode="inline")
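
Because the dropdown is created with multi=True, the callback receives a list of column names, and that list can be empty (or None) when every selection is cleared. The following is a minimal sketch of one way to guard against that case. normalize_selection and its default argument are hypothetical names introduced here, not part of Dash.

```python
def normalize_selection(selected, default=("Japan",)):
    """Return a non-empty list of column names for plotting.

    `selected` is whatever the dropdown passes to the callback:
    None, a single string, or a list of strings.
    """
    if selected is None:
        return list(default)
    if isinstance(selected, str):
        return [selected]
    selected = list(selected)
    # Fall back to the default when the user cleared every selection.
    return selected if selected else list(default)
```

Inside update_graph, the figure could then be built with px.line(df, x="date", y=normalize_selection(selected_country)).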

From the dropdown, select Japan and Canada to display. It seems that Canada is also on the rise.

image.png

Next, when I add the United States, the numbers are on a completely different scale from Japan. Seeing it on a graph really does have an impact. I hope the vaccines prove effective (though that may differ from person to person).

image.png

Display on map

Let's display a scatter plot on the map using the same data.

Preparation and data

We use a map service called Mapbox, which requires an access token. If you do not have an account, sign up at the following page to get an access token. https://account.mapbox.com/ Signing up requires only an ID, a password, and an email address.

This time we will store the token in Google Drive. If you only run the notebook yourself, you can also embed the token in your code as a string.

Here, as an example, create a text file named mapbox-token.txt containing the Mapbox token and upload it directly under My Drive on Google Drive.

image.png

Mount Google Drive and load the Mapbox token. When you run the mount cell, a URL is displayed. Open that page to get the OAuth token, then copy and paste it into the input field.

from google.colab import drive
drive.mount('/content/drive')

# The with block closes the file automatically; strip() removes a trailing newline
with open('/content/drive/My Drive/mapbox-token.txt', 'r') as f:
    MAPBOX_TOKEN = f.read().strip()
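
As a variation on the token loading above, here is a small sketch that reads the token from a file and falls back to an environment variable when the file is missing. The function name load_token and the environment variable name MAPBOX_TOKEN are assumptions made for this example.

```python
import os
from pathlib import Path


def load_token(path, env_var="MAPBOX_TOKEN"):
    """Read an access token from `path`, or from `env_var` if the file is absent."""
    token_file = Path(path)
    if token_file.is_file():
        # strip() removes the trailing newline that editors often add.
        return token_file.read_text(encoding="utf-8").strip()
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise FileNotFoundError(f"No token found in {path} or ${env_var}")
    return token
```

With this helper, the assignment would become MAPBOX_TOKEN = load_token('/content/drive/My Drive/mapbox-token.txt'). Keeping the token out of the notebook itself also makes it safer to share the notebook.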

Install and import JupyterDash in the same way as for the time series graph.

! pip install jupyter_dash
! pip install --upgrade plotly
import dash 
from jupyter_dash import JupyterDash 
import dash_core_components as dcc 
import dash_html_components as html 
import plotly.express as px
from dash.dependencies import Input, Output

The COVID-19 infection data is downloaded in the same way.

! wget https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

Load the downloaded CSV into a data frame.

import pandas as pd
df_map = pd.read_csv("./time_series_covid19_confirmed_global.csv")
df_map

This time, we only aggregate the data by country, keeping the latitude and longitude and not transposing.

df_map = df_map.groupby(['Country/Region'], as_index=False).sum(numeric_only=True)
df_map
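
One caveat with summing every column: for countries that are split into several provinces, Lat and Long are summed as well, so the marker no longer corresponds to a real location. A possible workaround, sketched here on made-up numbers, is to sum the date columns but average the coordinates. Whether a plain average gives a good enough position is an assumption of this sketch.

```python
import pandas as pd

# Toy data in the CSSE layout (values are made up).
raw = pd.DataFrame({
    "Province/State": ["A", "B", None],
    "Country/Region": ["X", "X", "Y"],
    "Lat": [10.0, 20.0, 30.0],
    "Long": [100.0, 110.0, 120.0],
    "1/22/20": [1, 2, 3],
})

# Sum every date column, but average the coordinates.
how = {col: "sum" for col in raw.columns
       if col not in ("Province/State", "Country/Region")}
how["Lat"] = "mean"
how["Long"] = "mean"

by_country = raw.groupby("Country/Region", as_index=False).agg(how)
print(by_country)
```

The column order (Country/Region, Lat, Long, then the dates) stays the same as with a plain sum, so code that relies on column positions keeps working.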

image.png

Now that we are ready, we will display it on a map.

Display on map

The date and the column used for coloring are selectable from dropdowns. I referred to this article. https://qiita.com/banquet_kuma/items/e02ba60661cf91af37de

app = JupyterDash(__name__)

color_opt = [dict(label=x, value=x) for x in df_map.columns]
del color_opt[2]
del color_opt[1]
date_opt = color_opt.copy()
del date_opt[0]

app.layout = html.Div(
    [
        html.Div(
            [
                html.P(["date:", dcc.Dropdown(id='date', options=date_opt)]),
                html.P(["color:", dcc.Dropdown(id='color', options=color_opt)]),
            ],
            style={"width": "20%", "float": "left"}
        ),
        dcc.Graph(id="graph", style={"width": "80%", "display": "inline-block"}),
    ]
)

@app.callback(Output("graph", "figure"), [Input("date", "value"), Input("color", "value")])
def update_graph(date, color):
  px.set_mapbox_access_token(MAPBOX_TOKEN)
  if not color:
    color = date
  return  px.scatter_mapbox(df_map,
                        lat="Lat",
                        lon="Long",
                        color=color,
                        size=date,
                        size_max=20,
                        zoom=0,
                        center={'lat': 35, 'lon': 135},
                        title="Number of people infected with corona in each country",
                        color_continuous_scale=px.colors.diverging.BrBG,
                        hover_name=date)

app.run_server(mode="inline")
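
The dropdown options above are built by deleting list entries by position, which depends on the column order of df_map. As an alternative sketch, the options can be selected by column name instead. The helper name build_options and the sample column list are hypothetical.

```python
def build_options(columns, exclude=("Lat", "Long")):
    """Build Dash dropdown options from column names, skipping `exclude`."""
    return [{"label": col, "value": col} for col in columns if col not in exclude]


# Sample column names in the order produced by the groupby step.
columns = ["Country/Region", "Lat", "Long", "1/22/20", "1/23/20"]

# Color can follow the country name or any date column.
color_opt = build_options(columns)
# Marker size needs a numeric column, so the country name is excluded as well.
date_opt = build_options(columns, exclude=("Country/Region", "Lat", "Long"))
```

In the notebook, list(df_map.columns) would take the place of the sample list.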

When executed, it looks like this. The display is beautiful.

image.png

Select a date to see the cumulative number of infected people at that time. 2020-12-01 looks like this.

image.png

The display is done by a function called plotly.express.scatter_mapbox. I couldn't find a Japanese explanation for the parameters of plotly.express.scatter_mapbox, so I'll give a brief explanation for your reference (if you know a good source, please let me know in the comments).

https://plotly.github.io/plotly.py-docs/generated/plotly.express.scatter_mapbox.html

There are many parameters.
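
As a rough guide to the parameters used in this article, the sketch below collects them as keyword arguments with a short comment on each. The helper name scatter_args is hypothetical, and the comments paraphrase the linked reference rather than replace it.

```python
def scatter_args(date, color=None):
    """Keyword arguments for plotly.express.scatter_mapbox as used in this article."""
    return dict(
        lat="Lat",                       # column holding the latitude of each marker
        lon="Long",                      # column holding the longitude of each marker
        color=color or date,             # column that determines the marker color
        size=date,                       # column that determines the marker size
        size_max=20,                     # size of the largest marker
        zoom=0,                          # initial zoom level (0 is fully zoomed out)
        center={"lat": 35, "lon": 135},  # initial center of the map
        title="Number of people infected with corona in each country",
        hover_name=date,                 # column shown in bold at the top of the hover box
    )
```

The call in the callback could then be written as px.scatter_mapbox(df_map, color_continuous_scale=px.colors.diverging.BrBG, **scatter_args(date, color)), where color_continuous_scale sets the color scale applied when the color column is numeric.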

The remaining challenges

Here are some issues.

In conclusion

A simple visualization is easy to implement, and with Google Colab it is easy to publish on the Internet. Processing the CSV may be the most troublesome part.

I would like to add visualization on a map. (→ 2020-12-27: map display added, though some issues remain.) There still seem to be many things that could be visualized, so I may try more soon.

I found out by running it that the graph does not remain once the Google Colab runtime stops. If you need to publish it, it may be better to deploy it to Heroku, or to use other methods in the notebook (matplotlib, plotly, etc.).

Related / reference URL

Dash related
Data related
Mapbox display related
