[PYTHON] [Visualization with folium] I feel that FamilyMart has increased too much in recent years.

This is the Day 21 article of the MYJLab Advent Calendar 2019. I'm Thatchy, and I'm in charge today. (Sorry for being late.)

This time

To do

Recently, in my hometown, FamilyMart stores have started appearing on both sides of the same road. To visualize this phenomenon, I will turn the increase in the number of FamilyMart stores into a time-series heat map by prefecture.

What to use

Leaflet is a JavaScript library that draws nice-looking maps. This time, I will create a time-series heat map using folium, a library that lets you use Leaflet from Python.

Draw immediately

Execution environment

Because folium's features and usage change between versions, be sure to use version 0.10.1 for this article.
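
A quick way to confirm the installed version before running the rest of the code is sketched below. It uses only the standard library; the helper name `installed_version` is my own and is not part of folium.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


REQUIRED = "0.10.1"  # the folium version this article was written against
found = installed_version("folium")
if found is None:
    print("folium is not installed; try: pip install folium==" + REQUIRED)
elif found != REQUIRED:
    print("folium " + found + " is installed, but this article assumes " + REQUIRED)
else:
    print("folium " + REQUIRED + " is installed")
```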

#Library load
import pandas as pd
import folium
from folium import plugins

Data preparation

I prepared the data by converting the per-prefecture FamilyMart store counts from 1999 to 2019, published on the site linked here, into a CSV file. It can be downloaded with Python's Requests library. In addition, drawing on a map requires a latitude and longitude for each prefecture. This time, I use the location of each prefectural office as the prefecture's coordinates. That data can be downloaded from here.

#Data on the number of stores
import requests
import io

URL = "https://drive.google.com/uc?id=1-8tppvHwwVJWufYVskTfGz7cCrBIE0SM"
r = requests.get(URL)
famima_data = pd.read_csv(io.BytesIO(r.content))
famima_data.head()
(Screenshot: output of famima_data.head())

Because the stores only spread in recent years, there are quite a lot of missing values between 1999 and 2006.
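
To see how the missing values are distributed, counting NaN per year column is usually enough. The sketch below uses a tiny made-up table in place of the real CSV; the column names and numbers are placeholders, not the actual store counts.

```python
import io

import pandas as pd

# Made-up stand-in for the real CSV (the values are NOT real store counts).
csv_bytes = b"id,name,1999,2000,2001\n0,A,,,5\n1,B,,3,4\n2,C,10,12,15\n"

# Same pattern as the download code: wrap raw bytes so read_csv can treat them as a file.
toy = pd.read_csv(io.BytesIO(csv_bytes))

# Count the missing values in each year column.
year_columns = toy.columns[2:]
missing_per_year = toy[year_columns].isna().sum()
print(missing_per_year)
```

On the real data, the same two lines applied to famima_data should show the counts shrinking toward the later years.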

#Latitude and longitude of the prefectural office
geo_data = pd.read_csv("./data/prefecturalCapital.csv")
geo_data.head()
(Screenshot: output of geo_data.head())

Next, merge the two data frames. I want to merge on id, so I shift the ids in geo_data so that they start at 0 before merging. Missing values are set to 0 for now.

import numpy as np

geo_data.id = geo_data.id - 1
merged_data = pd.merge(famima_data, geo_data[["id", "lat", "lon"]], on=["id"])
merged_data = merged_data.replace(np.nan, 0)
merged_data.head()
(Screenshot: output of merged_data.head())

The basic data is ready.
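
Because the two tables number their ids differently, an inner merge can drop rows without any warning. The following sketch, on made-up data, shows one way to guard against that using the `validate` argument of `pd.merge` plus a row-count check; the frame contents are placeholders.

```python
import pandas as pd

# Made-up stand-ins: store counts use 0-based ids, coordinates use 1-based ids.
counts = pd.DataFrame({"id": [0, 1, 2], "2018": [5.0, None, 7.0]})
coords = pd.DataFrame({"id": [1, 2, 3],
                       "lat": [43.06, 40.82, 39.70],
                       "lon": [141.35, 140.74, 141.15]})

# Shift the coordinate ids so that both tables count from 0.
coords = coords.assign(id=coords["id"] - 1)

# validate="one_to_one" raises MergeError if an id is duplicated on either side.
merged = pd.merge(counts, coords, on="id", how="inner", validate="one_to_one")

# An inner merge drops unmatched ids, so confirm that no row was lost.
assert len(merged) == len(counts), "some ids did not match"

# Treat missing store counts as 0 for now, as in the article.
merged = merged.fillna(0)
print(merged)
```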

Convert to year-over-year increases

This time, I want to visualize the change in the number of stores, so I take the difference between adjacent year columns.

#Get an array of time series column names
time_columns = merged_data.columns[2:23].values

#Take the difference of the store-count columns only and call the result diff_data
merged_data.loc[:, time_columns] = merged_data.loc[:, time_columns].astype(float)
diff_data = merged_data.copy()
diff_data.loc[:, time_columns]  = merged_data.loc[:, time_columns].diff(axis=1)

#The 1999 column has no previous year, so it becomes NaN after diff; drop it
diff_data = diff_data.dropna(axis=1)
time_columns = time_columns[1:]

diff_data.head()
(Screenshot: output of diff_data.head())
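
For reference, here is what `diff(axis=1)` followed by `dropna(axis=1)` does on a single made-up row. The numbers are placeholders chosen so that one year shows an increase and the next a decrease.

```python
import pandas as pd

# One made-up prefecture with store counts for three consecutive years.
toy = pd.DataFrame({"1999": [10.0], "2000": [12.0], "2001": [11.0]})

# axis=1 subtracts the column to the left from each column.
toy_diff = toy.diff(axis=1)
print(toy_diff)

# The first column has nothing to its left, so it is all NaN and gets dropped.
toy_diff = toy_diff.dropna(axis=1)
print(toy_diff)
```

One consequence worth keeping in mind: because missing values were replaced with 0 before taking the difference, the first year that has data for a prefecture shows up as an increase equal to its entire store count.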

Once the differences are computed, apply min-max scaling. A value of exactly 0 causes problems in the folium heat map, so add 1e-4 to every value.

# Scale diff_data and call the result scaled_data
scaled_data = diff_data.copy()

store_diffs = diff_data.loc[:, time_columns]
global_min = store_diffs.min().min()
global_max = store_diffs.max().max()
scaled_data.loc[:, time_columns] = (store_diffs - global_min) / (global_max - global_min)
scaled_data.loc[:, time_columns] = scaled_data.loc[:, time_columns] + 1e-4
scaled_data.head()
(Screenshot: output of scaled_data.head())
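
The scaling uses one global minimum and maximum across all years rather than scaling each year separately, which keeps the colors comparable between time steps. A minimal sketch of the same idea on made-up differences (the values and the names `toy_diff`, `lo`, `hi` and `EPS` are placeholders):

```python
import pandas as pd

# Made-up year-over-year differences for three prefectures.
toy_diff = pd.DataFrame({"2000": [-10.0, 0.0, 20.0], "2001": [0.0, 40.0, 90.0]})

# One global minimum and maximum across every year column.
values = toy_diff.to_numpy()
lo, hi = values.min(), values.max()

EPS = 1e-4  # small offset so that no weight is exactly 0
toy_scaled = (toy_diff - lo) / (hi - lo) + EPS
print(toy_scaled)
```

Note that with this offset the largest value ends up at 1.0001 rather than exactly 1.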

Finally, to draw the time-series heat map, create 3D data shaped as [[[latitude, longitude, value] * 47 prefectures] * one list per year in time_columns].

heat_map_data = [[[row['lat'],row['lon'], row[idx]] for index, row in scaled_data.iterrows()] for idx in time_columns]

#Since the shape of the data is difficult to understand, only the first one is output
heat_map_data[0]
#output
[[43.064359, 141.347449, 0.051760516605166056],
 [40.824294, 140.74005400000001, 0.051760516605166056],
 [39.70353, 141.15266699999998, 0.05545055350553506],
 [38.268737, 140.872183, 0.060985608856088565],
 [39.718175, 140.10335600000002, 0.051760516605166056],
 [38.240127, 140.362533, 0.07390073800738008],
 [37.750146, 140.466754, 0.0923509225092251],
 [36.341817, 140.446796, 0.04807047970479705],
 [36.56575, 139.883526, 0.05545055350553506],
 [36.391205, 139.060917, 0.060985608856088565],
 [35.857771, 139.647804, 0.060985608856088565],
 [35.604563, 140.123179, 0.04807047970479705],
 [35.689184999999995, 139.691648, 0.0997309963099631],
 [35.447505, 139.642347, 0.06467564575645757],
 [37.901699, 139.022728, 0.051760516605166056],
 [36.695274, 137.211302, 0.06467564575645757],
 [36.594729, 136.62555, 0.06467564575645757],
 [36.065220000000004, 136.221641, 0.06283062730627306],
 [35.665102000000005, 138.568985, 0.05545055350553506],
 [36.651282, 138.180972, 0.051760516605166056],
 [35.39116, 136.722204, 0.05729557195571956],
 [34.976987, 138.383057, 0.05729557195571956],
 [35.180246999999994, 136.906698, 0.07574575645756458],
 [34.730546999999994, 136.50861, 0.06836568265682658],
 [35.004532, 135.868588, 0.05360553505535055],
 [35.020996200000006, 135.7531135, 0.05360553505535055],
 [34.686492, 135.518992, 0.0978859778597786],
 [34.69128, 135.183087, 0.08128081180811808],
 [34.685296, 135.832745, 0.04622546125461255],
 [34.224806, 135.16795, 0.08866088560885609],
 [35.503463, 134.238258, 0.051760516605166056],
 [35.472248, 133.05083, 0.051760516605166056],
 [34.66132, 133.934414, 0.060985608856088565],
 [34.396033, 132.459595, 0.06836568265682658],
 [34.185648, 131.470755, 0.051760516605166056],
 [34.065732000000004, 134.559293, 0.051760516605166056],
 [34.340140000000005, 134.04297, 0.051760516605166056],
 [33.841649, 132.76585, 0.051760516605166056],
 [33.55969, 133.530887, 0.051760516605166056],
 [33.606767, 130.418228, 0.060985608856088565],
 [33.249367, 130.298822, 0.05360553505535055],
 [32.744541999999996, 129.873037, 0.10526605166051661],
 [32.790385, 130.742345, 0.06652066420664207],
 [33.2382, 131.612674, 0.05914059040590406],
 [31.91109, 131.423855, 0.05729557195571956],
 [31.560219, 130.557906, 0.0868158671586716],
 [26.211538, 127.68111499999999, 0.07759077490774909]]
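
Before handing the nested list to folium, a quick structural check can catch shape mistakes early. The sketch below is plain Python; `check_frames` is a hypothetical helper of mine, not a folium function, and the sample coordinates and weights are rounded placeholder values.

```python
def check_frames(frames, labels):
    """Check that frames is [[[lat, lon, weight], ...], ...] with one label per frame.

    Returns (number of time steps, number of points in the first time step).
    """
    if len(frames) != len(labels):
        raise ValueError("need exactly one label per time step")
    for label, frame in zip(labels, frames):
        for point in frame:
            if len(point) != 3:
                raise ValueError(f"{label}: expected [lat, lon, weight], got {point}")
            lat, lon, weight = point
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                raise ValueError(f"{label}: coordinates out of range: {point}")
            if weight <= 0:
                raise ValueError(f"{label}: weight must be positive: {point}")
    points_in_first = len(frames[0]) if frames else 0
    return (len(frames), points_in_first)


# Two time steps with two points each (placeholder values).
sample = [
    [[43.06, 141.35, 0.05], [26.21, 127.68, 0.08]],
    [[43.06, 141.35, 0.07], [26.21, 127.68, 0.06]],
]
print(check_frames(sample, ["2000", "2001"]))
```

With the real data, `check_frames(heat_map_data, list(time_columns))` should report 47 points per time step.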

Draw

japan_map = folium.Map(location=[35, 135], zoom_start=6)
gradient = {0.1: 'blue', 0.25: 'lime', 0.5: 'yellow', 0.75: 'orange', 0.9: 'red'}
hm = plugins.HeatMapWithTime(heat_map_data, index=list(time_columns), auto_play=False,
                             radius=30, max_opacity=1, gradient=gradient)
hm.add_to(japan_map)

japan_map

(Screen recording: the time-series heat map being played back)

Scaled values below about 0.052 should correspond to negative differences (decreases), so only the increases are visible on the map.
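
The threshold is simply the scaled value of a zero difference: with a global minimum m and maximum M, a difference of 0 maps to (0 - m) / (M - m) + 1e-4. The sketch below works this out for made-up extremes, since the real minimum and maximum are not shown in this article; `scaled_zero_level` is a hypothetical helper name.

```python
def scaled_zero_level(diff_min, diff_max, eps=1e-4):
    """Return the scaled value that a difference of exactly 0 maps to."""
    return (0 - diff_min) / (diff_max - diff_min) + eps


# Made-up extremes: the largest decrease is 10 stores, the largest increase is 90.
zero_level = scaled_zero_level(-10, 90)
print(zero_level)
```

Any scaled value below this level is a decrease, and any value above it is an increase.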

I'm happy just to see the map moving...!!!

Consideration

In closing

I found it very convenient to be able to visualize how data changes over time without writing much code. I would like to use it for something more meaningful. Thank you for reading to the end. I would appreciate a comment if you spot anything that needs correcting.

Reference, source
