Find the white Christmas rate by prefecture with Python and map it to a map of Japan

Introduction

Snow at Christmas is magical, but living in Tokyo I rarely see any. What about Hokkaido, or Tohoku? Let's find the **white Christmas rate by prefecture** from the past 50 years of weather data for each prefecture, then plot the result on a map of Japan.

What you can learn from this article

- How to approach a data analysis and the thinking behind it
- DataFrame operations for data cleansing
- Drawing a map of Japan with japanmap

and more.

Environment

Analysis

Prerequisites

In this analysis, a white Christmas is defined as **a year in which it snowed, even briefly, on the night of 12/24 or 12/25**.

Each prefecture has several weather observation points, so as a rule **the data for the prefectural capital is used**. The exception is a prefecture whose capital has no observation point. (Saitama and Shiga prefectures turned out to be such cases, so I used the data for Kumagaya and Hikone, respectively.)

Data collection

Historical weather data comes from the Japan Meteorological Agency website below. https://www.data.jma.go.jp/gmd/risk/obsdl/index.php

I selected the 47 observation points described above and downloaded a CSV with the item "Weather overview (night: 18:00 to 06:00 the next day)" and the period "Display daily values from December 24 to December 25, 1970 to 2019".

Then I removed the extra lines at the top in Excel and arranged the file in the following format. [Screenshot: the formatted CSV]

The file name is **Xmas.csv**.
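
The manual cleanup in Excel could probably also be done at read time. The sketch below is only an illustration: the text in `raw` is a made-up miniature file, and the number of extra lines (2) is an assumption, so check the real download before reusing `skiprows`.

```python
import io
import pandas as pd

# Hypothetical miniature of a downloaded file. The first two lines stand in
# for the extra lines at the top; the real file's layout may differ.
raw = """Downloaded: 2020/12/07
,
,Sapporo,Sapporo,Sapporo,Tokyo,Tokyo,Tokyo
1970-12-24,snow,8,1,clear,8,1
1970-12-25,cloudy then snow,8,1,clear,8,1
"""

# skiprows drops the leading lines, so no manual editing is needed.
# The first column (dates) becomes a DatetimeIndex.
df = pd.read_csv(io.StringIO(raw), skiprows=2, index_col=0, parse_dates=True)

print(df.shape)
print(list(df.columns))
```

pandas renames repeated column names by appending `.1`, `.2`, and so on, which is why each station shows up as three columns.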

Calculation of white Christmas rate

Now let's process this data and calculate the white Christmas rate. First, read the CSV.

import pandas as pd
from datetime import datetime

df_xmas_master = pd.read_csv("Xmas.csv", encoding="shift_jis", index_col="Unnamed: 0", parse_dates=True)

First, remove the extra columns filled with "8" and "1" that came with the CSV. Selecting every third column, as shown below, does this.

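# Keep every third column (the weather text); .copy() avoids modifying a view of the master frame
df_xmas = df_xmas_master.iloc[:, ::3].copy()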
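
To see what the `::3` step does, here is a minimal sketch on a toy frame. The column names and values are invented, and the layout of one weather column followed by two code columns per station is an assumption based on the description above.

```python
import pandas as pd

# Toy frame with the assumed layout: weather text, then two code columns per station.
toy = pd.DataFrame(
    [["snow", 8, 1, "clear", 8, 1]],
    columns=["Sapporo", "Sapporo.1", "Sapporo.2", "Tokyo", "Tokyo.1", "Tokyo.2"],
)

# A step of 3 in the column position keeps columns 0, 3, 6, ...
weather_only = toy.iloc[:, ::3].copy()

print(list(weather_only.columns))  # ['Sapporo', 'Tokyo']
```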

Next, replace each cell whose weather text contains "snow" with True and every other cell with False.

for i in df_xmas.columns:
    df_xmas[i] = df_xmas[i].str.contains("snow")

The data frame now looks like this. [Screenshot: df_xmas with True/False values]
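
One detail worth checking is how `str.contains` treats missing values. The following sketch uses an invented Series to show the default behavior next to the `na=False` option. Which one is appropriate depends on whether missing days should be skipped or counted as "no snow".

```python
import numpy as np
import pandas as pd

# Invented weather descriptions, including one missing value.
s = pd.Series(["clear", "cloudy, snow at times", np.nan])

# Default: a missing value stays missing, so later aggregation can skip it.
flags = s.str.contains("snow")

# Alternative: treat a missing value as "no snow".
filled = s.str.contains("snow", na=False)

print(flags.tolist())
print(filled.tolist())
```

The article's calculation relies on the default, since missing days are meant to drop out of the average.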

From here, all that remains is to aggregate.

df_white_rate = df_xmas.resample("Y").max().mean()

df_white_rate = df_white_rate.to_frame().reset_index()
df_white_rate.columns = ["Prefectural office location","White Xmas rate"]

Treating True as 1 and False as 0, grouping by year and taking the maximum gives **1 if the weather on either 12/24 or 12/25 contains "snow" and 0 if neither does**.

Taking mean() of those yearly values then gives the fraction of years with snow. (Cells where the weather was not available are NaN and are left out of the calculation.)

Incidentally, mean() returns a Series, so the last two lines of the code convert it to a data frame and assign new column names to make it easier to handle later.
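
The max-then-mean idea can be checked by hand on a tiny example. This is a sketch with invented values, and it groups by `index.year` instead of calling `resample("Y")`, a swapped-in but equivalent grouping chosen here because recent pandas versions prefer a different alias for year-end resampling.

```python
import numpy as np
import pandas as pd

# Invented data: 1.0 = snow, 0.0 = no snow, NaN = weather not recorded.
idx = pd.to_datetime([
    "2017-12-24", "2017-12-25",
    "2018-12-24", "2018-12-25",
    "2019-12-24", "2019-12-25",
])
toy = pd.DataFrame(
    {
        "Sapporo": [1.0, 0.0, 0.0, 0.0, 1.0, 1.0],
        "Tokyo": [0.0, 0.0, np.nan, np.nan, 0.0, 1.0],
    },
    index=idx,
)

# Step 1: per year, 1.0 if either night had snow, else 0.0 (NaN if no data at all).
yearly = toy.groupby(toy.index.year).max()

# Step 2: average over years; NaN years are skipped.
rate = yearly.mean()

print(yearly)
print(rate)
```

Sapporo has snow in 2 of 3 years, and Tokyo in 1 of the 2 years that have data, so the rates come out as 2/3 and 0.5.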

After this step, df_white_rate looks like this. [Screenshot: df_white_rate]

The white Christmas rate over the years is now calculated for each observation point. Next, let's put the result on a map of Japan.

Creating a white Christmas rate data frame by prefecture

To draw the result on a map of Japan, we use a handy library called **japanmap**. Give it a Series of prefecture names and colors, and it returns a colored map of Japan.

But there is a problem. The data frame created earlier is keyed by prefectural capital name, not prefecture name. **The capital names must be linked to prefecture names.**

So I will borrow a table of prefecture names and prefectural capital names from the following site.

https://sites.google.com/site/auroralrays/hayamihyou/kenchoushozaichi

The code is below.

df_center_master = pd.read_html("https://sites.google.com/site/auroralrays/hayamihyou/kenchoushozaichi")

df_center = df_center_master[2].iloc[2:,1:]
df_center[2] = df_center[2].str[:-1]
df_center = df_center.reset_index().iloc[:,1:]

df_center.columns = ["Prefectures","Prefectural office location"]

# Use .loc with (row label, column) so the assignment changes df_center itself
df_center.loc[10, "Prefectural office location"] = "Kumagaya"
df_center.loc[12, "Prefectural office location"] = "Tokyo"
df_center.loc[24, "Prefectural office location"] = "Hikone"

The site appends "city" to each prefectural capital name, so I removed it to match the format of the white Christmas data frame created earlier.

Also, the site lists the capital of Tokyo as "Shinjuku", so I changed it to "Tokyo". And since this analysis uses stations outside the prefectural capital for Saitama ("Kumagaya") and Shiga ("Hikone"), I changed those as well.
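
Overriding a few rows by position works, but it depends on the row order of the scraped table. As a sketch of an alternative, the overrides can be keyed by prefecture name. The frame below is a three-row stand-in with English names, and `overrides` is a hypothetical helper dictionary, not part of the original code.

```python
import pandas as pd

# Three-row stand-in for the prefecture / capital table.
centers = pd.DataFrame({
    "Prefectures": ["Saitama", "Tokyo", "Shiga"],
    "Prefectural office location": ["Saitama", "Shinjuku", "Otsu"],
})

# Hypothetical mapping: prefecture -> observation point actually used.
overrides = {"Saitama": "Kumagaya", "Tokyo": "Tokyo", "Shiga": "Hikone"}

for prefecture, station in overrides.items():
    # A boolean mask plus a column label in one .loc call assigns in place.
    mask = centers["Prefectures"] == prefecture
    centers.loc[mask, "Prefectural office location"] = station

print(centers)
```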

The resulting data frame, **df_center**, looks like this.

[Screenshot: df_center]

Now merge this data frame with the white Christmas rate data frame created earlier.

df_all =  pd.merge(df_center,df_white_rate,on="Prefectural office location")
df_all = df_all[["Prefectures","White Xmas rate"]]

Checking df_all...

[Screenshot: df_all]

The **prefecture name to white Christmas rate** data frame is complete.
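
A default merge is an inner join, so a capital name that fails to match simply disappears from the result. The sketch below shows one way to surface such rows with `how="left"` and `indicator=True`. The frames and rates are invented, and the mismatch (Otsu versus Hikone) is deliberate.

```python
import pandas as pd

# Invented frames; "Otsu" has no counterpart on the rate side.
centers = pd.DataFrame({
    "Prefectures": ["Hokkaido", "Tokyo", "Shiga"],
    "Prefectural office location": ["Sapporo", "Tokyo", "Otsu"],
})
rates = pd.DataFrame({
    "Prefectural office location": ["Sapporo", "Tokyo", "Hikone"],
    "White Xmas rate": [0.9, 0.1, 0.4],  # made-up values
})

# A left join keeps every prefecture; _merge records where each row came from.
checked = centers.merge(
    rates, on="Prefectural office location", how="left", indicator=True
)
missing = checked.loc[checked["_merge"] == "left_only", "Prefectures"].tolist()

print(missing)  # ['Shiga']
```

An empty `missing` list, together with a row count of 47, would suggest that every prefecture found its rate.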

Convert White Xmas rate to color

Next, convert the numerical values to colors.

Any color scheme would do, but here the color gets **closer to white as the white Christmas rate rises and closer to green as it falls**.

In terms of color codes, **a high white Christmas rate approaches #ffffff and a low one approaches #00ff00**.

In a color code, the first two digits are the red intensity (256 levels) in hexadecimal, the middle two digits are the green intensity, and the last two digits are the blue intensity. Since 0 to 255 in decimal becomes 00 to ff in hexadecimal, each intensity always fits in 2 digits.
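
As a quick illustration of this layout, the sketch below splits a color code into its three intensities and formats numbers as 2-digit hexadecimal. `split_color_code` is a hypothetical helper name used only here.

```python
def split_color_code(code):
    """Split '#rrggbb' into (red, green, blue) integers in the range 0-255."""
    return tuple(int(code[i:i + 2], 16) for i in (1, 3, 5))


print(split_color_code("#00ff00"))  # (0, 255, 0)
print(split_color_code("#ffffff"))  # (255, 255, 255)

# "02x" means: hexadecimal, padded with zeros to at least 2 digits.
print(format(255, "02x"), format(0, "02x"))  # ff 00
```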

So let's create a function that **converts a number to a color code between green and white**.

def num2color(x):
    color_code = "#" + format(round(x*255),"#04x")[2:] + "ff" + format(round(x*255),"#04x")[2:]
    return color_code

The rates are in the range 0 to 1, so multiplying by 255 maps them to the range 0 to 255. The result is rounded and converted to a 2-digit hexadecimal number. Holding green at full intensity and varying only red and blue moves the color between green and white.

Now let's use this function to create a **prefecture name to color code** Series.
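
The conversion can be spot-checked at a few values. This sketch uses a separate, hypothetical function `rate_to_color` that follows the same idea, with the shorter `"02x"` format and a clamp to the 0 to 1 range added as a precaution. It is not the article's function.

```python
def rate_to_color(rate):
    """Map a rate in [0, 1] to a color code from green (0) to white (1)."""
    # Clamp so that out-of-range input cannot produce an invalid code.
    rate = min(max(rate, 0.0), 1.0)
    level = format(round(rate * 255), "02x")  # shared red and blue intensity
    return "#" + level + "ff" + level


print(rate_to_color(0.0))  # #00ff00
print(rate_to_color(0.2))  # #33ff33
print(rate_to_color(1.0))  # #ffffff
```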

df_all = df_all.set_index("Prefectures")
df_all = df_all["White Xmas rate"].apply(num2color)

So what does df_all look like now?

[Screenshot: df_all with color codes]

The mapping from prefecture name to color code is complete.

Mapping to Japan Map

Finally, we map it onto the map of Japan. All this takes is importing matplotlib and japanmap and passing df_all to japanmap's picture function.

import matplotlib.pyplot as plt
from japanmap import picture

# plt.subplots returns a (figure, axes) pair
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(picture(df_all))

And the result is... [Screenshot: map of Japan colored by white Christmas rate]

This completes the map of white Christmas rates by prefecture. The color scheme may not be the best, but that comes down to my sense of design, so please bear with me...

In conclusion

That covers mapping the white Christmas rate with Python.

A recording of this analysis is also on YouTube at the link below. (The source code in this article is a cleaned-up version of what was written in the video.) If you like, watch it to get a **feel for how data analysis unfolds in real time**.

https://youtu.be/nu_RqJAMYTY

I hope this article serves as a reference for how to approach data analysis and visualization.

Finally, since the white Christmas rates are hard to read off the map, I will end by pasting the rates for the past 50 years by prefecture as percentages. (So much for the visualization...)

[Screenshot: table of white Christmas rates by prefecture]
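
A table like this can be produced from the rates directly. The following is a minimal sketch with made-up values for three prefectures. It converts the rates to percentages, sorts them from highest to lowest, and formats them as text.

```python
import pandas as pd

# Made-up rates for illustration only.
rates = pd.Series(
    {"Hokkaido": 0.9, "Tokyo": 0.04, "Okinawa": 0.0},
    name="White Xmas rate",
)

# Convert to percent, round to one decimal place, highest first.
percent = (rates * 100).round(1).sort_values(ascending=False)

# Format each value as text such as "90.0%".
labels = percent.map("{:.1f}%".format)

print(labels)
```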
