[PYTHON] Use of past meteorological data 3 (Time-series heat map display of precipitation during heavy rain)

The Japan Meteorological Agency is providing historical weather data free of charge until the end of March 2020. (Reference) Usage environment for past weather data: https://www.data.jma.go.jp/developer/past_data/index.html

The basic weather data is provided on the basis that "anyone can use it regardless of the purpose or target of use", so let's do something with it.

You can also download the data from the JMA's "Past Meteorological Data Download" page, but this service is convenient because everything can be downloaded at once. The available data is listed below. https://www.data.jma.go.jp/developer/past_data/data_list_20200114.pdf The deadline is coming soon, so if you need the data, download it early.

This time, I will display the change in rainfall over time during a disaster as a heat map on a map. I originally had last year's Typhoon No. 19 in mind, but since data is only released through July 2019, I will use the data from the July 2018 heavy rain instead. I used folium (0.10.1) to draw the map.

Data download

Use "Regional Meteorological Observation (AMEDAS)"-"Hourly / Daily Value" to obtain the hourly precipitation at each AMeDAS point. This file contains observation information such as precipitation and temperature at each AMeDAS site. Check the following for the file format. http://data.wxbc.jp/basic_data/kansoku/amedas/format_amedas.pdf

Download the file into the amedas folder. It takes a while because the file is about 2 GB.

import os
import urllib.request

# Download the surface weather observation hourly/daily values file
url    = 'http://data.wxbc.jp/basic_data/kansoku/amedas/hourly_daily_1976-2019_v191121.tar'
folder = 'amedas'
path   = 'amedas/hourly_daily_1976-2019_v191121.tar'
# Create the folder
os.makedirs(folder, exist_ok=True)
if not os.path.exists(path):
    # Download only if not already present
    urllib.request.urlretrieve(url, path)
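Once the download finishes, a quick size check (a small optional step, not in the original procedure) confirms that the whole archive arrived:

# Optional: confirm the downloaded size (the archive is roughly 2 GB)
print('{:.2f} GB'.format(os.path.getsize(path) / 1024**3))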

Since the file is in tar format, check the contents of the file.

#File confirmation
import tarfile

#Check the files included in the tar file
with tarfile.open(path, 'r') as tf:
    for tarinfo in tf:
        print(tarinfo.name)

The contents are tar.gz files such as hourly_daily_1976.tar.gz: the data is bundled into one tar.gz per year, so a further extraction step is needed. Let's check the contents of the tar.gz files.

#File confirmation 2
import tarfile

#Check the files included in the tar file
with tarfile.open(path, 'r') as tf:
    for tarinfo in tf:
        print(tarinfo.name)
        if tarinfo.isfile():
            # Check the files contained in the tar.gz file
            with tarfile.open(fileobj=tf.extractfile(tarinfo), mode='r') as tf2:
                for tarinfo2 in tf2:
                    if tarinfo2.isfile():
                        print('\t' + tarinfo2.name)

You can see that files such as the following exist.

hourly_daily_1976/area57/ahd1976.57246

The file names have the format ahdYYYY.SSSSS (YYYY: year, SSSSS: observatory number).
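For illustration, the year and observatory number can be pulled out of such a file name. This is a minimal sketch; parse_ahd_name is a hypothetical helper, not part of the original article:

# Hypothetical helper: parse the year and observatory number from a name
# of the assumed form 'hourly_daily_YYYY/areaNN/ahdYYYY.SSSSS'
import os

def parse_ahd_name(name):
    base = os.path.basename(name)   # e.g. 'ahd1976.57246'
    stem, sta_no = base.split('.')  # 'ahd1976' and '57246'
    return int(stem[3:]), int(sta_no)

print(parse_ahd_name('hourly_daily_1976/area57/ahd1976.57246'))  # (1976, 57246)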

Reading precipitation data

We will acquire the precipitation at AMeDAS sites nationwide from July 5 to 8, 2018, reading the tar and tar.gz files without extracting them to disk. Each file stores one day's data as a fixed 1300-byte record, so we skip everything up to July 4. Sites that measure only snowfall are excluded, and any value whose RMK flag is not 8 (the normal-value flag) is treated as missing.
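To make the seek arithmetic concrete, here is a sketch of the offset calculation, assuming the layout described in the format document (every month padded to 31 fixed 1300-byte day records; each record is a 28-byte header, 24 hourly 48-byte blocks, and a 120-byte daily block):

# Sketch of the assumed record layout
R_SIZE, HEAD, HOUR, DAILY = 1300, 28, 48, 120
assert HEAD + 24 * HOUR + DAILY == R_SIZE

def day_offset(month, day):
    # Byte offset of the record for a given month/day (1-based)
    return ((month - 1) * 31 + (day - 1)) * R_SIZE

print(day_offset(7, 5))  # start of the July 5 record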

Prepare a pandas DataFrame to store the data, with the date and hour as rows and the observatory numbers as columns.

#Create data frame for data storage
import pandas as pd

prec_df = pd.DataFrame()
#Precipitation data acquisition
import tarfile
import struct

#July 5-8, 2018
year = '2018'
month = 7
start_date = 5
end_date   = 8
head_size = 28
time_size = 48
day_size  = 120
r_size = 1300

#Get the files contained in the tar file
with tarfile.open(path, 'r') as tf:
    for tarinfo in tf:
        if tarinfo.isfile():
            #Extract files for 2018
            if tarinfo.name[13:17] == year:
                # Get the files contained in the tar.gz file
                with tarfile.open(fileobj=tf.extractfile(tarinfo), mode='r') as tf2:
                    for tarinfo2 in tf2:
                        if tarinfo2.isfile():
                            print(tarinfo2.name)
                            #Open file
                            with tf2.extractfile(tarinfo2) as tf3:
                                #Skip to date
                                buff = tf3.read((month-1)*31*r_size)
                                buff = tf3.read((start_date-1)*r_size)
                                #Read by date
                                for i in range(start_date, end_date+1):
                                    buff = tf3.read(head_size)
                                    sta_no, sta_type, year, month, date = struct.unpack('<ih16xhhh', buff)
                                    print(sta_no, sta_type, year, month, date)
                                    # Skip observatories that measure only snowfall
                                    if sta_type != 0:
                                        for time in range(24):
                                            buff = tf3.read(time_size)
                                            prec, rmk = struct.unpack('<hh44x', buff)
                                            # Treat anything other than the normal value (RMK = 8) as missing
                                            if rmk != 8:
                                                prec_df.loc['{:04}{:02}{:02}{:02}'.format(year, month, date, time), str(sta_no)] = None
                                            else:
                                                prec_df.loc['{:04}{:02}{:02}{:02}'.format(year, month, date, time), str(sta_no)] = prec * 0.1
                                        buff = tf3.read(day_size)
                                    else:
                                        #Skip the rest of the data
                                        buff = tf3.read(r_size-head_size)

It takes a while, but the precipitation data has been acquired. Let's check it.

prec_df

For reference, check the number of missing values.

prec_df.isnull().values.sum()

There were 492 missing values.
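If you also want to see which stations account for the gaps, a per-station count (an optional extra check) works:

# Optional: missing values per station, largest first
print(prec_df.isnull().sum().sort_values(ascending=False).head(10))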

Obtain latitude and longitude of the observatory

AMeDAS point information file download

Download the AMeDAS location information file.

import os
import urllib.request

# Download the AMeDAS point information file
url    = 'http://data.wxbc.jp/basic_data/kansoku/amedas/amdmaster_201909.tar.gz'
folder = 'amedas'
path   = 'amedas/amdmaster_201909.tar.gz'
# Create the folder
os.makedirs(folder, exist_ok=True)
if not os.path.exists(path):
    # Download only if not already present
    urllib.request.urlretrieve(url, path)

Check the file.

#File confirmation
import tarfile

#Check the files included in the tar file
with tarfile.open(path, 'r') as tf:
    for tarinfo in tf:
        print(tarinfo.name)

Reading the latitude and longitude of the observatory

The history of each observation point is stored in amdmaster.index. Since the file holds each point's history, you could look up the latitude and longitude as of each observation date, but for this display we simply use the latest latitude and longitude of each point, because any relocation of a point is within the margin of error here. Store the results in a pandas DataFrame.

#Creating a data frame for storing AMeDAS point information
import pandas as pd

amedas_df = pd.DataFrame()
#Obtain latitude and longitude of the observatory
with tarfile.open(path, 'r') as tf:
    for tarinfo in tf:
        if tarinfo.name == 'amdmaster_201909/amdmaster.index':
            #Open file
            with tf.extractfile(tarinfo) as tf2:
                #Skip header
                line = tf2.readline()
                line = tf2.readline()
                line = tf2.readline()
                while line:
                    # History rows are chronological, so later rows overwrite
                    # earlier ones and the latest latitude/longitude remains
                    prec_no = int(line[0:5])
                    amedas_df.loc[prec_no, 'Observatory name'] = line[6:26].decode('shift_jis').replace(" ", "")
                    amedas_df.loc[prec_no, 'latitude'] = int(line[74:76]) + float(line[77:81])/60
                    amedas_df.loc[prec_no, 'longitude'] = int(line[82:85]) + float(line[86:90])/60
                    line = tf2.readline()
            break

Display the latitude and longitude of the observatories.

amedas_df

Data creation for heat map display

For each point, create a [latitude, longitude, weight] list and add it to the data. The weight is the precipitation amount, but it must be a value with 0 < weight <= 1, so divide the precipitation by 100 and cap anything at 100 mm or more at 100 mm. Points with zero or missing precipitation are left out. Collecting [latitude, longitude, precipitation] for every point hour by hour gives a three-dimensional list of the form [[[latitude, longitude, precipitation], ...], ...]. The index list stores the date and time labels for the time-series display.

import numpy as np

#Latitude, longitude, and precipitation data generation at each point for each hour
data = []
index = []
for time in prec_df.index:
    point = []
    index.append(time)
    for sta_no in prec_df.columns:
        # Skip missing values
        if np.isnan(prec_df.loc[time, sta_no]): continue
        # Skip points with no rainfall
        if prec_df.loc[time, sta_no] == 0: continue
        # Get the latitude and longitude of the station
        latitude  = amedas_df.loc[int(sta_no), 'latitude']
        longitude = amedas_df.loc[int(sta_no), 'longitude']
        # Scale precipitation to the 0-1 range for the heat map:
        # divide by 100, capping values at 100 mm
        prec = min(prec_df.loc[time, sta_no], 100) / 100
        point.append([latitude, longitude, float(prec)])
    data.append(point)

Check the data.

data
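As a quick sanity check (an optional step), the number of frames should match the number of hourly labels, 96 for these four days:

# Sanity check: one frame of [lat, lon, weight] points per hourly label
print(len(index), len(data))  # expect 96 and 96
print(data[0][:3])            # first few points of the first hour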

Time series heat map display

Use folium to display a time-series heat map, using the HeatMapWithTime plugin. Refer to the following for the parameters. https://python-visualization.github.io/folium/plugins.html The color changes from blue through lime, yellow, and orange to red as precipitation increases.

The center point is Hiroshima.

import folium
from folium import plugins

# Display the heat map on a map
# Set the center point
sta_no = 67437 #Hiroshima
latitude = amedas_df.loc[int(sta_no), 'latitude']
longitude = amedas_df.loc[int(sta_no), 'longitude']
map = folium.Map(location=[latitude, longitude], zoom_start=7)

#Heat map display
hm = plugins.HeatMapWithTime(data, index, radius=100, min_opacity=0, max_opacity=1, auto_play=True,
                             gradient={0.1: 'blue', 0.25: 'lime', 0.5:'yellow', 0.75:'orange',1.0: 'red'})
hm.add_to(map)
map

Precipitation data is displayed as a heat map on the map as shown below. The display changes every hour.

(Figure: heatmap.png — hourly precipitation heat map over the map)

In folium's heat map display, values appear exaggerated where observation points are close together. Since it is based on AMeDAS point data, it is not as accurate as the rain-cloud radar, but you can see the rough changes. It may be easier to follow if you increase the playback speed to around 10 fps.
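For reference, HeatMapWithTime also exposes speed-related parameters (min_speed, max_speed, speed_step; treat the exact names and defaults in folium 0.10.1 as an assumption), so the playback speed range can be widened when the map is created:

# Assumed parameters: widen the playback speed (fps) slider range
hm = plugins.HeatMapWithTime(data, index, radius=100, min_opacity=0, max_opacity=1,
                             auto_play=True, min_speed=1, max_speed=24, speed_step=1,
                             gradient={0.1: 'blue', 0.25: 'lime', 0.5: 'yellow', 0.75: 'orange', 1.0: 'red'})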

You can also save the map as HTML and display it.

#Save map
map.save(outfile="prec20180705_08_map.html")
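To view it right away, the saved file can be opened with Python's standard webbrowser module (optional):

# Optional: open the saved HTML in the default browser
import webbrowser
webbrowser.open('prec20180705_08_map.html')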

In this post, I tried displaying the precipitation during heavy rain as a time-series heat map.

The data is only available until the end of March 2020, so if you need it, I recommend downloading it as soon as possible.
