[PYTHON] As the hot days continue, try temperature prediction using GluonTS (LSTM)

How long will this heat last?

Is it possible to predict the future from past temperatures? To take my mind off the heat, I tried **GluonTS**, the Gluon toolkit for probabilistic time series modeling.

Installation

To use it, you need the mxnet and gluonts libraries (see this article). If you installed Python with Anaconda or similar, you can install them right away with pip.

$ pip install -U pip
$ pip install mxnet
$ pip install gluonts

If pip is outdated, the code below may fail with an error, so it is a good idea to upgrade pip first.

Data acquisition

I wanted the temperature data as a CSV file, so I decided to get it from the Past Meteorological Data Download Site of the Japan Meteorological Agency.

As an example, I downloaded two years of data for Saga City, Saga Prefecture, from August 14, 2018 to August 14, 2020. I took two years so that the model could learn the yearly cycle.

# Confirmed to work in Jupyter Notebook
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

# First, read the temperature data for Saga Prefecture
# (converted to an Excel file beforehand for easier processing)
tempera_data = pd.read_excel("saga_weather_data_20200815.xlsx")
tempera_data["time"] = pd.to_datetime(tempera_data["date"])

Currently, the data looks like this.

We will use the "time" and "average temperature (℃)" columns of this DataFrame.

# .copy() lets us add a column later without a SettingWithCopyWarning
analysis_data = tempera_data[["time", "average temperature (℃)"]].copy()
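
The ListDataset used later receives only a start date and a frequency, so it assumes the values are evenly spaced with no skipped days. A quick check of that assumption could look like the sketch below. The helper name `check_daily_series` and the synthetic DataFrame are assumptions made for illustration; the synthetic frame only stands in for the downloaded file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data (assumption: one row per day).
time = pd.date_range("2018-08-14", "2020-08-14", freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "time": time,
    "average temperature (℃)": 20.0 + rng.normal(0.0, 3.0, len(time)),
})


def check_daily_series(frame, time_col, value_col):
    """Return (number of rows, number of missing values, number of date gaps)."""
    n_missing = int(frame[value_col].isna().sum())
    # Any step other than exactly one day means a skipped or duplicated date.
    steps = frame[time_col].diff().dropna()
    n_gaps = int((steps != pd.Timedelta(days=1)).sum())
    return len(frame), n_missing, n_gaps


n_rows, n_missing, n_gaps = check_daily_series(df, "time", "average temperature (℃)")
print(n_rows, n_missing, n_gaps)
```

With the real data, the same call would be `check_daily_series(analysis_data, "time", "average temperature (℃)")`. If it reports missing values or gaps, they should be filled or interpolated before building the dataset.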

Let's visualize what kind of data it is.

plt.figure(figsize=(15, 5))
plt.plot(analysis_data["time"], analysis_data["average temperature (℃)"])
plt.grid(True)
plt.show()

To predict the most recent 7 days, I used days 1 to 725 for training and day 726 onward for evaluation. The evaluation period is the orange area in the figure below.

tmp_time = np.arange(0, len(analysis_data))
analysis_data["re_time"] = tmp_time  # Replace dates with a day index
plt.figure(figsize=(15, 5))

# Period to be predicted (shaded area)
plt.axvspan(725, len(tmp_time), color="orange")

plt.plot(tmp_time, analysis_data["average temperature (℃)"])
plt.xlabel("[day]")
plt.grid(True)
plt.show()
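
The figure 725 follows from the date range and the 7-day evaluation window. The sketch below reproduces that arithmetic with the standard library, assuming the file holds exactly one row per calendar day with both endpoints included; the variable names are illustrative.

```python
from datetime import date, timedelta

first_day = date(2018, 8, 14)
last_day = date(2020, 8, 14)
predict_length = 7

# Both endpoints are included, hence the + 1.
n_days = (last_day - first_day).days + 1
n_train = n_days - predict_length

# Calendar dates at the boundary between training and evaluation.
train_last_day = first_day + timedelta(days=n_train - 1)
eval_first_day = train_last_day + timedelta(days=1)

print(n_days, n_train)
print(train_last_day, eval_first_day)
```

Because 2020 is a leap year, the range covers 732 days, which leaves 725 days for training and the final 7 days for evaluation.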

Learning and evaluation

Define the training and evaluation data.

from gluonts.dataset.common import ListDataset

# Training data: everything except the last predict_length days
predict_length = 7
training_data = ListDataset(
    [{"start": analysis_data["time"].values[0], "target": analysis_data.iloc[:len(tmp_time)-predict_length, 1]}],
    freq = "24H")

# Test data: the full series, including the last predict_length days
test_data = ListDataset(
    [{"start": analysis_data["time"].values[0], "target": analysis_data.iloc[:len(tmp_time), 1]}],
    freq = "24H")

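Define an estimator for training. The parameters were set with reference to the reference article and the official tutorial. Note that SimpleFeedForwardEstimator is a simple feedforward network (MLP) rather than an LSTM.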

from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer
 
estimator = SimpleFeedForwardEstimator(freq="24H",
                            context_length=20,
                            prediction_length=10,
                            trainer=Trainer(epochs=300,
                                            batch_size=32,
                                            learning_rate=0.001))
 
predictor = estimator.train(training_data=training_data)

Let's visualize the inference result.

from gluonts.dataset.util import to_pandas
 
# predict() forecasts the prediction_length steps that follow the end of each series it is given
for test_entry, forecast in zip(test_data, predictor.predict(test_data)):
    plt.figure(figsize=(15, 5))
    to_pandas(test_entry).plot(linewidth=2)
    forecast.plot(color='g', prediction_intervals=[50.0, 90.0])
 
plt.legend(["observations", "median prediction", "90% prediction interval", "50% prediction interval"],
           loc='lower left')
plt.grid(which='both')

Although there is a lot of variation, it seems that we can make some predictions.
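
The shaded bands in the forecast plot come from the `prediction_intervals=[50.0, 90.0]` argument. For a sample-based forecast, bands like these can be derived as percentiles across the sample paths at each time step. The sketch below shows the idea with NumPy only. The synthetic samples and the helper `central_interval` are assumptions for illustration and are not GluonTS internals.

```python
import numpy as np

rng = np.random.default_rng(42)
num_samples, horizon = 100, 7

# Synthetic sample paths with shape (num_samples, horizon), values in ℃.
samples = 28.0 + rng.normal(0.0, 1.5, size=(num_samples, horizon))


def central_interval(paths, coverage):
    """Lower and upper bounds of the central interval with the given coverage in percent."""
    tail = (100.0 - coverage) / 2.0
    lower = np.percentile(paths, tail, axis=0)
    upper = np.percentile(paths, 100.0 - tail, axis=0)
    return lower, upper


median = np.percentile(samples, 50.0, axis=0)
low50, high50 = central_interval(samples, 50.0)
low90, high90 = central_interval(samples, 90.0)

for day in range(horizon):
    print(f"day {day + 1}: median={median[day]:.1f} "
          f"50%=[{low50[day]:.1f}, {high50[day]:.1f}] "
          f"90%=[{low90[day]:.1f}, {high90[day]:.1f}]")
```

The 90% band always contains the 50% band, so a wide 90% band is a direct sign of how much the sampled futures disagree.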

Next steps

I would like to try other ways of using GluonTS to make the predictions more accurate (I would be grateful for any advice in the comments).
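
One possible way to judge whether a change actually improves accuracy is to compare the error of the forecast median with that of a naive baseline on the held-out days. The sketch below uses the mean absolute error and a persistence baseline, which repeats the last observed value. All numbers are made up for illustration and are not measured temperatures or model output; the function names are likewise assumptions.

```python
import numpy as np


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equally long sequences."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))


def persistence_forecast(history, horizon):
    """Repeat the last observed value for every day of the horizon."""
    return np.full(horizon, history[-1], dtype=float)


# Made-up values in ℃, for illustration only.
history = np.array([27.5, 28.0, 28.4, 29.1, 29.0])
actual = np.array([29.5, 30.0, 29.0, 28.5, 28.0, 29.0, 30.5])
model_median = np.array([29.0, 29.5, 29.5, 29.0, 28.5, 28.5, 29.5])

baseline = persistence_forecast(history, len(actual))
mae_model = mean_absolute_error(actual, model_median)
mae_baseline = mean_absolute_error(actual, baseline)

print(f"model MAE:    {mae_model:.3f}")
print(f"baseline MAE: {mae_baseline:.3f}")
```

A model that cannot beat this baseline on held-out days is probably not learning much beyond "tomorrow looks like today".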

Reference articles

Thank you very much!

Time series data prediction with MXNet and LSTM -From introduction to practice-: https://cpp-learning.com/mxnet_gluonts/ → The explanation is thorough, and the article takes on the challenge of predicting when an enemy will shoot a Hadoken.
Gluon Quick Start Tutorial: https://gluon-ts.mxnet.io/examples/basic_forecasting_tutorial/tutorial.html → It was helpful for the plotting part.