In this article, I treat images as time series data and use a convolutional LSTM (ConvLSTM) to predict future images. There seem to be few articles and implementation examples for ConvLSTM (perhaps because it is not very accurate), so I am publishing this even though the code was written quickly. Because the focus here is on implementation, please refer to "convolution Lstm" for details on the structure of ConvLSTM.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
import glob
from PIL import Image
from tqdm import tqdm
import zipfile
import io
The images are the satellite images used in my previous article (I tried cluster analysis of the weather map). There was also a competition, [SOTA] Weather Challenge: Cloud Image Prediction, so it may be convenient to use that dataset as long as you pay attention to its rules. In this article, we consider a model that predicts the next day's image from 5 images taken at 24-hour intervals.
This code was run on Google Colab, so the images are supplied as a zip file and have to be extracted first. Also, because the original images are very large, they are downscaled for simplicity.
# Image size after reduction
height = 100
width = 180
# Array that accumulates the loaded images
imgs = np.empty((0, height, width, 3))
# Read the images in the zip file into a numpy array
zip_f = zipfile.ZipFile('drive/My Drive/Colab Notebooks/convLSTM/wide.zip')
for name in tqdm(zip_f.namelist()):
    with zip_f.open(name) as file:
        path = io.BytesIO(file.read())  # Decompressed bytes of one image
        img = Image.open(path).convert('RGB')  # Force 3 channels so the reshape below is safe
        img = img.resize((width, height))
        img_np = np.array(img).reshape(1, height, width, 3)
        imgs = np.append(imgs, img_np, axis=0)
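Note that `np.append` copies the whole array on every iteration, which gets slow as the number of images grows. A minimal sketch of an alternative, assuming the decoded images are collected in a Python list and stacked once at the end. The random arrays below are stand-ins for the decoded images, not real data, and the variable names are only illustrative:

```python
import numpy as np

height, width = 100, 180
rng = np.random.default_rng(0)

# Stand-ins for decoded images; in the loader these would come from np.array(img)
decoded = [rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8) for _ in range(4)]

# Approach used in the article: grow the array with np.append (copies every time)
imgs_append = np.empty((0, height, width, 3))
for img_np in decoded:
    imgs_append = np.append(imgs_append, img_np.reshape(1, height, width, 3), axis=0)

# Alternative: collect in a list and stack once, then cast to the same dtype
imgs_stack = np.stack(decoded).astype(np.float64)

print(imgs_stack.shape)
print(np.array_equal(imgs_append, imgs_stack))
```

Both approaches give the same array; the difference is only in how many copies are made along the way.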
At this point the images are simply lined up in order, so reshape them into a form that can be processed as time series data. x has shape (number of samples, sequence length, height, width, number of channels), and y has shape (number of samples, height, width, number of channels).
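# Arrange into (input sequence, next image) pairs in chronological order
n_seq = 5
n_sample = imgs.shape[0] - n_seq
x = np.zeros((n_sample, n_seq, height, width, 3))
y = np.zeros((n_sample, height, width, 3))
for i in range(n_sample):
    x[i] = imgs[i:i+n_seq]  # n_seq consecutive images
    y[i] = imgs[i+n_seq]    # the image that follows them
# Scale pixel values from [0, 255] to roughly [-1, 1)
x, y = (x-128)/128, (y-128)/128
# shuffle=False keeps chronological order: the last 10% becomes the test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.1, shuffle = False)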
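To check that the windowing does what is intended, here is a small sketch on toy data. The helper name `make_windows` and the tiny frames are assumptions made for illustration; each toy frame is filled with its own index so the pairing is easy to read off:

```python
import numpy as np

def make_windows(frames, n_seq):
    """Split frames (N, H, W, C) into inputs (N-n_seq, n_seq, H, W, C)
    and targets (N-n_seq, H, W, C), where each target follows its input window."""
    n_sample = frames.shape[0] - n_seq
    x = np.stack([frames[i:i + n_seq] for i in range(n_sample)])
    y = frames[n_seq:n_seq + n_sample]
    return x, y

# Toy data: 8 tiny frames of shape (2, 3, 3) whose pixel values equal the frame index
frames = np.arange(8, dtype=np.float32).reshape(8, 1, 1, 1) * np.ones((1, 2, 3, 3), dtype=np.float32)

x_demo, y_demo = make_windows(frames, n_seq=5)
print(x_demo.shape, y_demo.shape)  # (3, 5, 2, 3, 3) (3, 2, 3, 3)
print(x_demo[0, :, 0, 0, 0])       # frame indices in the first window: 0..4
print(y_demo[0, 0, 0, 0])          # index of its target frame: 5
```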
Create the model. ConvLSTM2D is similar to a convolution layer, but takes the additional parameter return_sequences. This controls whether the layer returns its output for every time step or only for the last one; it is set to False only in the last ConvLSTM layer.
(I use the functional API because I wandered around trying shortcut connections and skip connections while tuning the model, but Sequential would be enough.)
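# Import everything from tensorflow.keras; mixing it with standalone keras can cause errors
from tensorflow.keras import layers
from tensorflow.keras.layers import Activation
from tensorflow.keras.models import Model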
inputs = layers.Input(shape=(5, height, width, 3))
x0 = layers.ConvLSTM2D(filters=16, kernel_size=(3,3), padding="same", return_sequences=True, data_format="channels_last")(inputs)
x0 = layers.BatchNormalization(momentum=0.6)(x0)
x0 = layers.ConvLSTM2D(filters=16, kernel_size=(3,3), padding="same", return_sequences=True, data_format="channels_last")(x0)
x0 = layers.BatchNormalization(momentum=0.8)(x0)
x0 = layers.ConvLSTM2D(filters=3, kernel_size=(3,3), padding="same", return_sequences=False, data_format="channels_last")(x0)
out = Activation('tanh')(x0)
model = Model(inputs=inputs, outputs=out)
model.summary()
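Since the author notes that Sequential would be enough, here is a sketch of the same layer stack written that way. It assumes TensorFlow 2.x with `tf.keras` is installed; the name `seq_model` is only illustrative, and the code builds the model without training it:

```python
from tensorflow.keras import Input, Sequential, layers

height, width, n_seq = 100, 180, 5

seq_model = Sequential([
    Input(shape=(n_seq, height, width, 3)),
    layers.ConvLSTM2D(filters=16, kernel_size=(3, 3), padding="same", return_sequences=True),
    layers.BatchNormalization(momentum=0.6),
    layers.ConvLSTM2D(filters=16, kernel_size=(3, 3), padding="same", return_sequences=True),
    layers.BatchNormalization(momentum=0.8),
    # Only the last ConvLSTM layer drops the time axis
    layers.ConvLSTM2D(filters=3, kernel_size=(3, 3), padding="same", return_sequences=False),
    layers.Activation("tanh"),
])

print(seq_model.output_shape)
print(seq_model.count_params())
```

The functional API only becomes necessary once the graph branches, for example with skip connections.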
The model summary is as follows.
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 5, 100, 180, 3)]  0
_________________________________________________________________
conv_lst_m2d (ConvLSTM2D)    (None, 5, 100, 180, 16)   11008
_________________________________________________________________
batch_normalization (BatchNo (None, 5, 100, 180, 16)   64
_________________________________________________________________
conv_lst_m2d_1 (ConvLSTM2D)  (None, 5, 100, 180, 16)   18496
_________________________________________________________________
batch_normalization_1 (Batch (None, 5, 100, 180, 16)   64
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D)  (None, 100, 180, 3)       2064
_________________________________________________________________
activation (Activation)      (None, 100, 180, 3)       0
=================================================================
Total params: 31,696
Trainable params: 31,632
Non-trainable params: 64
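The parameter counts in the summary can be reproduced by hand. The sketch below assumes the usual ConvLSTM layout (four gates, each convolving the concatenation of the input and the hidden state, with a bias and no peephole connections) and four parameters per channel for batch normalization; the function names are hypothetical helpers, not part of Keras:

```python
def convlstm2d_params(in_channels, filters, kernel_size):
    """Weights of a ConvLSTM2D layer with bias: 4 gates over [input, hidden state]."""
    kh, kw = kernel_size
    return 4 * filters * (kh * kw * (in_channels + filters) + 1)

def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

counts = [
    convlstm2d_params(3, 16, (3, 3)),   # first ConvLSTM2D
    batchnorm_params(16),
    convlstm2d_params(16, 16, (3, 3)),  # second ConvLSTM2D
    batchnorm_params(16),
    convlstm2d_params(16, 3, (3, 3)),   # last ConvLSTM2D
]
non_trainable = 2 * 16 + 2 * 16         # moving statistics of the two BatchNormalization layers

print(counts)         # [11008, 64, 18496, 64, 2064]
print(sum(counts))    # 31696
print(non_trainable)  # 64
```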
Let's train it. On Colab, a larger batch size exceeds the available memory, so I kept it small. (I want to buy a very high-performance machine and be able to run this locally ...)
from tensorflow.keras.callbacks import EarlyStopping

model.compile(optimizer='rmsprop',
              loss='mae', metrics=['mse'])
# Stop when the validation loss has not improved for 5 epochs
call_backs=[EarlyStopping(monitor="val_loss",patience=5)]
model.fit(x_train, y_train, batch_size=16, epochs=100, verbose=2, validation_split=0.2, shuffle=True, callbacks=call_backs)
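To get a feeling for why the batch size is limited by memory, the sizes of individual tensors can be estimated with simple arithmetic. This is only a rough sketch: it assumes Keras computes in float32 (its default), and real usage is considerably higher because gate activations, cell states and gradients are also kept. The helper `tensor_megabytes` is a hypothetical name:

```python
import numpy as np

def tensor_megabytes(shape, dtype):
    """Size in MB (10**6 bytes) of a dense array with the given shape and dtype."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize / 1e6

batch_size, n_seq, height, width = 16, 5, 100, 180

# One input batch as stored in x (np.zeros defaults to float64)
input_f64 = tensor_megabytes((batch_size, n_seq, height, width, 3), np.float64)
# Output of one 16-filter ConvLSTM2D layer with return_sequences=True, in float32
hidden_f32 = tensor_megabytes((batch_size, n_seq, height, width, 16), np.float32)

print(f"input batch (float64): {input_f64:.2f} MB")
print(f"one hidden output (float32): {hidden_f32:.2f} MB")
```

These sizes grow linearly with the batch size, so halving the batch size halves them.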
Looking at how the loss evolved during training, it does not feel very good ...
Let's plot the prediction results.
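# Plot the correct image (left) and the prediction (right)
%matplotlib inline
i=15
fig, axes = plt.subplots(1, 2, figsize=(12,6))
# Map values from [-1, 1] back to [0, 1] for imshow
axes[0].imshow((y_test[i]+1)/2)
axes[1].imshow((model.predict(x_test[[i]]).reshape(height,width,3)+1)/2)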
The correct image (left) and the predicted image (right) are displayed side by side.
[Figures: results for i=0 and i=20]
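The `(y+1)/2` rescaling used when plotting undoes the `(x-128)/128` normalization up to a constant factor. A minimal sketch of both directions, with hypothetical helper names (`normalize`, `to_display`, `to_uint8`) that do not appear in the original code:

```python
import numpy as np

def normalize(img_uint8):
    """Same scaling as in the article: [0, 255] -> [-1, 1)."""
    return (img_uint8.astype(np.float64) - 128) / 128

def to_display(img_norm):
    """Rescaling used for imshow: [-1, 1] -> [0, 1], clipped for safety."""
    return np.clip((img_norm + 1) / 2, 0.0, 1.0)

def to_uint8(img_norm):
    """Inverse of normalize, e.g. for saving a prediction as an image file."""
    return np.clip(np.rint(img_norm * 128 + 128), 0, 255).astype(np.uint8)

# A 16x16 RGB test pattern containing every pixel value from 0 to 255
pixels = np.arange(256, dtype=np.uint8).reshape(16, 16, 1).repeat(3, axis=2)
norm = normalize(pixels)

print(norm.min(), norm.max())
print(np.array_equal(to_uint8(norm), pixels))
```

The clipping in `to_uint8` matters for predictions: a tanh output of exactly 1 would otherwise map to 256, which does not fit in an 8-bit pixel.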
Looking at these results, the predictions turned out very blurry. This is probably because, on average, a blurry image scores better than a sharp image that is slightly wrong. It might be improved by changing the loss function, or by predicting an image only a few hours ahead, which is likely to allow more accurate prediction.
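The effect can be illustrated with a toy example. This is a sketch under strong assumptions (a one-dimensional "image" and two equally likely futures), not an analysis of the trained model. It compares a sharp guess with a blurry average under MSE, which is tracked as a metric here, and under MAE, which is the training loss:

```python
import numpy as np

# Two equally likely "futures": a cloud (value 1) at pixels 2-3 or at pixels 4-5
future_a = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
future_b = np.array([0, 0, 0, 0, 1, 1, 0, 0], dtype=float)

sharp_guess = future_a                    # commits to one position
blurry_guess = (future_a + future_b) / 2  # pixel-wise average of both futures

def mse(pred, target):
    return np.mean((pred - target) ** 2)

def mae(pred, target):
    return np.mean(np.abs(pred - target))

def expected_loss(pred, loss):
    """Average loss over the two equally likely futures."""
    return (loss(pred, future_a) + loss(pred, future_b)) / 2

for name, pred in [("sharp", sharp_guess), ("blurry", blurry_guess)]:
    print(name, expected_loss(pred, mse), expected_loss(pred, mae))
# sharp 0.25 0.25
# blurry 0.125 0.25
```

In this toy case the blurry guess halves the expected MSE, and under MAE it ties with the sharp guess, so neither loss rewards committing to a sharp prediction.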
A competition on predicting cloud images, similar to this article, has been held, and the many efforts made there to improve accuracy should be helpful. Sample code is available in its forum; it is implemented in Chainer, but I think it is still a useful reference. - Weather Challenge: Cloud Image Prediction