Keras is a deep learning wrapper library that runs on top of Theano and TensorFlow. Theano and TensorFlow have made it much easier to get into deep learning, but writing algorithms directly in them is still hard. Keras is a library that lets you describe a network structure quite simply on top of them. For an overview of Keras itself, aidiary's article (http://aidiary.hatenablog.com/entry/20160328/1459174455) was very helpful.
As basic Keras samples, MNIST classification comes up a lot, but I couldn't find many simple samples using RNNs (Keras officially provides a sample of movie review sentiment classification using an RNN, but it was too complicated as a first step). So this time I will walk through Keras's RNN implementation with a simple example: training an LSTM on a sine wave and having it predict the wave. I have written about an RNN implementation using TensorFlow in an earlier article, so if you are interested, please see that as well.
This time I predict a sine wave with an emphasis on comprehensibility, but since I also wanted to handle more complicated time series data, I later wrote an article called [Python] Predicting chaotic time series data with QRNN [Keras] (https://qiita.com/yukiB/items/681f68690ffabbf3e1e1).
Keras can use both Theano and TensorFlow as its backend, and a program written in Keras can switch backends at any time without modification (there seem to be some caveats: https://qiita.com/nzw0301/items/2823243090b997aa00e5). This time I will use TensorFlow as the backend.
It is assumed that TensorFlow is installed in advance.
You can install Keras normally with pip:
pip install keras
or clone the source with git and run:
python setup.py install
Either is fine.
When using TensorFlow as the backend, rewrite the configuration file ~/.keras/keras.json as follows (see the Keras documentation; keras.json is generated the first time Keras is started, e.g. by importing it).
# before
{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "theano"
}
# after
{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
The default backend is Theano, but after the rewrite above, running
import keras
should display
Using TensorFlow backend.
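As a quick check from code, the following sketch should also report the active backend (assuming a Keras version that provides keras.backend.backend()):
import keras.backend as K

# Name of the currently active backend: "tensorflow" or "theano"
print(K.backend())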
First, create the data. For data creation, I referred to yuyakato's article "I made an RNN learn a sine wave and predicted it" (http://qiita.com/yuyakato/items/ab38064ca215e8750865).
import pandas as pd
import numpy as np
import math
import random
%matplotlib inline
random.seed(0)
# Random number coefficient
random_factor = 0.05
# Number of steps per cycle
steps_per_cycle = 80
# Number of cycles to generate
number_of_cycles = 50
df = pd.DataFrame(np.arange(steps_per_cycle * number_of_cycles + 1), columns=["t"])
df["sin_t"] = df.t.apply(lambda x: math.sin(x * (2 * math.pi / steps_per_cycle)+ random.uniform(-1.0, +1.0) * random_factor))
df[["sin_t"]].head(steps_per_cycle * 2).plot()
This creates a sine wave with noise, as shown below.
Next, split this into training data and test data, and create a dataset so that for an input X of 100 steps, the output y is the value at the 101st step.
def _load_data(data, n_prev = 100):
"""
data should be pd.DataFrame()
"""
docX, docY = [], []
for i in range(len(data)-n_prev):
docX.append(data.iloc[i:i+n_prev].as_matrix())
docY.append(data.iloc[i+n_prev].as_matrix())
alsX = np.array(docX)
alsY = np.array(docY)
return alsX, alsY
def train_test_split(df, test_size=0.1, n_prev = 100):
"""
This just splits data to training and testing parts
"""
ntrn = round(len(df) * (1 - test_size))
ntrn = int(ntrn)
X_train, y_train = _load_data(df.iloc[0:ntrn], n_prev)
X_test, y_test = _load_data(df.iloc[ntrn:], n_prev)
return (X_train, y_train), (X_test, y_test)
length_of_sequences = 100
(X_train, y_train), (X_test, y_test) = train_test_split(df[["sin_t"]], n_prev =length_of_sequences)
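At this point it is worth sanity-checking the shapes. With the settings above (steps_per_cycle * number_of_cycles + 1 = 4001 rows, test_size=0.1, windows of 100 steps), the expected shapes are as in the comments below:
# Each sample is a window of 100 consecutive sine values with 1 feature each
print(X_train.shape, y_train.shape)  # (3501, 100, 1) (3501, 1)
print(X_test.shape, y_test.shape)    # (300, 100, 1) (300, 1)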
Now that the dataset is complete, it's time to write the network configuration using Keras.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import LSTM
in_out_neurons = 1
hidden_neurons = 300
model = Sequential()
model.add(LSTM(hidden_neurons, batch_input_shape=(None, length_of_sequences, in_out_neurons), return_sequences=False))
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05)
That's all. As shown above, you build the structure of the neural network by adding layers to `model`. In the example above, input tensors of shape (None, 100, 1) are fed into an LSTM hidden layer with 300 units, aggregated into a single output layer, and passed through a linear activation function.
By the way, the input to an LSTM is a 3D tensor of shape (batch_size, input_length, in_data_length); with return_sequences=False, as here, the output is a 2D tensor of shape (batch_size, output_dim).
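You can confirm these shapes on the model itself; a minimal sketch using standard Keras model attributes:
# Per-layer summary with output shapes and parameter counts
model.summary()

# Final output shape; the leading None is the unspecified batch size
print(model.output_shape)  # (None, 1)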
When compiling the model, you specify the error function (mean squared error in this example) and the optimization algorithm (RMSprop in this example). Of course cross entropy can also be used as the error function, and a full range of optimization algorithms is available, from basic SGD to Adam and RMSprop.
Training is done with fit(), where you can specify the batch size, the number of epochs, the validation data, and what fraction of the training data to hold out for validation (validation_split).
Also, by specifying a convergence-test callback as below, the loop can be stopped automatically when the validation loss stops improving:
# early stopping
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05, callbacks=[early_stopping])
Train on 3325 samples, validate on 176 samples
Epoch 1/15
3325/3325 [==============================] - 17s - loss: 0.0051 - val_loss: 0.0048
Epoch 2/15
1200/3325 [=========>....................] - ETA: 10s - loss: 0.0041
When training starts, a progress bar shows the estimated training time, the time taken for each epoch, and the loss/accuracy on the training and validation data, as shown above. Convenient!
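Incidentally, fit() returns a History object whose history attribute holds the per-epoch losses, so the learning curve can be plotted afterwards. A minimal sketch (the val_loss key exists here because validation_split is specified):
history = model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05)

# Per-epoch training and validation loss recorded during fit()
pd.DataFrame({"loss": history.history["loss"], "val_loss": history.history["val_loss"]}).plot()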
Prediction on the test data is done with predict():
predicted = model.predict(X_test)
In this example, plotting the first 200 steps of the prediction against the ground truth:
dataf = pd.DataFrame(predicted[:200])
dataf.columns = ["predict"]
dataf["input"] = y_test[:200]
dataf.plot(figsize=(15, 5))
The prediction result is as follows.
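If you want a number rather than an eyeball judgment, one simple option is the root mean squared error on the test set (a minimal sketch; this metric is my addition, not part of the original article):
# RMSE between the prediction and the ground truth on the test set
rmse = np.sqrt(((predicted - y_test) ** 2).mean())
print(rmse)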
Keras also supports model visualization, and you can easily visualize a model using pydot and graphviz. Since I was using Jupyter this time, I used IPython.display.SVG:
from IPython.display import SVG
from keras.utils.visualize_util import model_to_dot, plot
SVG(model_to_dot(model, show_shapes=True).create(prog='dot', format='svg'))
By writing this, the following model diagram is generated (you need to install pydot with pip, and graphviz with Homebrew, etc.).
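If you are not working in Jupyter, the same module can also write the diagram straight to an image file; a minimal sketch (plot() is part of keras.utils.visualize_util in the Keras 1 series used here):
from keras.utils.visualize_util import plot

# Write the model diagram to a PNG, annotated with layer output shapes
plot(model, to_file='model.png', show_shapes=True)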
As you can see, Keras allows you to write modeling code very concisely. I will continue to try more complicated models using Keras.
- Keras official (http://keras.io/)
- A record of artificial intelligence (http://aidiary.hatenablog.com/entry/20160328/1459174455)
- I made an RNN learn a sine wave and predicted it (http://qiita.com/yuyakato/items/ab38064ca215e8750865)
- Let Keras visualize the model!! (http://ket-30.hatenablog.com/entry/keras/graph)