Keras is a deep learning wrapper library that runs on top of Theano and TensorFlow. Theano and TensorFlow have made it much easier to get into deep learning, but writing algorithms directly in them is still hard. Keras is a library that lets you describe a network structure quite simply on top of them. For an overview of Keras itself, aidiary's article (http://aidiary.hatenablog.com/entry/20160328/1459174455) was very helpful.
As basic Keras samples, MNIST classification comes up a lot, but I couldn't find many simple samples using RNNs (Keras officially provides a sample of movie review sentiment classification using an RNN, but it was too complicated as a first step). So this time I will walk through Keras's RNN implementation with a simple example: training an LSTM on a sine wave and having it predict the wave. I have written about an RNN implementation using TensorFlow in an earlier article, so if you are interested, please see that as well.
This time I predict a sine wave with an emphasis on comprehensibility, but since I also wanted to handle more complicated time series data, I later wrote an article called [Python] Predicting chaotic time series data with QRNN [Keras] (https://qiita.com/yukiB/items/681f68690ffabbf3e1e1).
Keras can use both Theano and TensorFlow as its backend, and a program written in Keras can switch backends at any time without modification (there seem to be some caveats: https://qiita.com/nzw0301/items/2823243090b997aa00e5). This time I will use TensorFlow as the backend.
It is assumed that TensorFlow is installed in advance.
You can install Keras normally with pip:
pip install keras
or clone the source with git and run:
python setup.py install
Either is fine.
When using TensorFlow as the backend, rewrite the configuration file ~/.keras/keras.json as follows (see the Keras documentation; keras.json is generated the first time Keras is started, e.g. by importing it).
# before
{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "theano"
}
# after
{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
The default backend is Theano, but after the rewrite above, running
import keras
should display
Using TensorFlow backend.
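As a quick check from code, the following sketch should also report the active backend (assuming a Keras version that provides keras.backend.backend()):
import keras.backend as K

# Name of the currently active backend: "tensorflow" or "theano"
print(K.backend())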
First, create the data. For data creation, I referred to yuyakato's article "I made an RNN learn a sine wave and predicted it" (http://qiita.com/yuyakato/items/ab38064ca215e8750865).
import pandas as pd
import numpy as np
import math
import random
%matplotlib inline
random.seed(0)
# Random number coefficient
random_factor = 0.05
# Number of steps per cycle
steps_per_cycle = 80
# Number of cycles to generate
number_of_cycles = 50
df = pd.DataFrame(np.arange(steps_per_cycle * number_of_cycles + 1), columns=["t"])
df["sin_t"] = df.t.apply(lambda x: math.sin(x * (2 * math.pi / steps_per_cycle)+ random.uniform(-1.0, +1.0) * random_factor))
df[["sin_t"]].head(steps_per_cycle * 2).plot()
This creates a sine wave with noise, as shown below.
Next, split this into training data and test data, and create a dataset so that for an input X of 100 steps, the output y is the value at the 101st step.
def _load_data(data, n_prev = 100):
"""
data should be pd.DataFrame()
"""
docX, docY = [], []
for i in range(len(data)-n_prev):
docX.append(data.iloc[i:i+n_prev].as_matrix())
docY.append(data.iloc[i+n_prev].as_matrix())
alsX = np.array(docX)
alsY = np.array(docY)
return alsX, alsY
def train_test_split(df, test_size=0.1, n_prev = 100):
"""
This just splits data to training and testing parts
"""
ntrn = round(len(df) * (1 - test_size))
ntrn = int(ntrn)
X_train, y_train = _load_data(df.iloc[0:ntrn], n_prev)
X_test, y_test = _load_data(df.iloc[ntrn:], n_prev)
return (X_train, y_train), (X_test, y_test)
length_of_sequences = 100
(X_train, y_train), (X_test, y_test) = train_test_split(df[["sin_t"]], n_prev =length_of_sequences)
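At this point it is worth sanity-checking the shapes. With the settings above (steps_per_cycle * number_of_cycles + 1 = 4001 rows, test_size=0.1, windows of 100 steps), the expected shapes are as in the comments below:
# Each sample is a window of 100 consecutive sine values with 1 feature each
print(X_train.shape, y_train.shape)  # (3501, 100, 1) (3501, 1)
print(X_test.shape, y_test.shape)    # (300, 100, 1) (300, 1)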
Now that the dataset is complete, it's time to write the network configuration using Keras.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import LSTM
in_out_neurons = 1
hidden_neurons = 300
model = Sequential()
model.add(LSTM(hidden_neurons, batch_input_shape=(None, length_of_sequences, in_out_neurons), return_sequences=False))
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05)
That's all. As shown above, you build the structure of the neural network by adding layers to `model`. In the example above, input tensors of shape (None, 100, 1) are fed into an LSTM hidden layer with 300 units, aggregated into a single output layer, and passed through a linear activation function.
By the way, the input to an LSTM is a 3D tensor of shape (batch_size, input_length, in_data_length); with return_sequences=False, as here, the output is a 2D tensor of shape (batch_size, output_dim).
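You can confirm these shapes on the model itself; a minimal sketch using standard Keras model attributes:
# Per-layer summary with output shapes and parameter counts
model.summary()

# Final output shape; the leading None is the unspecified batch size
print(model.output_shape)  # (None, 1)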
When compiling the model, you specify the error function (mean squared error in this example) and the optimization algorithm (RMSprop in this example). Of course cross entropy can also be used as the error function, and a full range of optimization algorithms is available, from basic SGD to Adam and RMSprop.
Training is done with fit(), where you can specify the batch size, the number of epochs, the validation data, and what fraction of the training data to hold out for validation (validation_split).
Also, by specifying a convergence-test callback as below, the loop can be stopped automatically when the validation loss stops improving:
# early stopping
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05, callbacks=[early_stopping])
Train on 3325 samples, validate on 176 samples
Epoch 1/15
3325/3325 [==============================] - 17s - loss: 0.0051 - val_loss: 0.0048
Epoch 2/15
1200/3325 [=========>....................] - ETA: 10s - loss: 0.0041
When training starts, a progress bar shows the estimated training time, the time taken for each epoch, and the loss/accuracy on the training and validation data, as shown above. Convenient!
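Incidentally, fit() returns a History object whose history attribute holds the per-epoch losses, so the learning curve can be plotted afterwards. A minimal sketch (the val_loss key exists here because validation_split is specified):
history = model.fit(X_train, y_train, batch_size=600, nb_epoch=15, validation_split=0.05)

# Per-epoch training and validation loss recorded during fit()
pd.DataFrame({"loss": history.history["loss"], "val_loss": history.history["val_loss"]}).plot()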
Prediction on the test data is done with predict():
predicted = model.predict(X_test)
In this example, plotting the first 200 steps of the prediction against the ground truth:
dataf = pd.DataFrame(predicted[:200])
dataf.columns = ["predict"]
dataf["input"] = y_test[:200]
dataf.plot(figsize=(15, 5))
The prediction result is as follows.
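If you want a number rather than an eyeball judgment, one simple option is the root mean squared error on the test set (a minimal sketch; this metric is my addition, not part of the original article):
# RMSE between the prediction and the ground truth on the test set
rmse = np.sqrt(((predicted - y_test) ** 2).mean())
print(rmse)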
Keras also supports model visualization, and you can easily visualize a model using pydot and graphviz. Since I was using Jupyter this time, I used IPython.display.SVG:
from IPython.display import SVG
from keras.utils.visualize_util import model_to_dot, plot
SVG(model_to_dot(model, show_shapes=True).create(prog='dot', format='svg'))
By writing this, the following model diagram is generated (you need to install pydot with pip, and graphviz with Homebrew, etc.).
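If you are not working in Jupyter, the same module can also write the diagram straight to an image file; a minimal sketch (plot() is part of keras.utils.visualize_util in the Keras 1 series used here):
from keras.utils.visualize_util import plot

# Write the model diagram to a PNG, annotated with layer output shapes
plot(model, to_file='model.png', show_shapes=True)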
As you can see, Keras allows you to write modeling code very concisely. I will continue to try more complicated models using Keras.
- Keras official (http://keras.io/)
- A record of artificial intelligence (http://aidiary.hatenablog.com/entry/20160328/1459174455)
- I made an RNN learn a sine wave and predicted it (http://qiita.com/yuyakato/items/ab38064ca215e8750865)
- Let Keras visualize the model!! (http://ket-30.hatenablog.com/entry/keras/graph)