[PYTHON] Stock Price Forecast with TensorFlow (LSTM) ~ Stock Forecast Part 1 ~


I am starting to study stock investing, so I am keeping notes on it here.

The goal: predict stock prices using machine learning and deep learning.

Before starting to study, I first looked at the following books.


Why stocks?

I chose stocks for the following reasons.

◆ Gambling such as horse racing
The outcome is all or nothing: the return can be large, but so is the risk.
◆ FX
For every person who makes money there is someone who loses it, so it does not suit my temperament.
◆ Bitcoin
Its value has not been established yet, so there is a possibility of a crash.
◆ Stocks
With stocks, it is in principle possible for everyone to make a profit.

First, a stock prediction experiment

Before collecting data by scraping, let's start the experiment with data from a site that offers it for download.

The downloaded data covers the Nikkei 225 for 2007-2017. It contains the date and the open, high, low and close prices.

This experiment uses only the closing price.
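
As a rough sketch of this step, the snippet below reads a CSV with the columns described above and keeps only the closing price. The rows in SAMPLE_CSV are made up for illustration (they are not real Nikkei 225 values), and load_close_prices is a hypothetical helper name that does not appear in the program in this note.

```python
import io

import pandas

# Made-up rows for illustration only (not real Nikkei 225 values).
SAMPLE_CSV = """date,open,high,low,close
2007-01-04,100.0,105.0,99.0,104.0
2007-01-05,104.0,106.0,101.0,102.0
2007-01-09,102.0,103.0,98.0,99.0
"""


def load_close_prices(csv_source):
    """Read a price CSV and return the closing prices indexed by date."""
    frame = pandas.read_csv(csv_source, parse_dates=["date"])
    # Keep the rows in chronological order before using them as a time series.
    frame = frame.sort_values(by="date").reset_index(drop=True)
    return frame.set_index("date")["close"]


close = load_close_prices(io.StringIO(SAMPLE_CSV))
print(close)
```

Passing a file path instead of the io.StringIO object should work the same way, assuming the file has a header row with these column names.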

Thinking about the approach

For stocks, predicting with an RNN (Recurrent Neural Network) (*2), which takes the time series as input, is considered better than statistically analyzing past performance (*1). So let's try LSTM (Long Short-Term Memory), an extension of the RNN.

Figure: Schematic diagram of an RNN (source)
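
To make the recurrence concrete, here is a minimal NumPy sketch of a plain (vanilla) RNN, not an LSTM: the hidden state at each time step is computed from the current input and the previous hidden state. An LSTM adds gates on top of this idea. The function name rnn_forward, the weight names, the sizes and the random weights are all assumptions chosen for illustration.

```python
import numpy


def rnn_forward(inputs, w_in, w_rec, bias):
    """Run a vanilla RNN over `inputs` with shape (timesteps, features).

    Returns the hidden state after the last time step.
    """
    hidden = numpy.zeros(w_rec.shape[0])
    for x_t in inputs:
        # The new state depends on the current input and the previous state.
        hidden = numpy.tanh(w_in @ x_t + w_rec @ hidden + bias)
    return hidden


rng = numpy.random.default_rng(0)
hidden_size, features = 4, 1  # illustrative sizes only
w_in = rng.normal(size=(hidden_size, features))
w_rec = rng.normal(size=(hidden_size, hidden_size))
bias = numpy.zeros(hidden_size)

sequence = numpy.array([[0.1], [0.2], [0.3]])  # three made-up time steps
state = rnn_forward(sequence, w_in, w_rec, bias)
print(state.shape)
```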

Try it with a program anyway

I prioritized getting something running, so I quickly wrote a program using the Nikkei average.

# -*- coding: utf-8 -*-
import numpy
import pandas
import matplotlib.pyplot as plt

from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM

class Prediction :

  def __init__(self):
    self.length_of_sequences = 10
    self.in_out_neurons = 1
    self.hidden_neurons = 300

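  def load_data(self, data, n_prev=10):
    # Build sliding windows: each X holds n_prev consecutive values,
    # and the matching Y is the value that follows the window.
    X, Y = [], []
    for i in range(len(data) - n_prev):
      X.append(data.iloc[i:(i + n_prev)].values)
      Y.append(data.iloc[i + n_prev].values)
    retX = numpy.array(X)
    retY = numpy.array(Y)
    return retX, retY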

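  def create_model(self) :
    model = Sequential()
    # input_shape omits the batch dimension: (time steps, features per step)
    model.add(LSTM(self.hidden_neurons,
              input_shape=(self.length_of_sequences, self.in_out_neurons),
              return_sequences=False))
    # Linear output layer that produces the predicted next value
    model.add(Dense(self.in_out_neurons))
    model.add(Activation("linear"))
    model.compile(loss="mape", optimizer="adam")
    return model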

  def train(self, X_train, y_train) :
    model = self.create_model()
    model.fit(X_train, y_train, batch_size=10, epochs=100)
    return model

if __name__ == "__main__":

  prediction = Prediction()

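  # Data preparation: read one CSV per year and concatenate them.
  # Note: range(2007, 2017) reads the files for 2007 through 2016.
  data = None
  for year in range(2007, 2017):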
    data_ = pandas.read_csv('csv/indices_I101_1d_' + str(year) +  '.csv')
    data = data_ if (data is None) else pandas.concat([data, data_])
  data.columns = ['date', 'open', 'high', 'low', 'close']
  data['date'] = pandas.to_datetime(data['date'], format='%Y-%m-%d')
  # Standardize the closing prices to zero mean and unit variance
  data['close'] = preprocessing.scale(data['close'])
  data = data.sort_values(by='date')
  data = data.reset_index(drop=True)
  data = data.loc[:, ['date', 'close']]

  # Use the first 80% for training and the last 20% as test data
  split_pos = int(len(data) * 0.8)
  x_train, y_train = prediction.load_data(data[['close']].iloc[0:split_pos], prediction.length_of_sequences)
  x_test,  y_test  = prediction.load_data(data[['close']].iloc[split_pos:], prediction.length_of_sequences)

  model = prediction.train(x_train, y_train)

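  # Predict on the test windows and plot the result next to the actual values
  predicted = model.predict(x_test)
  result = pandas.DataFrame(predicted)
  result.columns = ['predict']
  result['actual'] = y_test
  result.plot()
  plt.show()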
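
To see what shape of data the LSTM is trained on, here is a plain-NumPy sketch of the sliding-window idea behind load_data, run on a toy series. The name make_windows and the toy values are assumptions for illustration; the sketch is a stand-in for the method above, not a copy of it.

```python
import numpy


def make_windows(series, n_prev):
    """Split a 1-D series into (window, next value) pairs.

    Returns X with shape (samples, n_prev, 1) and Y with shape (samples, 1),
    which is the (samples, time steps, features) layout an LSTM layer expects.
    """
    values = numpy.asarray(series, dtype=float)
    X, Y = [], []
    for i in range(len(values) - n_prev):
        X.append(values[i:i + n_prev])  # n_prev consecutive values
        Y.append(values[i + n_prev])    # the value right after the window
    X = numpy.array(X).reshape(-1, n_prev, 1)
    Y = numpy.array(Y).reshape(-1, 1)
    return X, Y


toy_series = numpy.arange(6.0)  # 0, 1, 2, 3, 4, 5
X, Y = make_windows(toy_series, n_prev=3)
print(X.shape, Y.shape)
print(X[0].ravel(), Y[0])
```

With a window of 3 on six values there are three samples: the first window holds 0, 1, 2 and its target is 3.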


It predicts fairly well ... (screenshot of the result)

It looks like good predictions could be made by reframing the task as UP / DOWN.
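
One way to read "UP / DOWN" is to ask only whether the next close is above the last close in the input window, and then count how often the predicted direction matches the actual one. The sketch below follows that reading, which is my assumption; direction_accuracy is a hypothetical helper and the numbers are made up.

```python
import numpy


def direction_accuracy(last_known, predicted, actual):
    """Fraction of samples where the predicted and actual moves agree.

    A move counts as UP when the value is above the last known close,
    and as DOWN otherwise.
    """
    last_known = numpy.asarray(last_known, dtype=float)
    predicted_up = numpy.asarray(predicted, dtype=float) > last_known
    actual_up = numpy.asarray(actual, dtype=float) > last_known
    return float(numpy.mean(predicted_up == actual_up))


# Made-up values for illustration only.
last_known = [100.0, 102.0, 101.0, 103.0]
predicted = [101.0, 101.5, 102.0, 102.0]
actual = [102.0, 101.0, 100.0, 104.0]

accuracy = direction_accuracy(last_known, predicted, actual)
print(accuracy)
```

For the program above, last_known would correspond to the final value of each test window, for example x_test[:, -1, 0], again under the assumption that the windows are built as in load_data.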

In closing

