[PYTHON] Stock price forecast using deep learning (TensorFlow)

For some reason I don't see many people doing this, so I will try to predict stock prices using deep learning. I'm new to both deep learning and Python, and I know little about the libraries, implementation methods, or theory. Corrections and comments (tsukkomi) are welcome.

Target

- Using several days of stock price data, predict whether the Nikkei Stock Average will rise, fall, or stay unchanged the next day (classification).

Overview

- The next day's closing price decides whether the day counts as "up", "down", or "unchanged".
- The input data are the open, high, low, and close prices from a few days ago up to the previous day.
- The network has four hidden layers.
- The input is simply the raw stock prices for the past few days, fed in for training.

Environment

- TensorFlow 0.7
- Ubuntu 14.04
- Python 2.7
- AWS EC2 micro instance

Contents

Preparation

Prepare as much Nikkei 225 data as possible. This time, I used the data of Yahoo Finance.

Implementation

Save the Nikkei 225 data to a text file (fetching it every time is a hassle). This time, I look at 10 days of data and predict the movement for the next day. If the next day's closing price is more than 0.5% above the previous day's close, the label is "up"; if it is more than 0.5% below, the label is "down"; otherwise it is "unchanged". I chose 0.5% because it splits the three classes almost evenly, at about 33% each.

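# Index 3 is presumably the closing price; flg_range is the 0.5% threshold.
# The comparison implies array_base[idx+1] is the day before array_base[idx].
if array_base[idx][3] > (array_base[idx+1][3] * (1.0+flg_range)):
  y_flg_array.append([1., 0., 0.])  # up
  up += 1
elif array_base[idx][3] < (array_base[idx+1][3] * (1.0-flg_range)):
  y_flg_array.append([0., 0., 1.])  # down
  down += 1
else:
  y_flg_array.append([0., 1., 0.])  # unchanged
  keep += 1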

The class ratios over the whole data set are as follows.

- Up: 33.9%
- Down: 32.7%
- Unchanged: 33.4%
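As a self-contained illustration of the labelling rule and the ratio check, here is a minimal sketch. The function names `label_days` and `class_ratios`, the oldest-first ordering of `closes`, and the sample prices are all assumptions made for this example; they are not taken from the original script.

```python
def label_days(closes, flg_range=0.005):
    """Label each day against the previous day's close.

    `closes` is assumed to be ordered oldest-first (a layout chosen here for
    readability; the article's array_base appears to be newest-first).
    Returns one-hot labels [up, unchanged, down], one per day after the first.
    """
    labels = []
    for prev, cur in zip(closes, closes[1:]):
        if cur > prev * (1.0 + flg_range):
            labels.append([1.0, 0.0, 0.0])   # up
        elif cur < prev * (1.0 - flg_range):
            labels.append([0.0, 0.0, 1.0])   # down
        else:
            labels.append([0.0, 1.0, 0.0])   # unchanged
    return labels


def class_ratios(labels):
    """Return the fraction of (up, unchanged, down) labels."""
    n = float(len(labels))
    return tuple(sum(row[k] for row in labels) / n for k in range(3))


closes = [100.0, 101.0, 101.2, 100.0, 100.1]  # made-up prices
labels = label_days(closes)
print(class_ratios(labels))
```

Running the same ratio check for several candidate thresholds is one way to arrive at a value such as 0.5% that balances the three classes.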

Graph creation

The code looks like this. It's a bit of copy-and-paste spaghetti, but at least it's easy to read. (That's my excuse.)

def inference(x_ph, keep_prob):

  with tf.name_scope('hidden1'):
    weights = tf.Variable(tf.truncated_normal([data_num * 4, NUM_HIDDEN1]), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN1]), name='biases')
    hidden1 = tf.nn.sigmoid(tf.matmul(x_ph, weights) + biases)
  
  with tf.name_scope('hidden2'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN1, NUM_HIDDEN2]), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN2]), name='biases')
    hidden2 = tf.nn.sigmoid(tf.matmul(hidden1, weights) + biases)
  
  with tf.name_scope('hidden3'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN2, NUM_HIDDEN3]), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN3]), name='biases')
    hidden3 = tf.nn.sigmoid(tf.matmul(hidden2, weights) + biases)
  
  with tf.name_scope('hidden4'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN3, NUM_HIDDEN4]), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN4]), name='biases')
    hidden4 = tf.nn.sigmoid(tf.matmul(hidden3, weights) + biases)
  
  # Dropout, applied only to the output of the last hidden layer
  dropout = tf.nn.dropout(hidden4, keep_prob)
  
  with tf.name_scope('softmax'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN4, 3]), name='weights')
    biases = tf.Variable(tf.zeros([3]), name='biases')
    y = tf.nn.softmax(tf.matmul(dropout, weights) + biases)
  
  return y
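To make clear what this graph computes for a single example, here is a minimal pure-Python sketch of the same forward pass: four sigmoid layers followed by a softmax. It is an illustration only, not the TensorFlow code: the helper names are hypothetical, the weights are drawn from a plain (not truncated) Gaussian, the input is random numbers rather than prices, and dropout is left out, which corresponds to keep_prob = 1.0.

```python
import math
import random


def dense(inputs, weights, biases):
    """Compute inputs . weights + biases for one example (lists of floats)."""
    return [sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
            for j in range(len(biases))]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (plain Gaussian, not truncated)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Sigmoid hidden layers followed by a softmax output layer."""
    h = x
    for weights, biases in layers[:-1]:
        h = sigmoid(dense(h, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(h, weights, biases))


rng = random.Random(0)
sizes = [40, 100, 50, 30, 10, 3]  # 10 days x 4 prices -> hidden layers -> 3 classes
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probs = forward([rng.random() for _ in range(40)], layers)
print(probs)
```

Building the layers from a list of sizes is also one way to avoid the copy-and-paste repetition, if the layer count is going to be varied.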

x_ph is a placeholder that holds the stock price data up to the previous day, laid out as [1 day ago closing price, 1 day ago opening price, 1 day ago high price, 1 day ago low price, 2 days ago closing price, ...]. The hidden layers have 100, 50, 30, and 10 units, defined as follows.

# DEFINITION
NUM_HIDDEN1 = 100
NUM_HIDDEN2 = 50
NUM_HIDDEN3 = 30
NUM_HIDDEN4 = 10
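The input layout described for x_ph can be sketched as follows. This is a minimal example under assumptions: `make_input` is a hypothetical helper, and `rows` is assumed to be a list of (open, high, low, close) tuples ordered newest-first, which may not match how the original script stores its data.

```python
def make_input(rows, data_num):
    """Flatten the newest `data_num` days into one input vector.

    `rows` is assumed to be a list of (open, high, low, close) tuples ordered
    newest-first, so rows[0] is one day ago. Each day contributes
    [close, open, high, low], giving data_num * 4 values in total.
    """
    if len(rows) < data_num:
        raise ValueError("need at least data_num rows")
    vector = []
    for open_, high, low, close in rows[:data_num]:
        vector.extend([close, open_, high, low])
    return vector


rows = [(1.0, 2.0, 0.5, 1.5), (3.0, 4.0, 2.5, 3.5)]  # made-up prices
print(make_input(rows, 2))
```

With data_num = 10 this yields the 40 values that the first weight matrix, shaped [data_num * 4, NUM_HIDDEN1], expects.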

By the way, the number of units and the number of layers were chosen arbitrarily, on the vague assumption that more is probably better. I would appreciate it if you could tell me any guidelines for determining these numbers m(_ _)m
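One simple sanity check when picking layer sizes is to count the trainable parameters and compare that figure with the number of training examples available. The sketch below only does the arithmetic for the sizes used here; `count_parameters` is a hypothetical helper, and the comparison is a rough rule of thumb rather than a firm guideline.

```python
def count_parameters(sizes):
    """Weights plus biases for a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


data_num = 10  # days of history, as in this article
sizes = [data_num * 4, 100, 50, 30, 10, 3]
print(count_parameters(sizes))  # 11023
```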

Optimization

The optimization uses Adam. I don't know the details, but in my experience it is less prone to diverging than plain gradient descent (GradientDescentOptimizer).

def optimize(loss):
  optimizer = tf.train.AdamOptimizer(learning_rate)
  train_step = optimizer.minimize(loss)
  return train_step
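For readers who, like me, have not looked inside Adam, here is a minimal sketch of the textbook update rule applied to a one-variable toy problem. It is not TensorFlow's implementation; `adam_minimize` is a hypothetical helper, and the toy function and the learning rate of 0.1 are chosen only for the demonstration. The default values shown match the documented defaults of tf.train.AdamOptimizer.

```python
import math


def adam_minimize(grad, x0, learning_rate=0.001, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=1000):
    """Minimize a one-variable function with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return x


# Toy problem: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0, learning_rate=0.1)
print(x_min)
```

Because each step is divided by the running root-mean-square of the gradient, the step size stays on the order of the learning rate even when gradients are large, which may be part of why it diverges less easily.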

Training

The training looks like the following. The batch size is 100. If it is too large, the process crashes (out of memory; it would be fine if I weren't being stingy with the instance).

def training(sess, train_step, loss, x_train_array, y_flg_train_array):
  
  summary_op = tf.merge_all_summaries()
  init = tf.initialize_all_variables()
  sess.run(init)
  summary_writer = tf.train.SummaryWriter(LOG_DIR, graph_def=sess.graph_def)
  
  for i in range(int(len(x_train_array) / bach_size)):
    batch_xs = getBachArray(x_train_array, i * bach_size, bach_size)
    batch_ys = getBachArray(y_flg_train_array, i * bach_size, bach_size)
    sess.run(train_step, feed_dict={x_ph: batch_xs, y_ph: batch_ys, keep_prob: 0.8})

    summary_str = sess.run(summary_op, feed_dict={x_ph: batch_xs, y_ph: batch_ys, keep_prob: 1.0})
    summary_writer.add_summary(summary_str, i)
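The batching logic in this loop can be sketched on its own as follows. `get_batch` is a hypothetical stand-in for getBachArray, whose definition is not shown in this article; a contiguous slice is assumed.

```python
def get_batch(array, offset, batch_size):
    """Hypothetical stand-in for getBachArray: a contiguous slice."""
    return array[offset:offset + batch_size]


def iterate_batches(xs, ys, batch_size):
    """Yield (batch_xs, batch_ys) pairs; a trailing partial batch is dropped."""
    for i in range(len(xs) // batch_size):
        offset = i * batch_size
        yield get_batch(xs, offset, batch_size), get_batch(ys, offset, batch_size)


xs = list(range(250))        # made-up data: 250 examples
ys = [x % 3 for x in xs]
batches = list(iterate_batches(xs, ys, batch_size=100))
print(len(batches))  # 2 full batches; the last 50 examples are unused
```

Note that, as in the training function above, each batch is visited exactly once, so the model sees the training data for a single pass.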

Evaluation

After training, evaluate as follows. It's basically a copy from the tutorial.

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_ph, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(sess.run(accuracy, feed_dict={x_ph: x_test_array, y_ph: y_flg_test_array, keep_prob: 1.0}))
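What these two TensorFlow lines compute can be written out in plain Python. This is a minimal sketch with made-up predictions and labels; `argmax` and `accuracy` here are local helpers, not TensorFlow functions.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def accuracy(predictions, labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    hits = sum(1 for p, t in zip(predictions, labels) if argmax(p) == argmax(t))
    return hits / float(len(labels))


predictions = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.4, 0.35, 0.25]]
labels = [[1., 0., 0.], [0., 0., 1.], [0., 0., 1.], [0., 1., 0.]]
print(accuracy(predictions, labels))  # 0.5
```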

Result

Looking at TensorBoard, training appears to have converged, as shown below.

[Figure: nikkei.jpg]

However, looking at the percentage of correct answers,

0.3375

A crushing defeat. :scream: As mentioned above, each class (up, down, unchanged) makes up about 33% of the data, so the result is no better than guessing. (lol)
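To put 0.3375 next to the trivial baselines, here is a small sketch. It assumes the test set has the same class ratios as the whole data set, which the article does not confirm; the function names are hypothetical.

```python
def majority_baseline(class_ratios):
    """Accuracy of always predicting the most frequent class."""
    return max(class_ratios.values())


def uniform_guess_baseline(class_ratios):
    """Expected accuracy of picking a class uniformly at random."""
    return sum(class_ratios.values()) / len(class_ratios)


# Whole-data ratios reported earlier in this article.
ratios = {"up": 0.339, "unchanged": 0.334, "down": 0.327}
print(majority_baseline(ratios))
print(uniform_guess_baseline(ratios))
```

Under that assumption, always answering "up" would score about 0.339, so the trained network does not beat even the simplest strategy.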

Consideration

Possible causes of this failure are as follows.

- Is the model itself poorly built? (Please give me some advice.)
- In the first place, do the stock prices up to the previous day affect the price on the next day at all?
- The Nikkei average is an average of stock prices, so perhaps technical factors are not relevant to it in the first place?

I tried changing the number of days used for input, the number of hidden layers, the number of units, the activation function, and so on, but the results did not change noticeably. :sob:

Impressions

I'm not very familiar with matrix calculation or deep learning, but it seems that once you understand the outline, TensorFlow lets you build something that looks the part! :satisfied:

Referenced sites, pages, literature

- Official Tutorial
- Play to predict base stats and type from Pokemon name with TensorFlow
- [Deep Learning (Machine Learning Professional Series)](http://www.amazon.co.jp/%E6%B7%B1%E5%B1%A4%E5%AD%A6%E7%BF%92-%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%83%97%E3%83%AD%E3%83%95%E3%82%A7%E3%83%83%E3%82%B7%E3%83%A7%E3%83%8A%E3%83%AB%E3%82%B7%E3%83%AA%E3%83%BC%E3%82%BA-%E5%B2%A1%E8%B0%B7-%E8%B2%B4%E4%B9%8B/dp/4061529021)
