[PYTHON] Stock Price Forecast Using Deep Learning (TensorFlow) -Part 2-

Continuation of last time. This is the story of using TensorFlow, a deep learning framework, to predict stock prices. By the way, last time was a complete failure. In a comment on the last post, tawago let me know that ["Google is doing the same thing"](https://cloud.google.com/solutions/machine-learning-with-financial-time-series-data), so I copied it... er, took inspiration from it.

Differences from last time

Last time, the task was "use several days of the Nikkei 225 to predict whether the Nikkei 225 will rise, fall, or stay flat the next day (3 classes)". Google's demo is "use several days of global stock indices (Dow, Nikkei 225, FTSE 100, DAX, etc.) to predict whether the S&P 500 will go up or down the next day (2 classes)". So the main changes from last time are:

- Two classes: "up" and "down"
- Use not only the Nikkei 225 but also other countries' stock indices
- 2 hidden layers, with 50 and 25 units

As before, what we predict is the next day's Nikkei 225.

For more information on the Google demo, see this video. (In English.)

Environment

- TensorFlow 0.7
- Ubuntu 14.04
- Python 2.7
- AWS EC2 micro instance

Implementation

Preparation

The data can be downloaded from the Quandl site. However, there is so much data posted there that it was hard to tell which datasets to download, so this time I use only four indices: the Nikkei 225, the Dow, the Hang Seng Index, and Germany's DAX. The Google demo used about eight.

The downloaded data is combined into one table and saved as CSV. There is no script for this in particular; I did it by hand with Excel's VLOOKUP function. If you can, it would be better to put it in a DB and handle it there...
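If you'd rather not do the VLOOKUP step by hand, a library such as pandas can join the per-index CSVs on their date column. Here is a minimal sketch, assuming hypothetical filenames and a 'Date'/'Close' column layout (neither is given in this post):

import pandas as pd

# Hypothetical filenames and columns; Quandl CSVs typically have a 'Date' column
frames = []
for name in ['nikkei', 'dow', 'hangseng', 'dax']:
  df = pd.read_csv(name + '.csv', parse_dates=['Date'])
  df = df[['Date', 'Close']].rename(columns={'Close': name})
  frames.append(df)

# Inner-join on date so that only days traded on all four markets remain
merged = frames[0]
for df in frames[1:]:
  merged = merged.merge(df, on='Date', how='inner')

merged.sort_values('Date', ascending=False).to_csv('indices.csv', index=False)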

Labels

This time there are two classes, "up" and "down", so the correct-answer labels are created as follows.

# Compare today's value of column 3 (presumably the Nikkei 225) with the
# previous day's; the data appears sorted newest-first, so idx+1 is the previous day
if array_base[idx][3] > array_base[idx+1][3]:
  y_flg_array.append([1., 0.])  # one-hot label for "up"
  up += 1
else:
  y_flg_array.append([0., 1.])  # one-hot label for "down"
  down += 1

Across the whole sample, the split is up: 50.6%, down: 49.4%.

Graph creation

The graph mostly mimics Google's code. The number of hidden layers and the number of units also follow Google's code. (Google's version may not have had dropout, though.)

NUM_HIDDEN1 = 50
NUM_HIDDEN2 = 25

def inference(x_ph, keep_prob):

  # Hidden layer 1: fully connected + ReLU
  with tf.name_scope('hidden1'):
    weights = tf.Variable(tf.truncated_normal([data_num * price_num, NUM_HIDDEN1], stddev=stddev), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN1]), name='biases')
    hidden1 = tf.nn.relu(tf.matmul(x_ph, weights) + biases)

  # Hidden layer 2: fully connected + ReLU
  with tf.name_scope('hidden2'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN1, NUM_HIDDEN2], stddev=stddev), name='weights')
    biases = tf.Variable(tf.zeros([NUM_HIDDEN2]), name='biases')
    hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)

  # Dropout (keep_prob is fed at run time: 0.8 for training, 1.0 for evaluation)
  dropout = tf.nn.dropout(hidden2, keep_prob)

  # Output layer: softmax over the two classes (up / down)
  with tf.name_scope('softmax'):
    weights = tf.Variable(tf.truncated_normal([NUM_HIDDEN2, 2], stddev=stddev), name='weights')
    biases = tf.Variable(tf.zeros([2]), name='biases')
    y = tf.nn.softmax(tf.matmul(dropout, weights) + biases)

  return y
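For reference, here is a rough sketch of the module-level names this function assumes. This is my reading of the snippets, not code from the original post: data_num (days of history per sample) and price_num (indices per day) are hypothetical values, and stddev=0.001 comes from the Consideration section below.

import tensorflow as tf

stddev = 0.001  # initial weight scale (see Consideration below)
data_num = 3    # hypothetical: days of history per sample
price_num = 4   # hypothetical: Nikkei, Dow, Hang Seng, DAX

# Placeholders fed via feed_dict during training and evaluation
x_ph = tf.placeholder("float", [None, data_num * price_num])  # input window
y_ph = tf.placeholder("float", [None, 2])                     # one-hot up/down labels
keep_prob = tf.placeholder("float")                           # dropout keep probability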

Loss

The loss, which I forgot to write about last time, is defined as follows: cross entropy. It's the same as last time, which in turn is the same as Google's.

def loss(y, target):
  # Cross entropy between the softmax output and the one-hot labels
  return -tf.reduce_sum(target * tf.log(y))
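One caveat (my addition, not something from the original post): if a softmax output ever reaches exactly 0, tf.log(y) returns NaN and the loss blows up. A common guard is to clip the values first:

def loss(y, target):
  # Clip to avoid log(0) -> NaN when a softmax output saturates
  return -tf.reduce_sum(target * tf.log(tf.clip_by_value(y, 1e-10, 1.0)))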

Optimization

Optimization is the same as last time.

def optimize(loss):
  # Adam optimizer; learning_rate is defined elsewhere (value not shown in this post)
  optimizer = tf.train.AdamOptimizer(learning_rate)
  train_step = optimizer.minimize(loss)
  return train_step

Training

The training is the same as last time.

def training(sess, train_step, loss, x_train_array, y_flg_train_array):

  summary_op = tf.merge_all_summaries()  # assumes summary ops were added when building the graph
  init = tf.initialize_all_variables()
  sess.run(init)

  summary_writer = tf.train.SummaryWriter(LOG_DIR, graph_def=sess.graph_def)

  # Iterate over the training set one mini-batch at a time
  for i in range(int(len(x_train_array) / bach_size)):
    batch_xs = getBachArray(x_train_array, i * bach_size, bach_size)
    batch_ys = getBachArray(y_flg_train_array, i * bach_size, bach_size)
    # Update weights with dropout enabled
    sess.run(train_step, feed_dict={x_ph: batch_xs, y_ph: batch_ys, keep_prob: 0.8})
    # Track the loss on the same batch with dropout disabled
    ce = sess.run(loss, feed_dict={x_ph: batch_xs, y_ph: batch_ys, keep_prob: 1.0})

    summary_str = sess.run(summary_op, feed_dict={x_ph: batch_xs, y_ph: batch_ys, keep_prob: 1.0})
    summary_writer.add_summary(summary_str, i)
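Putting the pieces together, the graph would be assembled roughly like this before training; a sketch that assumes x_train_array and y_flg_train_array were built by the labeling and input code above, and that the placeholders are the ones sketched earlier:

y = inference(x_ph, keep_prob)  # softmax output of the network
loss_op = loss(y, y_ph)         # cross entropy
train_step = optimize(loss_op)  # Adam update op

sess = tf.Session()
training(sess, train_step, loss_op, x_train_array, y_flg_train_array)
# ...the evaluation snippet below then reuses sess and y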

Evaluation

Evaluation is the same as last time.

# Fraction of test samples where the predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_ph, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Evaluate on the test set with dropout disabled
print(sess.run(accuracy, feed_dict={x_ph: x_test_array, y_ph: y_flg_test_array, keep_prob: 1.0}))

Result-Part 1-

As a result of the above, the accuracy is

0.50295

**...No good.**

Fix

As input, I had been using each country's stock index values as-is, but I changed this slightly: I now feed in "how much the stock index moved relative to the previous day" instead. (I'm not certain, but Google seemed to do this in the video.) The code looks like the following. (It may be hard to read.)

tmp_array = []
# Old version: append each day's raw index values as-is
for j in xrange(idx+1, idx + data_num + 1):
  for sprice in array_base[j]:
    tmp_array.append(sprice)

x_array.append(tmp_array)

I changed the above to the following:

tmp_array = []
# New version: append the day-over-day change (%) instead of raw values;
# row j+1 is the previous day (data sorted newest-first)
for j in xrange(idx+1, idx + data_num + 1):
  for k in range(price_num):
    tmp_array.append(((array_base[j][k]) - (array_base[j+1][k])) / array_base[j][k] * 100)

x_array.append(tmp_array)
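Incidentally, the same day-over-day percent change can be computed for the whole table at once with NumPy; a sketch assuming array_base is a 2-D float array sorted newest-first:

import numpy as np

base = np.asarray(array_base, dtype=np.float64)
# Row j is one day and row j+1 the previous day, so for every row at once:
pct_change = (base[:-1] - base[1:]) / base[:-1] * 100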

Result-Part 2-

0.63185

Somehow this has become a meaningful number. At 63%, the model predicts up or down noticeably better than chance, so I'd call it a success (^_^;) I believe the accuracy in the Google demo was about 72%; I suspect the gap is because the number of stock indices used here is small.

Consideration

- It may be better to feed in "a group of numbers that is meaningful as a whole" rather than "a group of independent numbers".
- The stddev argument of tf.truncated_normal(), which last time I wondered whether it even mattered, is relatively important. With the default it diverges easily, so I specified 0.001. If training looks like it is diverging, this seems to be a good parameter to adjust.
- The hidden-layer unit counts Google specified seem quite reasonable. Here they are 50 and 25; as a starting point, should the first layer have about twice as many units as the input? (I don't really understand this.)
- As with Google's demo, including more stock indices, or data such as exchange rates, might improve accuracy.

Impressions

- I'm glad it worked reasonably well, for the time being!
- With TensorFlow, many parts (such as the training code) don't need to change even when the data changes a little, so once you build something like a template, coding is easy. If anything, preprocessing the input data is the bigger hassle.
- For the same reason, it is easier if you keep the numbers (e.g. the input dimensions) as constant as possible.

By the way...

If anyone knows a site where you can download foreign-exchange data at 5-, 15-, and 30-minute intervals, please let me know. m(__)m
