Preparing to learn technical indicators with TFlearn

Introduction

There are many types of technical indicators used in market analysis, but most of them take past price data (the four OHLC values) as input, perform some calculation, and output a value.

So, given the input and output data, can we estimate what kind of calculation is being performed? If the calculation model is known and only its parameters need to be determined, there are various methods to choose from, such as system identification and optimization algorithms.

Estimating the calculation model itself, however, is a harder problem. But if the artificial intelligence and deep learning that are so popular now are truly "universal", it should be possible. Indeed, if it can't manage that much, it hardly deserves to be called artificial intelligence.

That aside, this time I used TFlearn to try to learn a technical indicator that I already knew in advance.

Technical indicators to learn

The most basic technical indicator is the moving average. It is a simple linear operation, so a neural network is hardly necessary, but that is exactly why it makes a good starting point. I will begin with a really simple case.

An average can't be taken from a single input, so I'll use two inputs for now, with the weights set to 1/3 and 2/3:

y(n)=\frac{1}{3}x(n-1)+\frac{2}{3}x(n)

This is the so-called linear weighted moving average (LWMA) with period 2. In MT5 it can be calculated with the function

iMA(_Symbol, 0, 2, 0, MODE_LWMA, PRICE_CLOSE);

The following file was output from an MT5 program together with the USDJPY daily data.

`USDJPY.f16408.txt`


Time,Open,High,Low,Close,Ind0
2015.01.02 00:00,120.416,120.74,119.805,120.458,120.257999999996
2015.01.05 00:00,120.531,120.646,119.371,119.631,119.906666666662
2015.01.06 00:00,119.285,119.504,118.05,118.363,118.785666666662
2015.01.07 00:00,118.587,119.647,118.504,119.249,118.953666666662
2015.01.08 00:00,119.314,119.961,119.157,119.652,119.517666666662
2015.01.09 00:00,119.727,119.876,118.415,118.509,118.889999999995
2015.01.12 00:00,118.315,119.315,118.098,118.346,118.400333333328
2015.01.13 00:00,118.363,118.849,117.534,117.925,118.065333333328
2015.01.14 00:00,117.89,117.937,116.067,117.335,117.531666666662
2015.01.15 00:00,117.257,117.941,116.151,116.165,116.554999999995
2015.01.16 00:00,116.183,117.764,115.849,117.541,117.082333333328
2015.01.19 00:00,117.426,117.78,116.919,117.557,117.551666666661
2015.01.20 00:00,117.654,118.866,117.64,118.766,118.362999999995
2015.01.21 00:00,118.67,118.759,117.179,117.956,118.225999999994
2015.01.22 00:00,117.917,118.665,117.245,118.469,118.297999999994
2015.01.23 00:00,118.633,118.813,117.534,117.754,117.992333333328
2015.01.26 00:00,117.634,118.497,117.263,118.447,118.215999999994
2015.01.27 00:00,118.413,118.657,117.334,117.861,118.056333333327
2015.01.28 00:00,117.746,118.262,117.249,117.536,117.644333333327
2015.01.29 00:00,117.489,118.486,117.385,118.257,118.016666666661
2015.01.30 00:00,118.336,118.459,117.296,117.542,117.780333333327
  :
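
Incidentally, the `Ind0` column can be checked against the formula above directly in pandas. Here is a quick sketch of my own (not part of the MT5 program):

import pandas as pd

ohlc = pd.read_csv('USDJPY.f16408.txt', index_col='Time', parse_dates=True)
close = ohlc.Close

# Period-2 LWMA: y(n) = (1/3) x(n-1) + (2/3) x(n)
lwma = close.shift(1) / 3 + close * 2 / 3

# Should be nearly zero (the first row has no predecessor in the file)
print((lwma - ohlc.Ind0).abs().max())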

Creating the training data

Read the above file and create the training data. The file contains the four price values and the indicator value, but only the closing price `Close` and the indicator value `Ind0` are used here.

Create about 200 samples, putting the input data (2 values per sample in this case) into X and the indicator output values into Y.

import numpy as np
import pandas as pd
import tensorflow as tf
import tflearn

file = 'USDJPY.f16408.txt'
ohlc = pd.read_csv(file, index_col='Time', parse_dates=True)
close = ohlc.Close.values
ind0 = ohlc.Ind0.values

N = 2
X = np.empty((0,N))
Y = np.empty((0,1))
for i in range(200):
    # Inputs: N consecutive closes; target: the indicator value at the last of them
    X = np.vstack((X, close[i:i+N]))
    Y = np.vstack((Y, ind0[i+N-1:i+N]))
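
As a quick sanity check (my addition, not part of the original script), each row of X should hold two consecutive closes and Y the indicator value at the second of them:

print(X.shape, Y.shape)  # -> (200, 2) (200, 1)
print(X[0], Y[0])        # e.g. [120.458 119.631] [119.90666667]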

Graph definition

Here is the definition of the graph used by TFlearn. Since we assume a linear model, we define a single unit that is linearly connected to the input layer, so the only parameters are the two weights of the moving average. The model also has no bias term, so bias=False is set.

However, the results with the default regression settings were not very good, so I used the SGD optimizer with a learning rate that decays gradually.

# Graph definition
layer_in = tflearn.input_data(shape=[None, N])
layer1 = tflearn.fully_connected(layer_in, 1, activation='linear', bias=False)
sgd = tflearn.optimizers.SGD(learning_rate=0.01, lr_decay=0.95, decay_step=100)
regression = tflearn.regression(layer1, optimizer=sgd, loss='mean_square')

Learning

Training is easy to write with TFlearn. Train for up to 10,000 epochs.

# Model training
m = tflearn.DNN(regression)
m.fit(X, Y, n_epoch=10000, snapshot_epoch=False, run_id='MAlearn')

Result

The weight coefficients obtained by training are printed as follows.

# Weights
print('\nweights')
for i in range(N):
    print('W[' + str(i) + '] =', m.get_weights(layer1.W)[i])

The following is the output of the training. (The step count reaches 40,000 presumably because the default batch size of 64 splits the 200 samples into 4 steps per epoch.)

---------------------------------
Run id: MAlearn
Log directory: /tmp/tflearn_logs/
---------------------------------
Training samples: 200
Validation samples: 0
--
Training Step: 40000  | total loss: 0.61673
| SGD | epoch: 10000 | loss: 0.61673 -- iter: 200/200
--

weights
W[0] = [ 0.43885291]
W[1] = [ 0.56114811]

The exact weights are W[0] = 0.3333 and W[1] = 0.6667, so it does not look like the model has learned them properly.
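
Before reworking the data, it is worth looking at the fitted values themselves with the model's predict method (a quick sketch):

# Compare the fitted values with the indicator for the first few samples
pred = np.array(m.predict(X))
for i in range(5):
    print('pred:', pred[i][0], 'true:', Y[i][0])

Since consecutive closes are usually very close to each other, any weights that sum to about 1 reproduce Y fairly well, which hints at the problem addressed next.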

Processing the training data

In this example the model just takes a linear combination of the two inputs, so when the two input values are close to each other, any weights that sum to roughly 1 produce a small error and the individual weights are poorly constrained, which is why training did not go well. So this time I used only the training samples where the two input values differ by 1.0 or more.

N = 2
X = np.empty((0,N))
Y = np.empty((0,1))
for i in range(200):
    # Keep only the samples where the two inputs differ by at least 1.0
    if abs(close[i]-close[i+1]) >= 1.0:
        X = np.vstack((X, close[i:i+N]))
        Y = np.vstack((Y, ind0[i+N-1:i+N]))

The result.

---------------------------------
Run id: MAlearn
Log directory: /tmp/tflearn_logs/
---------------------------------
Training samples: 22
Validation samples: 0
--
Training Step: 10000  | total loss: 1.94699
| SGD | epoch: 10000 | loss: 1.94699 -- iter: 22/22
--

weights
W[0] = [ 0.33961287]
W[1] = [ 0.66053367]

The number of samples has dropped to 22, but the weights are much closer to the true values than before.

Noisy training data

In the first place, neural networks are not meant for exact numerical prediction; they are for making reasonably rough predictions.

So this time, I added Gaussian noise with mean 0 and standard deviation 0.1 to the indicator values used as training targets.

N = 2
X = np.empty((0,N))
Y = np.empty((0,1))
for i in range(200):
    if abs(close[i]-close[i+1]) >= 1.0:
        X = np.vstack((X, close[i:i+N]))
        noise = np.random.normal(0, 0.1)  # Gaussian noise: mean 0, std 0.1
        Y = np.vstack((Y, ind0[i+N-1:i+N] + noise))

The result.

---------------------------------
Run id: MAlearn
Log directory: /tmp/tflearn_logs/
---------------------------------
Training samples: 22
Validation samples: 0
--
Training Step: 10000  | total loss: 3.79990
| SGD | epoch: 10000 | loss: 3.79990 -- iter: 22/22
--

weights
W[0] = [ 0.32918188]
W[1] = [ 0.67114329]

The results change slightly on each run, but you can see that the model learns roughly the right weights. The merit of neural networks is that they can still make reasonable predictions even when noise is mixed in.

Summary

It is possible to handle 3 or more inputs by changing N in the code, but in fact the weight coefficients could not be obtained with very good accuracy once there were 3 or more inputs, so those results are omitted here.
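
For reference, generating the data for N = 3 only requires changing N, provided the indicator itself is regenerated to match (a sketch; it assumes `Ind0` was re-output from MT5 with iMA(_Symbol, 0, 3, 0, MODE_LWMA, PRICE_CLOSE)). For a period-N LWMA the true weights are k/(N(N+1)/2), i.e. 1/6, 2/6, 3/6 for N = 3.

N = 3
X = np.empty((0,N))
Y = np.empty((0,1))
for i in range(200):
    X = np.vstack((X, close[i:i+N]))
    Y = np.vstack((Y, ind0[i+N-1:i+N]))

# True period-N LWMA weights: k / (N*(N+1)/2) for k = 1..N
print([k / (N * (N + 1) / 2.0) for k in range(1, N + 1)])  # -> approx. [0.167, 0.333, 0.5]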

So it seems there is still a long way to go before arbitrary technical indicators can be learned. For now, this was an article about the preparation.

That said, the advantage of using TFlearn is that you can quickly check whether an idea is any good.
