[PYTHON] Try to predict FX with LSTM using Keras + Tensorflow Part 3 (Try brute force parameters)

Introduction

In Try to predict FX with LSTM using Keras + Tensorflow Part 2 (Calculate with GPU), I wrote that I was finally ready to get started. The reason is that deep learning and FX involve a large number of parameters, and I expected it would take a considerable amount of time to work out which of them matter and what values they should take.

In other words, you cannot realistically search for them without a GPU.

So, now that I can use the GPU, I will try to find good values by brute-forcing the various parameters I have been wanting to explore.

Source

The source can be found at https://github.com/rakichiki/keras_fx, or you can clone it:

git clone https://github.com/rakichiki/keras_fx.git 

The notebook for this article is keras_fx_gpu_multi.ipynb. Get it and upload it to Jupyter.

Let me explain a little.

Parameter brute force

First, I decided which parameters I wanted to vary. They are listed below. (If you look closely, it is not all of them ...)

Brute force


# Imports assumed by this snippet; the helper functions used below
# (get_date, get_data, fit) are defined elsewhere in keras_fx_gpu_multi.ipynb.
import os
import time
import math
from datetime import datetime as dt

l_of_s_list                  = [20,25]
n_next_list                  = [5,7]
check_treshhold_list         = [0.50,0.60]
#activation_list              = ['sigmoid','tanh','linear']
activation_list              = ['tanh']
#loss_func_list               = ['mean_squared_error','mean_absolute_error','mean_squared_logarithmic_error']
loss_func_list               = ['mean_squared_error','mean_absolute_error']
#optimizer_func_list          = ['sgd','adadelta','adam','adamax']
optimizer_func_list          = ['adadelta','adam','adamax']
#validation_split_number_list = [0.1,0.05]
validation_split_number_list = [0.05]

currency_pair_list   = ['usdjpy']

# Create the output directories for result files and learning-curve graphs
if os.path.exists('result') == False:
    os.mkdir('result')
if os.path.exists('png') == False:
    os.mkdir('png')

# Timestamp used as the base name for the result file and the graph files
save_file_name = dt.today().strftime("%Y%m%d%H%M%S")

#fx data acquisition
start_day     = "20010101"
end_day       =  dt.today().strftime("%Y%m%d")

for currency_pair in currency_pair_list:
    (train_start_count, train_end_count,test_start_count, test_end_count,data) = \
        get_date(start_day, end_day, currency_pair)
    file_name = currency_pair + '_d.csv'

    for l_of_s in l_of_s_list:
        for n_next in n_next_list:
            for check_treshhold in check_treshhold_list:
                # Build the training/test data for this data-shaping parameter combination
                (chane_data,average_value,diff_value, up_down,check_percent) = \
                    get_data(l_of_s, n_next,check_treshhold, file_name,train_start_count,\
                             train_end_count,test_start_count, test_end_count,data)
                
                # Train and evaluate a model for every model-side hyperparameter combination
                for activation in activation_list:
                    for loss_func in loss_func_list:
                        for optimizer_func in optimizer_func_list:
                            for validation_split_number in validation_split_number_list:
                                print('--------------------------')
                                fit_starttime = time.time()
                                fit(l_of_s, n_next,check_treshhold,file_name,save_file_name,activation,loss_func,optimizer_func,\
                                    validation_split_number,train_start_count, train_end_count,test_start_count, test_end_count,\
                                    chane_data,average_value,diff_value,up_down,check_percent)
                                print(str(math.floor(time.time() - fit_starttime)) + "s")
                                print('')

I would like to brute-force all of these over the full range I am interested in, but since the run time grows exponentially, it is better to narrow the search down and investigate little by little. Even if the GPU makes things 10 times faster, if the amount of computation grows 1000 times you are back to square one (although it is already beyond what a CPU could handle).
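Before starting a run, it is worth counting how many combinations the lists above actually produce. A minimal sketch, using the lists defined in the block above:

import itertools

grid = [l_of_s_list, n_next_list, check_treshhold_list, activation_list,
        loss_func_list, optimizer_func_list, validation_split_number_list]
n_patterns = len(list(itertools.product(*grid)))
print(n_patterns)  # 2 * 2 * 2 * 1 * 2 * 3 * 1 = 48 combinations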

Also, because of the problem described later, it is better to go little by little rather than running everything from the start (I am the one who failed by running too much at the beginning).

Introduction of Early Stopping

Brute-forcing the parameters makes the amount of computation explode, and the speed-up from the GPU alone is not enough. Therefore, I introduce Early Stopping so that training does not keep looping over epochs unnecessarily.

EarlyStopping


from keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(monitor='val_loss', patience=10, verbose=1)

# ... (omitted) ...

high_history = high_model.fit(X_high_train, y_high_train, batch_size=100, epochs=300, \
                   validation_split=validation_split_number, callbacks=[early_stopping])

Keras makes this easy. However, whether monitoring val_loss with patience=10 is the right stopping condition is another matter.
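If this rule turns out to stop too early or too late, EarlyStopping has a couple of other knobs. A minimal sketch with illustrative (not tuned) values:

# Illustrative only: ignore val_loss improvements smaller than 1e-4 and
# allow 20 epochs without improvement before stopping.
early_stopping = EarlyStopping(monitor='val_loss', min_delta=1e-4,
                               patience=20, verbose=1)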

I want to see the learning curve

Of course, you cannot tell whether the parameters are reasonable without looking at the learning curve. Adding it is not difficult.

Just keep the return value of fit() and graph it.

Learning curve



    # Learning
    high_history = high_model.fit(X_high_train, y_high_train, batch_size=100, epochs=300, \
                   validation_split=validation_split_number, callbacks=[early_stopping])

    # ... (omitted) ...

    # Plot the validation loss of the "high" model
    # (matplotlib.pyplot is assumed to be imported as plt)
    val_loss = high_history.history['val_loss']
    plt.rc('font',family='serif')
    fig = plt.figure()
    plt.plot(range(len(val_loss)), val_loss, label='val_loss', color='black')
    plt.xlabel('epochs')
    plt.savefig('png/' + save_file_name + '_high_' + \
                str(l_of_s) + '_' + str(n_next) + \
                '_' + str(check_treshhold) + '_' + file_name + \
                '_' + activation + '_' + loss_func + \
                '_' + optimizer_func + '_' + str(validation_split_number) + \
                '.png')
    plt.show()

One caveat: if you want to keep the graph, call plt.savefig() before plt.show(). I am not sure of the exact reason, but if you do it the other way around the saved image ends up empty, apparently because show() releases the current figure (I found this in an answer on a Q&A site somewhere).
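In other words, the order that works looks like the sketch below. Closing the figure afterwards also keeps matplotlib from holding on to every figure during a long run (the file name here is just a placeholder):

fig = plt.figure()
plt.plot(range(len(val_loss)), val_loss, label='val_loss', color='black')
plt.xlabel('epochs')
plt.savefig('png/example.png')   # save first ...
plt.show()                       # ... then display
plt.close(fig)                   # release the figure so it does not pile up in memory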

When things go well, a graph showing the transition of val_loss is displayed, like the one below.

loss.png

Of course, a nice-looking curve does not by itself mean the hit rate will be good. But this graph does tell you whether the model is learning at all.
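The snippet above only plots val_loss. Plotting the training loss next to it makes it easier to tell whether the model is underfitting or overfitting; a minimal sketch reusing the same high_history:

# Compare training loss and validation loss for the "high" model
loss     = high_history.history['loss']
val_loss = high_history.history['val_loss']

plt.figure()
plt.plot(range(len(loss)), loss, label='loss', color='gray')
plt.plot(range(len(val_loss)), val_loss, label='val_loss', color='black')
plt.xlabel('epochs')
plt.legend()
plt.show()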

Save the result to a file

The run is expected to take a very long time, and the PC may go down partway through. I am not the type who wants to leave a PC without ECC memory running for more than 10 hours while praying it does not crash.

So I save the analysis results to a file as they are produced, so that something survives even if the PC dies in the middle (though I give up if the storage itself fails).

File output


    # Append one summary block per pattern; the counts (up_ok_count and so on)
    # come from the evaluation inside fit(). Writing under result/ uses the
    # directory created at the start of the notebook.
    f = open('result/result_' + save_file_name + '.txt', 'a')
    f.write('l_of_s: ' + str(l_of_s) + ' n_next: ' + str(n_next) + \
            ' check_treshhold:' + str(check_treshhold) + ' file_name:' + file_name + \
            ' activation:' + activation + ' loss_func:' + loss_func + \
            ' optimizer_func:' + optimizer_func + ' validation_split_number:' + str(validation_split_number) + \
            '\n')
    f.write('UP: ' + str(up_ok_count) + ' - ' + str(up_ng_count) + ' - ' + str(up_ev_count) + '\n')
    f.write('DN: ' + str(down_ok_count) + ' - ' + str(down_ng_count) + ' - ' + str(down_ev_count) + '\n')
    f.close()

Would CSV have been better? Or JSON (I like JSON)? For now I just dump the progress as plain text. Actually, JSON is a poor fit here, because if the run dies partway through the file would be left incomplete.
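If I do switch to CSV later, appending one row per pattern keeps the output usable even if the run dies partway through, since a crash loses at most the pattern in progress. A sketch with a helper of my own (not part of the notebook):

import csv

def append_result_row(path, row):
    # Append a single row; the file stays valid even if the run is interrupted later.
    with open(path, 'a', newline='') as f:
        csv.writer(f).writerow(row)

append_result_row('result/result_' + save_file_name + '.csv',
                  [l_of_s, n_next, check_treshhold, activation, loss_func,
                   optimizer_func, validation_split_number,
                   up_ok_count, up_ng_count, down_ok_count, down_ng_count])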

As mentioned above, you may want to save the graphs as well.

Result (not good ...)

I ran a fair number of combinations. However, for the reason described later, the grid is fairly modest (please don't point out that there is only one activation function, for example).

The only currency pair is usdjpy. The results are below (cases that ran past the trading-judgment horizon are not counted in the hit rate). The total hit rate is (up hits + down hits) / (up hits + up misses + down hits + down misses); for example, the first row gives (55 + 114) / (55 + 34 + 114 + 81) = 59.5%.

Columns: days used for the trading judgment, days until the trade is evaluated, change-rate threshold for the judgment, activation function, loss function, optimizer, validation split, hits when predicting up, misses when predicting up, hits when predicting down, misses when predicting down, total hit rate (%).
20 5 0.5 tanh mse adadelta 0.05 55 34 114 81 59.5
20 5 0.5 tanh mse adam 0.05 24 22 66 46 57.0
20 5 0.5 tanh mse adamax 0.05 14 14 46 33 56.1
20 5 0.5 tanh mae adadelta 0.05 69 58 95 88 52.9
20 5 0.5 tanh mae adam 0.05 31 28 69 58 53.8
20 5 0.5 tanh mae adamax 0.05 29 26 84 69 54.3
20 5 0.6 tanh mse adadelta 0.05 72 53 129 98 57.1
20 5 0.6 tanh mse adam 0.05 64 52 111 97 54.0
20 5 0.6 tanh mse adamax 0.05 43 33 59 52 54.5
20 5 0.6 tanh mae adadelta 0.05 51 40 140 120 54.4
20 5 0.6 tanh mae adam 0.05 75 57 102 75 57.3
20 5 0.6 tanh mae adamax 0.05 45 39 107 93 53.5
20 7 0.5 tanh mse adadelta 0.05 11 12 84 81 50.5
20 7 0.5 tanh mse adam 0.05 7 5 45 35 56.5
20 7 0.5 tanh mse adamax 0.05 22 18 61 40 58.9
20 7 0.5 tanh mae adadelta 0.05 46 37 92 81 53.9
20 7 0.5 tanh mae adam 0.05 25 28 47 31 55.0
20 7 0.5 tanh mae adamax 0.05 20 28 75 62 51.4
20 7 0.6 tanh mse adadelta 0.05 23 16 39 39 53.0
20 7 0.6 tanh mse adam 0.05 24 21 77 67 53.4
20 7 0.6 tanh mse adamax 0.05 27 26 61 45 55.3
20 7 0.6 tanh mae adadelta 0.05 56 43 120 107 54.0
20 7 0.6 tanh mae adam 0.05 40 36 65 58 52.8
20 7 0.6 tanh mae adamax 0.05 49 41 60 54 53.4
25 5 0.5 tanh mse adadelta 0.05 54 32 86 60 60.3
25 5 0.5 tanh mse adam 0.05 25 21 59 41 57.5
25 5 0.5 tanh mse adamax 0.05 15 14 53 39 56.2
25 5 0.5 tanh mae adadelta 0.05 46 37 126 95 56.6
25 5 0.5 tanh mae adam 0.05 34 30 56 41 55.9
25 5 0.5 tanh mae adamax 0.05 25 24 69 47 57.0
25 5 0.6 tanh mse adadelta 0.05 23 21 108 94 53.3
25 5 0.6 tanh mse adam 0.05 19 20 58 51 52.0
25 5 0.6 tanh mse adamax 0.05 18 19 86 69 54.2
25 5 0.6 tanh mae adadelta 0.05 92 80 92 85 52.7
25 5 0.6 tanh mae adam 0.05 26 28 117 100 52.8
25 5 0.6 tanh mae adamax 0.05 32 31 126 102 54.3
25 7 0.5 tanh mse adadelta 0.05 32 18 110 95 55.7
25 7 0.5 tanh mse adam 0.05 16 16 37 19 60.2
25 7 0.5 tanh mse adamax 0.05 9 10 42 28 57.3
25 7 0.5 tanh mae adadelta 0.05 33 23 40 30 57.9
25 7 0.5 tanh mae adam 0.05 25 21 71 55 55.8
25 7 0.5 tanh mae adamax 0.05 36 29 55 38 57.6
25 7 0.6 tanh mse adadelta 0.05 43 35 104 92 53.6
25 7 0.6 tanh mse adam 0.05 23 23 63 58 51.5
25 7 0.6 tanh mse adamax 0.05 25 22 90 70 55.6
25 7 0.6 tanh mae adadelta 0.05 37 25 118 108 53.8
25 7 0.6 tanh mae adam 0.05 33 25 76 63 55.3
25 7 0.6 tanh mae adamax 0.05 40 25 74 59 57.6

The best was about 60%, the worst about 50%, and the average around 55%, so only a little better than a coin toss. Incidentally, it took about 2 hours to compute the 48 patterns (on a GeForce GTX 1070), roughly 2.5 minutes per pattern, so the 1000 or so patterns I originally had in mind would take on the order of 40 hours, and the time is expected to grow exponentially as more parameters are added. So at some point I will need to speed things up, and the hit rate itself also needs work, but before that, a bigger problem turned up.

A problem occurred

I was able to narrow down the parameters to some extent, but I also ran into a disappointing problem.

It is heavy memory consumption. Initially I planned to search around 1000 patterns, but partway through everything became very slow. After taking countermeasures, the state of the PC after running just under 48 patterns is shown below.

ss2.png

The PC itself is using 12 GB of memory, and the GPU 2 GB. I did not capture a screenshot for a single pattern, but in that case the GPU used less than 1 GB and the machine less than 4 GB.

If anything, it looks like a memory leak. This PC originally had only 8 GB of memory, but I hurriedly replaced it (the case is Mini-ITX and there are only two memory slots) and went up to 32 GB (16 GB would probably have been enough, but half-hearted investments do not give good results, so I went straight to 32 GB).

I do not know why memory keeps being consumed (or rather, not freed), but if you want to do more with this script, you need to take the memory usage into account when deciding how many patterns to run and for how long. I have not found a workaround so far.
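One mitigation that is commonly suggested for loops like this, although I have not verified it in this notebook, is to clear the Keras/TensorFlow session (and run the garbage collector) after each pattern so the old model graphs are released:

import gc
from keras import backend as K

# After evaluating one parameter combination, drop the model graph and
# force garbage collection so memory does not keep growing across the loop.
K.clear_session()
gc.collect()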

Finally

This series was actually only planned up to this point. From here on I will do my best to improve results that are only slightly better than a coin toss (or within the margin of error), but there is no guarantee that things will improve.

For this reason, I do not know how far I can go, but I think there are still plenty of things I could try.

That said, to be clear, what I have in mind is probably more than a single GPU can handle. In that case it may be necessary to install multiple GPUs or rent machines on AWS. I plan to think about the next steps with that in mind.
