[PYTHON] Is it possible to make a living by forecasting stock prices with machine learning? [Machine learning part 1]

Introduction

In the previous post, I joked that if I could predict the price range of the Nikkei 225 with 60% accuracy, I would earn an annual profit of 48%.

So I went looking for information on Qiita.

First, I focused on the following article: https://qiita.com/akiraak/items/b27a5616a94cd64a8653

It reports Accuracy = 72%. I will try to reproduce this.

The technology used is TensorFlow. The article in question explains its TensorFlow code in detail, and the source code is public. Even with my shaky understanding, I could get my hands on it and try it out. However, the article is from 2017 and TensorFlow has changed a lot since then, so getting the code to run was difficult.

Original article scheme

The original article trains a model with TensorFlow, using the price movements of each market (FTSE, GDAXI, HSI, N225, SSEC, etc.) over the 3 days up to the previous day as explanatory variables and the S&P 500 closing price of the day as teacher data. It builds a predictor that predicts the sign (rise or fall) and evaluates its performance. I admit that I do not understand the detailed principles.

The result is an Accuracy at the 70% level.

This is promising.

This scheme

The forecast target is 1358 (Listed Nikkei 2x Leveraged), and the explanatory variables are slightly revised as well.

The explanatory variables include not only the NIKKEI225 and the indexes of each country up to the previous day, but also the USD/JPY exchange rate.

Teacher label: whether the same-day price range (closing price minus opening price) of 1358 Nikkei 2x Leveraged is a rise or a fall

Explanatory variables: closing prices for the past 3 days of each of DOW30, NASDAQ_COMP, S&P500, FTSE_MIB, DAX, CAC40, HANG_SENG, usdjpy, NIKKEI225 (9 x 3 = 27 dimensions)
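The feature and label definitions above can be sketched roughly as follows. This is only an illustration, not the original article's code: the function names `build_features` and `build_label`, the dictionary layout, and all prices are hypothetical, and the label assumes the price range means the closing price minus the opening price.

```python
# Index names come from the article; the data layout and prices are made up.
INDEXES = ["DOW30", "NASDAQ_COMP", "S&P500", "FTSE_MIB", "DAX",
           "CAC40", "HANG_SENG", "usdjpy", "NIKKEI225"]
LOOKBACK_DAYS = 3


def build_features(closes, day):
    """Return the closes of the LOOKBACK_DAYS days before `day` for every index.

    `closes` maps an index name to a list of daily closing prices.
    Only days strictly before `day` are used, so nothing from the target day leaks in.
    """
    if day < LOOKBACK_DAYS:
        raise ValueError("not enough history before this day")
    features = []
    for name in INDEXES:
        features.extend(closes[name][day - LOOKBACK_DAYS:day])
    return features


def build_label(open_price, close_price):
    """Return 1 for a rise (close above open) and 0 otherwise (assumed definition)."""
    return 1 if close_price > open_price else 0


# Hypothetical data: 5 days of closes per index.
closes = {name: [100.0 + i + d for d in range(5)] for i, name in enumerate(INDEXES)}
x = build_features(closes, day=3)
y = build_label(open_price=20000.0, close_price=20150.0)
print(len(x), y)
```

The feature vector has 9 x 3 = 27 entries, matching the input width of the model.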

The model is unchanged from the original source:

 Input layer: [27, 50], stddev = 0.0001
 Hidden layer 1: [50, 25], stddev = 0.0001
 Hidden layer 2: [25, 2], stddev = 0.0001
 Output layer: [2]
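For readers who want to see what these shapes mean, here is a rough forward pass in plain Python, so it runs without TensorFlow. Several details are assumptions not stated above: ReLU activations, a softmax output, no bias terms, and Gaussian initialization with the listed stddev. All function names are hypothetical.

```python
import math
import random

LAYER_SHAPES = [(27, 50), (50, 25), (25, 2)]  # weight shapes listed in the article
STDDEV = 0.0001                               # initial weight spread listed in the article


def init_weights(rng):
    """Create one weight matrix per layer, drawn from N(0, STDDEV^2) (assumed distribution)."""
    return [[[rng.gauss(0.0, STDDEV) for _ in range(n_out)] for _ in range(n_in)]
            for n_in, n_out in LAYER_SHAPES]


def matvec(x, w):
    """Multiply a row vector x (length n_in) by a matrix w (n_in x n_out)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, weights):
    """27 inputs -> 50 units -> 25 units -> 2 class probabilities."""
    h = x
    for w in weights[:-1]:
        h = [max(0.0, v) for v in matvec(h, w)]  # ReLU is an assumption
    return softmax(matvec(h, weights[-1]))


rng = random.Random(0)
weights = init_weights(rng)
probs = forward([0.01] * 27, weights)
print(probs)
```

With initial weights this small, the untrained network outputs almost exactly 50/50 for any input, so any edge has to come from training.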

Experimental result

Training period = 2014-10-30 ~ 2017-10-31

Evaluation period = 2017-11-22 ~ 2020-07-09

Number of training steps: 12000

Accuracy on the training data as training progresses

1000 0.5241935
2000 0.55241936
3000 0.5510753
4000 0.5591398
5000 0.5645161
6000 0.56989247
7000 0.57123655
8000 0.56989247
9000 0.56182796
10000 0.5577957
11000 0.5591398
12000 0.56182796

Accuracy during the evaluation period

 Number of evaluations = 673
 Rise, correct answer = 127
 Fall, correct answer = 216
 Rise, incorrect answer = 130
 Fall, incorrect answer = 200
Accuracy =  0.5096582466567607
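The reported accuracy can be reproduced from the four counts above. The variable names are hypothetical; the numbers are the ones listed.

```python
# Counts copied from the evaluation results in the article.
rise_correct = 127
fall_correct = 216
rise_incorrect = 130
fall_incorrect = 200

total = rise_correct + fall_correct + rise_incorrect + fall_incorrect
accuracy = (rise_correct + fall_correct) / total
print(total, accuracy)
```

The four counts add up to the 673 evaluated days, and 343 correct answers out of 673 gives the accuracy shown.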

Profit and loss simulation (back test)

If a rise is forecast, buy 1358 (Listed Nikkei 2x Leveraged); if a fall is forecast, buy 1357 (Nikkei Double Inverse). Each day, buy 10,000,000 yen worth at the open and sell at the close.

Profit and loss when this method is repeated from 2017-11-22 to 2020-07-09: -1,551,671 yen

fig2-1_result.png
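The trading rule can be sketched as below, assuming the position is opened at the day's opening price and closed at the closing price. The function name, the tuple layout, and all prices are hypothetical. Fees, taxes, and whole-share rounding are ignored, which a real backtest would have to handle.

```python
CAPITAL_YEN = 10_000_000  # amount invested each day, from the article


def daily_pnl(predicted_rise, open_1358, close_1358, open_1357, close_1357):
    """Profit or loss in yen for one day of the open-to-close rule.

    A rise forecast buys 1358 (leveraged); a fall forecast buys 1357 (inverse).
    Fractional shares are allowed here for simplicity (an assumption).
    """
    if predicted_rise:
        open_price, close_price = open_1358, close_1358
    else:
        open_price, close_price = open_1357, close_1357
    return CAPITAL_YEN * (close_price - open_price) / open_price


# Hypothetical days: (forecast is rise?, 1358 open, 1358 close, 1357 open, 1357 close)
days = [
    (True, 20000.0, 20200.0, 1000.0, 990.0),   # rise forecast, 1358 gains 1%
    (False, 20200.0, 20402.0, 990.0, 980.1),   # fall forecast, 1357 loses 1%
    (False, 20400.0, 20196.0, 980.0, 989.8),   # fall forecast, 1357 gains 1%
]
total_pnl = sum(daily_pnl(*day) for day in days)
print(round(total_pnl))
```

Because the same capital is committed every day, each day's result is simply the day's open-to-close return of whichever fund was bought, multiplied by 10,000,000 yen.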

Monthly profit and loss

 November 2017: -83249 yen
 December 2017: -83258 yen
 January 2018: 341621 yen
 February 2018: -595994 yen
 March 2018: 626082 yen
 April 2018: -222331 yen
 May 2018: -332350 yen
 June 2018: -384622 yen
 July 2018: -375612 yen
 August 2018: 355649 yen
 September 2018: 472170 yen
 October 2018: -262912 yen
 November 2018: -251190 yen
 December 2018: 1321959 yen
 January 2019: 1181772 yen
 February 2019: -545563 yen
 March 2019: -427964 yen
 April 2019: 82859 yen
 May 2019: -465943 yen
 June 2019: -265452 yen
 July 2019: 96485 yen
 August 2019: 936343 yen
 September 2019: -723723 yen
 October 2019: -432153 yen
 November 2019: 396537 yen
 December 2019: 181781 yen
 January 2020: -303620 yen
 February 2020: -109976 yen
 March 2020: -3523718 yen
 April 2020: -35221 yen
 May 2020: 630419 yen
 June 2020: 1731269 yen
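To see how these monthly figures build up over time, the short sketch below accumulates them into a running total. The values are copied from the list above; the variable names are hypothetical.

```python
# Monthly profit and loss in yen, copied from the list above (2017-11 to 2020-06).
monthly_pnl = [
    -83249, -83258,                                        # 2017-11 .. 2017-12
    341621, -595994, 626082, -222331, -332350, -384622,
    -375612, 355649, 472170, -262912, -251190, 1321959,    # 2018
    1181772, -545563, -427964, 82859, -465943, -265452,
    96485, 936343, -723723, -432153, 396537, 181781,       # 2019
    -303620, -109976, -3523718, -35221, 630419, 1731269,   # 2020-01 .. 2020-06
]

# Running total after each month.
cumulative = []
running = 0
for pnl in monthly_pnl:
    running += pnl
    cumulative.append(running)

worst_month = min(monthly_pnl)
best_month = max(monthly_pnl)
print(cumulative[-1], worst_month, best_month)
```

The 32 listed months sum to -1,069,905 yen. The reported total of -1,551,671 yen runs to 2020-07-09, so the difference presumably falls in July 2020, which is not in the list.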

Discussion of results

Training appears to have converged, but the accuracy on the held-out evaluation data is no better than rolling dice. Naturally, the backtest results are useless as well.
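As a rough check of the "rolling dice" comparison, one can ask how surprising 343 correct answers out of 673 would be under pure guessing. The sketch below uses a normal approximation to the binomial distribution. It is an illustration only, not a test that was part of the experiment.

```python
import math

correct = 127 + 216   # correct answers in the evaluation period
trials = 673          # evaluated days

# Under pure guessing, the number of correct answers is Binomial(trials, 0.5).
expected = trials * 0.5
std_dev = math.sqrt(trials * 0.5 * 0.5)
z_score = (correct - expected) / std_dev

# Two-sided p-value from the normal approximation.
p_value = math.erfc(abs(z_score) / math.sqrt(2))
print(round(z_score, 2), round(p_value, 2))
```

The result is only about half a standard deviation above the guessing baseline, and the p-value is far above the usual 0.05 threshold, so the data give no evidence that the predictor beats a coin.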

It seems the profit and loss would be slightly positive if you traded in the exact opposite direction of the forecast...

Earlier work suggests that predicting the same-day closing price of the NIKKEI 225 from each country's market indexes up to the previous day can reach Accuracy = 0.7 or more, but there is little benefit in predicting the closing price. You do not even need a predictor for that: just look at the order book before the open.

Predicting the price range as in this scheme means predicting the move from the opening price to the closing price that follows, and an Accuracy of 0.5096 means that no such prediction is being made. That is only natural. After all, where the price goes next is a random walk.

From 2017/12 to 2020/2, the balance looks just like trading on dice rolls. The loss from 2020/2 to 2020/3 is large, but perhaps that is because the predictor was trained on the calm period before the COVID-19 outbreak, so its predictions missed badly during the COVID-19 turmoil.

Looking at the cumulative profit and loss graph, the period as a whole does not look uniformly random: there seem to be good stretches (when the forecasts are correct) and bad stretches.

Even when the final balance is the same, the quality of a predictor seems to differ depending on whether its cumulative profit and loss graph looks like A or like B. A is better than B. The predictor this time is of poor quality!

fig2-2A.png

fig2-3B.png

If I had happened to start trading with this predictor in December 2018, I would have made a lot of money in the first few months and ended up thinking "I am a genius!"

A predictor like B, with large cycles of profit and loss, is a bad predictor. I conclude this installment with a new goal: creating a predictor like C, which accumulates gains in small steps.

fig2-4C.png
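One way to put a number on the difference between curves like B and C is the maximum drawdown, the largest fall from a running peak of the cumulative profit and loss. This metric is a suggestion rather than something used in the experiment above, and both series below are made up so that they end at the same balance.

```python
def max_drawdown(pnl_series):
    """Largest drop from a running peak of the cumulative profit and loss."""
    peak = 0.0
    cumulative = 0.0
    worst = 0.0
    for pnl in pnl_series:
        cumulative += pnl
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


# Hypothetical per-period profits; both series sum to 60.
steady = [10, 10, 10, 10, 10, 10]        # gains accumulate in small steps
swingy = [80, 70, -90, -60, 100, -40]    # large cycles of profit and loss

print(sum(steady), max_drawdown(steady))
print(sum(swingy), max_drawdown(swingy))
```

Both series end with the same balance, but the steady one never falls below its previous peak while the swinging one gives back everything it had earned at one point. A lower maximum drawdown for the same final balance is one possible reading of "better quality".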

Summary

Continued on " Yahoo! Finance.
