[PYTHON] Kyotei forecast with TensorFlow

Motivation

I got to try TensorFlow at an event, and someone there asked whether boat races could be predicted with machine learning. It sounded interesting, so I gave it a try.

Environment

Ubuntu 16.04 + Python 2.7.12 + TensorFlow 0.7.1

1. Condition setting

In a boat race (Kyotei), six boats compete for finishing position. People who buy boat tickets predict the order of arrival from each player's past record. This time I take on "nirentan" (2連単, the exacta), which means predicting the 1st and 2nd place finishers in the correct order.
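
To make the prediction target concrete: with 6 boats, an ordered (1st, 2nd) pair can take 6 x 5 = 30 values, so the task can be framed as a 30-class classification. The sketch below enumerates those classes. The label numbering and the helper names are illustrative assumptions, not taken from the actual program.

```python
from itertools import permutations

BOATS = range(1, 7)  # frame (boat) numbers 1..6

# Every ordered (1st, 2nd) pair of distinct boats: 6 * 5 = 30 classes.
EXACTA_CLASSES = list(permutations(BOATS, 2))

# Hypothetical label encoding: (first, second) -> class index
LABEL_OF = {pair: i for i, pair in enumerate(EXACTA_CLASSES)}


def to_label(first, second):
    """Return the class index for an ordered (1st, 2nd) finish."""
    return LABEL_OF[(first, second)]


def from_label(label):
    """Return the (1st, 2nd) pair for a class index."""
    return EXACTA_CLASSES[label]


if __name__ == "__main__":
    print(len(EXACTA_CLASSES))            # number of classes
    print(from_label(to_label(3, 5)))     # round trip back to the pair
```

With 30 classes, picking a result at random would hit about 1/30 (roughly 3.3%) of the time, which gives a baseline for judging the hit rates reported later.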

2. Create past race database

Past race results are published as text files at the following site: http://www1.mbrace.or.jp/od2/K/pindex.html I downloaded the results from 2014 to the present (2016/11) in bulk and loaded them into a database with Python + SQLite3.
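
As a rough sketch of the database step, the following stores one row per boat per race in SQLite3. The table name, the column names, and the sample row are all hypothetical; the real schema depends on what is parsed out of the result text files, and the parsing itself is omitted here.

```python
import sqlite3

# Hypothetical schema: one row per boat per race.
SCHEMA = """
CREATE TABLE IF NOT EXISTS results (
    race_date   TEXT    NOT NULL,  -- 'YYYY-MM-DD'
    venue       INTEGER NOT NULL,  -- race venue code
    race_no     INTEGER NOT NULL,  -- race number within the day
    frame       INTEGER NOT NULL,  -- frame (boat) number, 1..6
    racer_id    INTEGER NOT NULL,  -- identifier of the player
    course      INTEGER,           -- course taken at the start
    start_time  REAL,              -- start timing in seconds
    finish_pos  INTEGER,           -- order of arrival, NULL if not finished
    PRIMARY KEY (race_date, venue, race_no, frame)
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_rows(conn, rows):
    """Insert parsed result rows; each row is a tuple in column order."""
    conn.executemany(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


if __name__ == "__main__":
    conn = open_db()
    # Made-up sample row, for illustration only.
    insert_rows(conn, [("2016-05-01", 1, 1, 1, 1234, 1, 0.15, 1)])
    print(conn.execute("SELECT COUNT(*) FROM results").fetchone()[0])
```

Using the race and frame as the primary key together with `INSERT OR REPLACE` means the same result file can be loaded twice without creating duplicate rows.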

3. Create input features

I computed the features used as input for training. The features are as follows.
- Race venue
- Whether or not the approach (course entry) is fixed
- Distribution of each player's approach (course entry)
- Distribution of each player's order of arrival by frame
- Distribution of each player's start timing
- Distribution of each player's kimarite (winning technique)

The player features were computed over the preceding one and a half months. Races that include a player with extremely little history are excluded from prediction. Some people apparently consult the motor assigned to each player when making their predictions, but I left motor data out this time.
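
As an illustration of one feature from the list above, the sketch below computes a player's distribution of finishing positions from a given frame over a trailing window. The record layout, the function name, and the 45-day window (standing in for "one and a half months") are assumptions made for this example.

```python
from datetime import date, timedelta


def finish_distribution(history, frame, as_of, window_days=45):
    """Fraction of 1st..6th place finishes from `frame` inside the window.

    history: iterable of (race_date, frame, finish_pos) for one player,
             where finish_pos is 1..6, or None if the boat did not finish.
    Returns (distribution, n_races); the distribution is all zeros
    when no race qualifies.
    """
    start = as_of - timedelta(days=window_days)
    counts = [0] * 6
    for race_date, f, pos in history:
        # Keep only past races inside the window, run from the given frame.
        if f == frame and start <= race_date < as_of and pos is not None:
            counts[pos - 1] += 1
    n = sum(counts)
    if n == 0:
        return [0.0] * 6, 0
    return [c / float(n) for c in counts], n


if __name__ == "__main__":
    # Made-up history for a single player.
    history = [
        (date(2016, 4, 20), 1, 1),
        (date(2016, 4, 25), 1, 2),
        (date(2016, 5, 1), 1, 1),
        (date(2016, 5, 3), 4, 5),
        (date(2016, 1, 10), 1, 6),  # too old, outside the window
    ]
    dist, n = finish_distribution(history, frame=1, as_of=date(2016, 5, 10))
    print(n)
    print(dist)
```

The returned race count can serve as the check for dropping players with too little history.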

I implemented the network with reference to the following article: [Machine learning (TensorFlow) + Lotto 6] http://qiita.com/yai/items/a128727ffdd334a4bc57

4. Training

Training used 97,200 races from January 2014 to March 2016, with 300 training steps. The resulting hit rate on the training data was about 20%. Boat races do seem hard to predict.

5. Simulation

I tested on six months of races, from May 2016 to October 2016. For each race, the simulation assumes a 100-yen purchase of the ticket with the highest output value (= the predicted result). Because of how the program is written, races in which five or fewer boats reached the goal, because of fouls or boats dropping out, are excluded from the test cases. The simulation also ignores the drop in odds that buying the ticket would cause. Please note, therefore, that results such as the hit rate shown below may be slightly better than they would be in practice.
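
The betting rule above can be sketched as follows. The data layout (per-race output values, the label of the actual result, and the payout per 100-yen ticket) and all names are assumptions for illustration, and the numbers in the usage example are made up.

```python
STAKE = 100  # yen bet on each race


def simulate(races):
    """Flat-bet simulation.

    races: iterable of (scores, actual_label, payout), where
      scores       - list of output values, one per candidate result
      actual_label - index of the result that actually occurred
      payout       - yen returned per 100-yen ticket on the actual result
    Returns (n_bets, n_hits, balance_in_yen).
    """
    n_bets = n_hits = balance = 0
    for scores, actual_label, payout in races:
        # Buy the single result with the highest output value.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        n_bets += 1
        balance -= STAKE
        if predicted == actual_label:
            n_hits += 1
            balance += payout
    return n_bets, n_hits, balance


if __name__ == "__main__":
    # Two made-up races with three candidate results each.
    races = [
        ([0.1, 0.7, 0.2], 1, 350),   # prediction hits, 350 yen returned
        ([0.5, 0.3, 0.2], 2, 1200),  # prediction misses
    ]
    print(simulate(races))
```

Because the payout is taken as fixed, this sketch shares the limitation noted above: it does not model how a purchase would move the odds.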

6. Simulation (1) All race forecast

First, I bet on every race in the period for which a prediction was made.

Period    Races predicted    Hits    Hit rate    Balance (yen)
2016/5    4178               856     0.204       -63,010
2016/6    3589               723     0.201       -54,460
2016/7    3940               752     0.190       -75,450
2016/8    4336               816     0.188       -61,120
2016/9    3598               672     0.186       -64,610
2016/10   3750               688     0.183       -74,940
Total     23391              4507    -           -393,590

A disappointing result. The hit rate is low and the hits come only from low-odds races, so the balance is heavily negative.

7. Simulation (2) Select a race and predict

Next, I bet only on races where the highest output value exceeds a threshold (0.45 this time). The idea is to focus on the races where the prediction is confident.
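
A minimal sketch of this selection step, assuming the network output for each race is a list of probability-like values (for example softmax outputs). The function name and data layout are hypothetical; only the 0.45 threshold comes from the text.

```python
THRESHOLD = 0.45  # threshold used in this article


def select_bet(scores, threshold=THRESHOLD):
    """Return the label to buy, or None to skip the race.

    scores: list of output values, one per candidate result.
    A bet is placed only when the largest output exceeds the threshold.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > threshold else None


if __name__ == "__main__":
    print(select_bet([0.50, 0.30, 0.20]))  # confident enough: bet on label 0
    print(select_bet([0.40, 0.35, 0.25]))  # below the threshold: skip (None)
```

Raising the threshold trades the number of races bet on for confidence in each bet, which is why far fewer races appear in the table for this simulation.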

Period    Races predicted    Hits    Hit rate    Balance (yen)
2016/5    55                 28      0.509       +190
2016/6    53                 24      0.452       +1,050
2016/7    63                 29      0.460       +790
2016/8    47                 24      0.510       +530
2016/9    30                 13      0.433       -170
2016/10   30                 14      0.466       +450
Total     278                132     -           +2,840

The hit rate is above 40%, and the balance is slightly positive in 5 of the 6 months. Again the hits seem to come only from low-odds races, but the high hit rate makes up for that. Given that the average recovery rate for boat races is 75%, this seems like a decent result.
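
For reference, the recovery rate (total payout divided by total stake) can be worked out from the totals in the two tables, given that every bet is 100 yen. The function name below is only illustrative; the inputs are the table totals.

```python
def recovery_rate(n_bets, balance, stake=100):
    """Total payout / total stake, given the net balance in yen."""
    total_stake = n_bets * stake
    return (total_stake + balance) / float(total_stake)


if __name__ == "__main__":
    # Totals from the two tables: (number of bets, balance in yen)
    all_races = recovery_rate(23391, -393590)
    selected = recovery_rate(278, 2840)
    print("all races:      {:.1%}".format(all_races))
    print("selected races: {:.1%}".format(selected))
```

This works out to roughly 83% when betting on all races and roughly 110% when betting only on the selected races, although the second figure rests on just 278 bets.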

Summary

Because a boat race is a contest between people, there are many irregular factors, and predicting the finishing order itself by machine learning seems difficult. Part of the reason is that I am an amateur at both machine learning and boat racing to begin with. Still, the approach might be useful for picking out, from a large number of races, the so-called "solid" races in which there is an overwhelming gap in ability between players. As noted earlier, this simulation was run under conditions that differ from reality, so I cannot say whether it would work on actual races.
