[Python] I built an Othello AI trained on 7.2 million moves with deep learning in Chainer

It was a while ago now, but it was amazing that AlphaGo beat the world champion three games in a row. Inspired by that, I built an AI for Othello, which is also a board game.

I chose Othello because I don't have enough computing resources!! AlphaGo's development involved something like 50 GPUs running for three weeks, which is completely out of reach for an individual. So I picked Othello, which has a small board and simple rules.

Although I mentioned AlphaGo, the technique used here is different. AlphaGo combines supervised deep learning, deep reinforcement learning, and Monte Carlo tree search, whereas this project uses supervised deep learning only. That limits its strength, but in return the algorithm is simple: even if you don't know the latter two (I understand deep reinforcement learning myself, but not Monte Carlo tree search), you can follow it with knowledge of deep learning alone! The explanation below assumes some understanding of deep learning (roughly, having read "Deep Learning from Scratch").

Overview

For the Othello game records, I used the data published on the French Othello Federation's site. The records are stored in a mysterious format called WTHOR, so processing the data was quite a hassle.
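
To give a feel for what processing WTHOR involves, here is a minimal sketch of a reader. It assumes the commonly documented layout: a 16-byte header whose bytes 4 to 7 hold the number of games, followed by 68-byte game records in which the last 60 bytes are moves stored as 10 * row + column. The function names and the synthetic sample file are mine for illustration; this is not the parser from my repository, so check the layout against the official format description before relying on it.

```python
import struct

HEADER_SIZE = 16   # assumed size of the file header in bytes
RECORD_SIZE = 68   # assumed size of one game: 8 bytes of metadata + 60 moves


def parse_wthor(data):
    """Return a list of games; each game is a list of 0-indexed (row, col) moves."""
    # Bytes 4..7 of the header are assumed to hold the game count (little-endian).
    (n_games,) = struct.unpack_from("<i", data, 4)
    games = []
    for i in range(n_games):
        start = HEADER_SIZE + i * RECORD_SIZE
        record = data[start:start + RECORD_SIZE]
        if len(record) < RECORD_SIZE:
            break  # truncated file: stop instead of guessing
        moves = []
        for code in record[8:]:
            if code == 0:
                break  # 0 is treated as "no more moves" in a short game
            # A move is assumed to be 10 * row + column, both 1-based,
            # so a1 is 11 and h8 is 88.
            row, col = divmod(code, 10)
            moves.append((row - 1, col - 1))
        games.append(moves)
    return games


def make_sample_file():
    """Build a tiny synthetic file (not real tournament data) for a quick check."""
    header = bytearray(HEADER_SIZE)
    struct.pack_into("<i", header, 4, 1)        # one game in the file
    record = bytearray(RECORD_SIZE)
    record[8:12] = bytes([56, 64, 33, 34])      # f5 d6 c3 d3
    return bytes(header + record)


games = parse_wthor(make_sample_file())
print(games[0])
```

Each game comes back as a plain list of board coordinates, which can then be replayed move by move to produce one training position per move.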

The neural network takes the board state (0: empty, 1: own stone, 2: opponent's stone) as input and outputs a probability for each move. For the teacher data, I used the position of the move that was actually played in the game record. This is hard to follow in words alone, so here is a diagram.

[Image: 説明1.png]
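
The encoding described above can be sketched in a few lines. This assumes the board is held as an 8x8 list with absolute colours and that the label is the index of the played square among the 64 cells; the constants and function names are hypothetical and may not match my actual code.

```python
EMPTY, BLACK, WHITE = 0, 1, 2   # absolute colours (hypothetical constants)
SIZE = 8


def encode_board(board, player):
    """Re-express an absolute-colour board from the mover's point of view.

    Returns an 8x8 list with 0: empty, 1: own stone, 2: opponent's stone.
    """
    return [[0 if cell == EMPTY else (1 if cell == player else 2)
             for cell in row]
            for row in board]


def encode_move(row, col):
    """Teacher label: index of the played square among the 64 cells."""
    return row * SIZE + col


def initial_board():
    board = [[EMPTY] * SIZE for _ in range(SIZE)]
    board[3][3], board[4][4] = WHITE, WHITE   # d4, e5
    board[3][4], board[4][3] = BLACK, BLACK   # e4, d5
    return board


x_black = encode_board(initial_board(), BLACK)   # input when black is to move
x_white = encode_board(initial_board(), WHITE)   # same position seen by white
label = encode_move(4, 5)                        # black plays f5
print(x_black[3][3:5], x_white[3][3:5], label)
```

Encoding from the mover's point of view means one network can learn from both black's and white's moves without a separate colour input.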

Neural network structure

Like AlphaGo, I used **only convolution layers** and no fully connected layers. Because the output is two-dimensional data, it is better to keep the network convolutional all the way through. **Softmax** is used for the output layer and **Conv -> BN -> ReLU** for the other layers. I recommend introducing Batch Normalization (BN) because it stabilizes learning.
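
To show why no fully connected layer is needed, here is a rough forward pass written in plain Python instead of Chainer so that it runs anywhere. It is only a sketch under my own assumptions: a single input channel holding the 0/1/2 values, 3x3 kernels with zero padding, two hidden layers of 4 channels, a 1x1 convolution as the head, and random untrained weights. None of these sizes are the real parameters, which are in the source code.

```python
import math
import random

SIZE = 8


def conv2d_same(x, w, b):
    """Convolve x [C_in][8][8] with w [C_out][C_in][k][k]; zero padding keeps 8x8."""
    k = len(w[0][0])
    pad = k // 2
    out = []
    for co, kernels in enumerate(w):
        plane = [[b[co]] * SIZE for _ in range(SIZE)]
        for ci, kernel in enumerate(kernels):
            for r in range(SIZE):
                for c in range(SIZE):
                    for dr in range(k):
                        for dc in range(k):
                            rr, cc = r + dr - pad, c + dc - pad
                            if 0 <= rr < SIZE and 0 <= cc < SIZE:
                                plane[r][c] += kernel[dr][dc] * x[ci][rr][cc]
        out.append(plane)
    return out


def batch_norm(batch, eps=1e-5):
    """Normalise each channel over the mini-batch and all 64 cells
    (training-mode statistics; scale 1 and shift 0 for brevity)."""
    for ch in range(len(batch[0])):
        vals = [v for sample in batch for row in sample[ch] for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        scale = 1.0 / math.sqrt(var + eps)
        for sample in batch:
            sample[ch] = [[(v - mean) * scale for v in row] for row in sample[ch]]
    return batch


def relu(batch):
    return [[[[max(0.0, v) for v in row] for row in plane] for plane in sample]
            for sample in batch]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_weights(rng, c_out, c_in, k):
    return [[[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]


def forward(boards, hidden, head):
    """boards: list of 8x8 boards (values 0/1/2). Returns 64 probabilities per board."""
    batch = [[[[float(v) for v in row] for row in board]] for board in boards]
    for w, b in hidden:                        # Conv -> BN -> ReLU
        batch = relu(batch_norm([conv2d_same(x, w, b) for x in batch]))
    w, b = head                                # 1x1 conv down to a single 8x8 map
    maps = [conv2d_same(x, w, b)[0] for x in batch]
    return [softmax([v for row in m for v in row]) for m in maps]


rng = random.Random(0)
hidden = [(random_weights(rng, 4, 1, 3), [0.0] * 4),
          (random_weights(rng, 4, 4, 3), [0.0] * 4)]
head = (random_weights(rng, 1, 4, 1), [0.0])
boards = [[[0] * SIZE for _ in range(SIZE)] for _ in range(2)]
boards[0][3][3] = boards[0][4][4] = 2
boards[0][3][4] = boards[0][4][3] = 1
boards[1][2][2] = 1
probs = forward(boards, hidden, head)
print(len(probs), len(probs[0]))
```

Every layer keeps the 8x8 shape, so the final map lines up cell for cell with the board and the softmax simply runs over its 64 values.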

Neural network learning

For the loss function, I used the standard cross-entropy error. For the optimization algorithm, I used **Adam**. Compared with SGD, it learned faster and gave a better final result.
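
For readers who want to see the two pieces spelled out, here is a small pure-Python sketch of the softmax cross-entropy loss and the Adam update rule. It is written from the textbook definitions rather than taken from Chainer, and the default hyperparameters are the ones from the Adam paper, not necessarily the ones I trained with. The toy loop at the end optimizes 64 logits directly instead of a real network, just to show the loss going down.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a one-hot target index."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


class Adam:
    """Plain Adam update for a flat list of parameters."""

    def __init__(self, n_params, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n_params   # running mean of gradients
        self.v = [0.0] * n_params   # running mean of squared gradients
        self.t = 0

    def step(self, params, grads):
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)   # bias correction
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
        return params


# Toy check: fit 64 logits so that square 37 (f5) becomes the predicted move.
logits = [0.0] * 64
target = 37
opt = Adam(len(logits), lr=0.1)
loss_before = softmax_cross_entropy(logits, target)
for _ in range(200):
    probs = softmax(logits)
    # Gradient of the loss with respect to the logits is softmax minus one-hot.
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    opt.step(logits, grads)
loss_after = softmax_cross_entropy(logits, target)
print(loss_before > loss_after)
```

With uniform logits the starting loss is ln 64, which is a handy sanity check for a 64-way move predictor before any training has happened.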

Source code

The long-awaited source code is on GitHub. See the source code for the detailed parameters.

Learning results

As for the AI's performance... it seems to have mostly learned the rules, but unfortunately it did not become very strong. The cause is probably that I used supervised deep learning only.

However, because of time constraints, I only trained it for about two and a half hours on a CPU-only PC, so it may still have room to get stronger.

So, **I'm looking for someone who has a GPU!**

It might get stronger if it is trained on a GPU for about a day.
