[PYTHON] [Mac] I tried reinforcement learning with OpenAI Baselines

OpenAI released an open-source reinforcement learning library on May 24, 2017. It looked easy to use, so I tried it :smiley: The artificial intelligence research group OpenAI has published "DQN" and three of its variants as reinforcement learning algorithms.
OpenAI Baselines (GitHub)

The following two scripts are introduced as tutorials, so I ran them:
baselines.deepq.experiments.train_cartpole
baselines.deepq.experiments.train_pong


Mac OS Sierra 10.12.4, Python 3.6.1. Note that this does not work with the Python 2.7 series.

cartpole

This tutorial appears to be a game in which you keep the pole on a cart from falling over. For now, I just ran it without thinking too hard.

It complained that some modules were missing, so I simply installed them one after another until it ran.
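Rather than discovering the missing modules one error at a time, you can check for them up front. The following is a minimal sketch; the module names in `REQUIRED_MODULES` are my assumption about what the tutorials import, not a list taken from Baselines itself, so adjust it to match the errors you actually see.

```python
import importlib.util

# Hypothetical list of import names (an assumption, not taken from Baselines);
# replace it with whatever your own error messages mention.
REQUIRED_MODULES = ["gym", "tensorflow", "cv2", "baselines"]


def find_missing(modules):
    """Return the names in `modules` that cannot be found in this environment."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Note that an import name does not always match the package name you install; cv2, for example, is provided by OpenCV.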

# Command for training
python -m baselines.deepq.experiments.train_cartpole

# Command for playing with the trained model
python -m baselines.deepq.experiments.enjoy_cartpole

Training in progress ... By the way, it stopped at episode 690.
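One plausible reason for training to end at an odd number like 690 is an early-stopping check that ends the run once the agent is consistently good. I have not confirmed that this is what the tutorial script does; the sketch below only illustrates the idea, and the function name, the threshold of 199, and the window of 100 episodes are all assumptions.

```python
def is_solved(episode_rewards, threshold=199.0, window=100):
    """Return True once the mean reward over the last `window` episodes
    reaches `threshold`. Both default values are illustrative assumptions."""
    if len(episode_rewards) < window:
        # Not enough finished episodes yet to judge
        return False
    recent = episode_rewards[-window:]
    return sum(recent) / window >= threshold


# Usage: 50 poor episodes followed by 100 perfect ones
rewards = [20.0] * 50 + [200.0] * 100
print(is_solved(rewards[:100]))  # False (mean of the first 100 is 110)
print(is_solved(rewards))        # True (the last 100 episodes all scored 200)
```

With a check like this the number of episodes is not fixed in advance, so a different run could stop earlier or later.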

I tried playing

How are you supposed to play this ... :thinking: It looks as if the screen is reset whenever the black object is judged to have fallen, but I can't make much sense of it.

pong

This appears to be a competitive game like ice hockey.

Install the missing modules here as well. By the way, if you are told that there is no cv2, that refers to OpenCV, so see the following article.
Make OpenCV3 available from python3 installed with pyenv

As the article below shows, setting up OpenCV on Linux is quite a hassle.
The easiest way to use OpenCV with python
So I switched to Anaconda and ran it there. (Recommended, because it gets you running quickly.)

# Command for training
python -m baselines.deepq.experiments.train_pong

# Command for playing with the trained model
python -m baselines.deepq.experiments.enjoy_pong

Training in progress ...

If one episode takes 90 seconds and it runs for 690 episodes, that is about 62,100 seconds, or 17 hours and 15 minutes ...
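The estimate above is simple arithmetic, and a small helper makes it easy to redo with other numbers. This is only a sketch; the function name is made up for illustration, and the 90 seconds per episode is the rough figure from the text, not a measurement.

```python
def estimate_training_time(seconds_per_episode, episodes):
    """Return (total_seconds, hours, minutes, seconds) for a training run."""
    total_seconds = seconds_per_episode * episodes
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return total_seconds, hours, minutes, seconds


total, h, m, s = estimate_training_time(90, 690)
print(f"{total} s = {h} h {m} min {s} s")  # 62100 s = 17 h 15 min 0 s
```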

~~I stopped halfway through~~

I tried it :innocent:

It ran for about 1160 episodes, so it took a long time ... I can't give an exact figure because the machine went into sleep mode partway through, but I think it took about 8 hours.
(Screenshot from 2017-05-28 21-43-00.png)

Please check the video in the article below for the results of playing.
[Try OpenAI Baselines on windows (winpython).](http://qiita.com/tmizu23/items/ff1d5c89bc99292410c0)

(By the way, I was hoping this could be played as a match between a human and the machine learning agent, but that is not how it works ... I wanted to fight the reinforcement learning agent ...)
