On May 24, 2017, OpenAI released Baselines, an open-source collection of reinforcement learning algorithms. It looked easy to use, so I tried it out :smiley:

Reference article: "DQN" and three of its variants, reinforcement learning algorithms released by the artificial intelligence research group OpenAI

Baselines GitHub

The following two scripts are introduced as tutorials to run:

- baselines.deepq.experiments.train_cartpole
- baselines.deepq.experiments.train_pong

Environment

macOS Sierra 10.12.4
Python 3.6.1

Note that this does not work with the Python 2.7 series.
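
A minimal sketch of a guard for the version note above. The helper name `is_supported_python` is hypothetical, and the threshold is an assumption: the only thing observed here is that the 2.7 series fails, so the check is simply "major version 3 or later".

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True when the interpreter is Python 3 or later.

    Assumption: the only known constraint is that the Python 2.7 series
    does not work, so nothing finer than the major version is checked.
    """
    return version_info[0] >= 3


if __name__ == "__main__":
    if is_supported_python():
        print("Python %d.%d looks usable" % tuple(sys.version_info[:2]))
    else:
        print("Python 2.x detected; switch to Python 3 first")
```

Running this before the tutorial commands gives a clearer message than a syntax error from deep inside the library.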

cartpole

This tutorial appears to be a game in which you keep the pole on a cart from falling over. For now, I just ran it without thinking too hard.

It complained that some modules were missing, so I simply kept installing them until it ran.

# Command for training
python -m baselines.deepq.experiments.train_cartpole

# Command to play using the trained model
python -m baselines.deepq.experiments.enjoy_cartpole

Training in progress... By the way, training stopped at episode 690.

I tried playing

How am I supposed to play this... :thinking: When the black object is judged to have fallen, the game seems to reset, but I can't really tell what is going on.

pong

This appears to be a competitive game similar to ice hockey.

Here too, I installed whatever modules were reported missing. By the way, if you are told that there is no module named cv2, that means OpenCV, so the following article is a useful reference: "Make OpenCV3 available from python3 installed with pyenv".

As the article below shows, setting up OpenCV on Linux is quite troublesome: "The easiest way to use OpenCV with python". So I switched to Anaconda and ran it there. (Recommended, because it is quick to set up.)
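
Rather than discovering missing modules one error at a time, they can be checked up front. The following is a minimal sketch using the standard library's `importlib.util.find_spec`. The helper name `missing_modules` is hypothetical, and the candidate list is an assumption: only `cv2` is named in this article, and the other entries are guesses at what the tutorials import.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list for illustration only. cv2 is the one module named in this
# article; gym and tensorflow are guesses, not a confirmed dependency list.
CANDIDATES = ["cv2", "gym", "tensorflow"]

if __name__ == "__main__":
    for name in missing_modules(CANDIDATES):
        print("missing:", name)
```

Keep in mind that an import name does not always match the package name used for installation, as the cv2 and OpenCV case shows.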

# Command for training
python -m baselines.deepq.experiments.train_pong

# Command to play using the trained model
python -m baselines.deepq.experiments.enjoy_pong

Training in progress...

If one episode takes 90 seconds and training runs for 690 episodes, that comes to about 62,100 seconds, or 17 hours and 15 minutes...
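
The estimate above is simple arithmetic, and a small sketch makes it easy to redo with other numbers. The function name `estimate_training_time` is hypothetical, and it assumes every episode takes the same amount of time, which is only a rough approximation.

```python
def estimate_training_time(seconds_per_episode, episodes):
    """Return (total_seconds, hours, minutes) assuming a fixed cost per episode."""
    total_seconds = seconds_per_episode * episodes
    hours, remainder = divmod(total_seconds, 3600)
    minutes = remainder // 60
    return total_seconds, hours, minutes


if __name__ == "__main__":
    total, hours, minutes = estimate_training_time(90, 690)
    print(f"{total} s = {hours} h {minutes} min")  # 62100 s = 17 h 15 min
```

In practice the episode length changes as the agent learns, so this is a ballpark figure at best.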

~~I stopped halfway through~~

I tried it :innocent:

Training ran for about 1160 episodes, so it took a long time... I can't give an exact figure because the machine went into sleep mode partway through, but I think it took about 8 hours.

[Screenshot from 2017-05-28 21-43-00.png]

For the results of playing, please see the video in the following article: [Try OpenAI Baselines on Windows (WinPython)](http://qiita.com/tmizu23/items/ff1d5c89bc99292410c0)

(By the way, I was hoping this would let a human play against the trained agent, but that is not the case... I wanted to fight the reinforcement learning agent...)
