[Reinforcement learning] How to render OpenAI Gym on Google Colab (June 2020 version)

0. Introduction

I looked into how to render OpenAI Gym environments on Google Colab, so here are my notes.

1. Challenges

Calling the render() method of gym.Env on Colab raises a NoSuchDisplayException, because there is no X display attached to the runtime.

import gym
env = gym.make('CartPole-v1')
env.reset()
env.render()
NoSuchDisplayException                    Traceback (most recent call last)
<ipython-input-3-74ea9519f385> in <module>()
      2 env = gym.make('CartPole-v1')
      3 env.reset()
----> 4 env.render()

2. Countermeasures

As far as I could tell, there are three ways to use Gym's rendering on Colab. Each has advantages and disadvantages, and I could not settle on just one, so I describe all three.

2.1 Common preparation

All three methods rely on the X11 virtual framebuffer Xvfb, so install it first.

!apt update
!apt install xvfb

(When running Jupyter Notebook standalone, e.g. from a Docker image, OpenGL support is also required, so run `apt install python-opengl` as well.)

Furthermore, to control Xvfb from Google Colab (Jupyter Notebook), use PyVirtualDisplay.

!pip install pyvirtualdisplay

from pyvirtualdisplay import Display

d = Display()
d.start()

Some articles say you must set the "DISPLAY" environment variable to "{display number}.{screen number}", but [the author of PyVirtualDisplay told me this is unnecessary](https://github.com/ponty/PyVirtualDisplay/issues/54).

According to him, the screen number is only meaningful when a display has multiple screens; PyVirtualDisplay creates a single screen, so it is fixed at 0, and an omitted screen number is interpreted as 0 anyway. (See StackOverflow.)

In other words, pyvirtualdisplay.Display.start() sets the environment variable itself, so there is no need to change it from outside. (Confirmed at least in 1.3.2, the latest version as of June 18, 2020.)
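To illustrate the convention described above, here is a minimal sketch that parses a DISPLAY value of the form "[host]:{display}[.{screen}]" and shows that an omitted screen number means screen 0. (parse_display is a hypothetical helper for illustration, not part of PyVirtualDisplay.)

```python
# A DISPLAY value has the form "[host]:{display}[.{screen}]"; the screen
# part defaults to 0 when omitted, which is why ":1001" and ":1001.0"
# refer to the same screen.
def parse_display(value):
    host, _, rest = value.partition(':')
    display_str, dot, screen_str = rest.partition('.')
    screen = int(screen_str) if dot else 0  # omitted screen -> 0
    return host, int(display_str), screen

print(parse_display(':1001'))    # -> ('', 1001, 0)
print(parse_display(':1001.0'))  # -> ('', 1001, 0), the equivalent explicit form
```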

2.2 Method 1

The first method simply draws each frame with matplotlib, erasing the previous frame each time.

The disadvantage is that it is not very fast and the animation plays only once, but because it keeps overwriting without retaining the frames, it can handle long episodes without running out of memory.

import gym
from IPython import display
from pyvirtualdisplay import Display
import matplotlib.pyplot as plt

d = Display()
d.start()

env = gym.make('CartPole-v1')

obs = env.reset()

img = plt.imshow(env.render('rgb_array'))
plt.axis('off')
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent

    display.clear_output(wait=True)
    img.set_data(env.render('rgb_array'))
    display.display(plt.gcf())

    if done:
        env.reset()

2.3 Method 2

The second method uses matplotlib.animation.FuncAnimation to display an animation.

The animation can be replayed and the per-frame display interval can be set freely, but every frame must be retained, so it uses a lot of memory; depending on the frame size and the number of frames, this can cause memory errors. (A likely culprit if you get an error during a long training run.)
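To see why memory becomes a concern, here is a back-of-the-envelope estimate. (A sketch: it assumes CartPole-v1's usual 400x600 RGB uint8 frames; your environment's frame size may differ.)

```python
# Rough memory estimate for retained frames: height x width x 3 channels,
# one byte per value (uint8).
height, width, channels = 400, 600, 3
bytes_per_frame = height * width * channels
n_frames = 100
total_mb = n_frames * bytes_per_frame / 1e6

print(bytes_per_frame)  # -> 720000 bytes per frame
print(total_mb)         # -> 72.0 MB for just 100 frames
```

A few thousand frames from a long run would therefore reach gigabytes, which is why Method 1 or Method 3 is safer for long episodes.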

import gym
from IPython import display
from pyvirtualdisplay import Display
import matplotlib.pyplot as plt
from matplotlib import animation


d = Display()
d.start()

env = gym.make('CartPole-v1')

obs = env.reset()

frames = []
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent

    frames.append(env.render('rgb_array'))

    if done:
        env.reset()

dpi = 72
interval = 50 # ms

plt.figure(figsize=(frames[0].shape[1]/dpi, frames[0].shape[0]/dpi), dpi=dpi)
patch = plt.imshow(frames[0])
plt.axis('off')
animate = lambda i: patch.set_data(frames[i])
ani = animation.FuncAnimation(plt.gcf(), animate, frames=len(frames), interval=interval)
display.display(display.HTML(ani.to_jshtml()))

2.4 Method 3

The last method saves the frames as a video using gym.wrappers.Monitor. The render() method is not needed; frames are recorded automatically each time you call step(action).

import base64
import io
import gym
from gym.wrappers import Monitor
from IPython import display
from pyvirtualdisplay import Display

d = Display()
d.start()

env = Monitor(gym.make('CartPole-v1'), './')

obs = env.reset()

for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent

    if done:
        env.reset()

env.close() # finalize the recorded video files

for f in env.videos:
    video = io.open(f[0], 'rb').read()
    encoded = base64.b64encode(video)

    display.display(display.HTML(data="""
        <video alt="test" controls>
        <source src="data:video/mp4;base64,{0}" type="video/mp4" />
        </video>
        """.format(encoded.decode('ascii'))))

3. Library: Gym-Notebook-Wrapper

Since it is tedious to write the code above every time, I packaged it into a library.

3.1 Installation

It's published on PyPI, so you can install it with pip install gym-notebook-wrapper.

!apt update && apt install xvfb
!pip install gym-notebook-wrapper

Of course, it can be used outside Google Colab as well, but Linux is required because it relies on Xvfb.

3.2 How to use

Since the package name gym-notebook-wrapper contains hyphens (-), the importable module name is gnwrapper.

3.2.1 gnwrapper.Animation (= 2.2 Method 1)

import gnwrapper
import gym

env = gnwrapper.Animation(gym.make('CartPole-v1')) # Xvfb is started automatically

obs = env.reset()

for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent
    env.render() # erases the previous frame and draws the new step
    if done:
        env.reset()

3.2.2 gnwrapper.LoopAnimation (= 2.3 Method 2)

import gnwrapper
import gym

env = gnwrapper.LoopAnimation(gym.make('CartPole-v1')) # Xvfb is started automatically

obs = env.reset()

for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent
    env.render() # stores the frame data
    if done:
        env.reset()

env.display() # plays back the stored frames as an animation

3.2.3 gnwrapper.Monitor (= 2.4 Method 3)

import gnwrapper
import gym

env = gnwrapper.Monitor(gym.make('CartPole-v1'), directory="./") # Xvfb is started automatically

obs = env.reset()

for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample()) # in practice, use the action from your agent
    if done:
        env.reset()

env.display() # shows the recorded video

4. Finally

I gathered information from around the web and summarized three ways to render OpenAI Gym on Google Colab. The code is what I actually ran and verified several times, but apologies if any copy-paste mistakes slipped in.

Gym-Notebook-Wrapper is still rough and may have bugs, so feel free to open an issue if you have questions. I would be glad if you find it useful.
