TF2RL: Reinforcement learning library for TensorFlow 2.x

An introductory article about TF2RL, a reinforcement learning library developed by my friend @ohtake_i. I also help out with it (responding to issues, creating PRs). (It is no exaggeration to say that my replay buffer library cpprb was created for TF2RL.)

1. Introduction

As the name suggests, TF2RL is a reinforcement learning library written for the TensorFlow 2 series. TensorFlow 1 had some hard-to-approach parts such as `Session` and `placeholder`, but TensorFlow 2 is quite easy to write in, so I would especially like people who think "TensorFlow is hard to write, so I'll choose [PyTorch](https://pytorch.org/)" to take a look.

2. Implemented algorithms

The latest status can be found in the [README on GitHub](https://github.com/keiohta/tf2rl#algorithms). As of August 26, 2020, the following algorithms have been implemented. (More will be added from time to time.)

Some algorithms also support Ape-X and GAE.

3. Installation

TF2RL supports TensorFlow versions from 2.0 up to the latest 2.3. Because of this range of supported versions, TensorFlow is not installed as a dependency by default, so please install it yourself with pip or conda. (Of course, the GPU version is also fine. Since 2.1, the PyPI binaries no longer distinguish between the CPU and GPU versions, so I expect this to become less of a concern going forward.)

For pip


```shell
pip install tensorflow
```

For conda


```shell
conda install -c anaconda tensorflow
```

TF2RL is published on PyPI, so you can install it with pip.

```shell
pip install tf2rl
```

4. How to use

The following is the code example using DDPG from the README. You build an agent for the algorithm and pass it to `Trainer` together with the environment (`gym.Env`); training then proceeds according to the algorithm.

Example: Pendulum with DDPG


```python
import gym
from tf2rl.algos.ddpg import DDPG
from tf2rl.experiments.trainer import Trainer


parser = Trainer.get_argument()
parser = DDPG.get_argument(parser)
args = parser.parse_args()

env = gym.make("Pendulum-v0")
test_env = gym.make("Pendulum-v0")
policy = DDPG(
    state_shape=env.observation_space.shape,
    action_dim=env.action_space.high.size,
    gpu=-1,  # Run on CPU. To run on a GPU, specify the GPU number
    memory_capacity=10000,
    max_action=env.action_space.high[0],
    batch_size=32,
    n_warmup=500)
trainer = Trainer(policy, env, args, test_env=test_env)
trainer()
```

You can check the training results with TensorBoard.

```shell
tensorboard --logdir results
```

Some parameters can be passed as command-line options at runtime via `argparse`.
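The parser composition seen above (`Trainer.get_argument()` extended by `DDPG.get_argument(parser)`) is plain `argparse` chaining. Here is a minimal self-contained sketch of the same pattern, without tf2rl itself; the option names `--max-steps` and `--batch-size` are illustrative, not necessarily tf2rl's actual flags:

```python
import argparse


def trainer_get_argument(parser=None):
    """Mimics Trainer.get_argument(): create a parser with trainer-level options."""
    if parser is None:
        parser = argparse.ArgumentParser()
    parser.add_argument("--max-steps", type=int, default=100000)
    return parser


def ddpg_get_argument(parser):
    """Mimics DDPG.get_argument(parser): extend the same parser with algorithm options."""
    parser.add_argument("--batch-size", type=int, default=32)
    return parser


parser = trainer_get_argument()
parser = ddpg_get_argument(parser)
# Passing an explicit list instead of letting argparse read sys.argv
# also makes this pattern usable inside a notebook.
args = parser.parse_args(["--batch-size", "64"])
print(args.max_steps, args.batch_size)  # 100000 64
```

Any option not given on the command line falls back to the default registered by `add_argument`, which is why the README example works with no flags at all.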

5. Challenges and future work

Because TF2RL was originally meant to be run as scripts from the command line, `Trainer` is tightly bound to `argparse` and cannot be run in notebook environments such as [Google Colab](https://colab.research.google.com/). (Everything other than `Trainer` works without problems, so it is possible to write your own training loop from scratch using only the agents and models.) I would like to take a scalpel to `Trainer` and do something about this.
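A hand-written loop of that kind has the usual off-policy shape: act, store a transition in a replay buffer, and periodically train on a sampled minibatch. The sketch below uses stub classes in place of a real environment and a real tf2rl agent so it is self-contained; the `get_action`/`train` method names and the transition layout are assumptions for illustration, and a real loop would use `gym` and a buffer library such as cpprb instead:

```python
import random
from collections import deque


class StubEnv:
    """Stand-in for a gym.Env with scalar observations and actions."""
    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        obs, reward = random.random(), -abs(action)
        done = self.t >= 10  # short fixed-length episodes
        return obs, reward, done, {}


class StubPolicy:
    """Stand-in for a tf2rl agent; method names are illustrative."""
    def get_action(self, obs):
        return random.uniform(-1.0, 1.0)

    def train(self, batch):
        pass  # a real agent would run a gradient step on the minibatch here


env, policy = StubEnv(), StubPolicy()
buffer = deque(maxlen=10000)  # simple replay buffer (cpprb could be used instead)
batch_size = 32

obs = env.reset()
for step in range(200):
    act = policy.get_action(obs)
    next_obs, rew, done, _ = env.step(act)
    buffer.append((obs, act, next_obs, rew, done))  # store the transition
    obs = env.reset() if done else next_obs
    if len(buffer) >= batch_size:
        policy.train(random.sample(buffer, batch_size))  # train on a minibatch

print(len(buffer))  # 200
```

Since nothing here touches `argparse`, a loop like this runs fine in a notebook, which is the workaround until `Trainer` itself is reworked.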

In conclusion

For some reason, many of the people actively giving feedback appear to be Chinese. I hope the number of Japanese users will grow as well, so I would be very happy if you would try it out and give feedback (issues, PRs).
