This post assumes you have completed reinforcement learning up to part 10.

If you google "openai acrobot", Acrobot-v1 comes up. I wasn't sure what the difference between v1 and v0 is, so I looked into it before modifying the code. I opened userfolder/anaconda3/envs/chainer/lib/python3.7/site-packages/gym in VS Code and ran a full-text search. Searching for CartPole turned up both CartPole-v0 and CartPole-v1. Hmm? For Acrobot there is only Acrobot-v1.

I swapped the version in the CartPole code I made earlier to compare the two, and the difficulty seems to go up from v0 to v1.
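Instead of searching the folder by hand, the same check can probably be done from Python. The following is only a rough sketch: `group_versions` and `registered_ids` are names made up for this example, and how the registry is accessed depends on the installed Gym version (older releases have `registry.all()`, newer ones make the registry a dict), so treat that part as an assumption.

```python
import re
from collections import defaultdict


def group_versions(env_ids):
    """Map each base name to its sorted version numbers, e.g. {'CartPole': [0, 1]}."""
    versions = defaultdict(list)
    for env_id in env_ids:
        # Environment IDs look like "<name>-v<number>".
        match = re.fullmatch(r"(.+)-v(\d+)", env_id)
        if match:
            versions[match.group(1)].append(int(match.group(2)))
    return {name: sorted(nums) for name, nums in versions.items()}


def registered_ids():
    """Collect the IDs from Gym's registry (assumes gym is installed)."""
    import gym

    registry = gym.envs.registry
    # Assumption: older Gym exposes registry.all(); newer releases use a dict.
    specs = registry.all() if hasattr(registry, "all") else registry.values()
    return [spec.id for spec in specs]


if __name__ == "__main__":
    try:
        found = group_versions(registered_ids())
    except ImportError:
        print("gym is not installed in this environment")
    else:
        for name in ("CartPole", "Acrobot"):
            print(name, found.get(name))
```

With the IDs found in the search above, `group_versions` would list versions 0 and 1 for CartPole and only version 1 for Acrobot.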
I swapped in Acrobot as-is, but something was off...

Acrobot is a pendulum-like motion, and it counts as a success when you swing it up to a certain height. Since that takes many steps, let's set things up so that future value is not discounted too heavily. I set gamma to 0.99 and it seems to be working.
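To get a feel for why gamma matters here, this small sketch compares how much a reward 100 steps in the future still counts. The post does not say what gamma was before the change, so 0.95 is just a hypothetical comparison value, and the function names are made up for this example.

```python
def discount_weight(gamma, steps_ahead):
    """Weight applied to a reward received `steps_ahead` steps in the future."""
    return gamma ** steps_ahead


def discounted_return(rewards, gamma):
    """Sum of rewards[t] * gamma**t, accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # 0.95 is a hypothetical "before" value; 0.99 is the value used in the post.
    for gamma in (0.95, 0.99):
        print(gamma, discount_weight(gamma, 100))
```

With gamma at 0.99 a reward 100 steps away keeps roughly a third of its value, while at 0.95 it keeps well under one percent, so a success that only arrives late in the episode hardly registers.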
I am using DQN (Deep Q-Network). There are plenty of explanations out there, so it is worth googling for one.
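For reference, this is roughly where gamma enters DQN: the one-step target the network is trained towards. This is only a sketch of the textbook Q-learning target, not the code used in this post, and `td_target` is a name made up for the example.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a Q(s', a).

    next_q_values holds the estimated Q-values of every action in the next state.
    """
    if done:
        # No future value after the episode has ended.
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    # Hypothetical numbers, only to show the call.
    print(td_target(-1.0, [0.2, 0.5, 0.1], done=False))
```

The larger gamma is, the more of the next state's estimated value flows back into the current state.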