[ML-Agents] I tried machine learning using Unity and Python TensorFlow (v0.11β compatible)

Introduction

I couldn't find any Japanese articles covering v0.11.0, so I wrote this up as a memorandum.

This article is __for beginners__: a Unity beginner imitates one of the official ML-Agents tutorials and tries out __reinforcement learning__, one branch of machine learning.

Rollerball.gif __We will make something like this :arrow_up:__

It's for people who can already get around Unity but haven't tried machine learning yet. Rather than focusing on theory, the aim is to let you experience it hands-on.

*This article is current as of November 13, 2019.* ML-Agents is undergoing rapid version upgrades, so always check for the latest information. ~~[The book published last year](https://www.amazon.co.jp/Unity%E3%81%A7%E3%81%AF%E3%81%98%E3%82%81%E3%82%8B%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%83%BB%E5%BC%B7%E5%8C%96%E5%AD%A6%E7%BF%92-Unity-ML-Agents%E5%AE%9F%E8%B7%B5%E3%82%B2%E3%83%BC%E3%83%A0%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%9F%E3%83%B3%E3%82%B0-%E5%B8%83%E7%95%99%E5%B7%9D-%E8%8B%B1%E4%B8%80/dp/48624648181) was no help~~ (This year's transitions ⇒ January 2019: *v0.6* ➡ April: *v0.8* ➡ October: *v0.10* ➡ November: *v0.11*)

The key points, roughly

Here are the essential terms for doing machine learning in Unity: __"Academy", "Brain", and "Agent"__.

Basically, within the environment defined by the "Academy" in Unity, the "Brain" controls the actions taken by the "Agent". This time we will perform reinforcement learning via external TensorFlow (a Python framework), then load the generated neural network model into Unity and run it. (This is a simple tutorial, so we won't touch the Academy very much.)
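To preview the shape of what we are about to build, here is a skeleton using only the class and method names that appear later in this tutorial (bodies omitted; this is just an outline, not the finished code):

using MLAgents;

// The environment: inherits everything it needs from the Academy base class.
public class RollerAcademy : Academy { }

// The actor: overrides the hooks that the Brain (Behavior Parameters) drives.
public class RollerAgent : Agent
{
    public override void AgentReset() { /* reset positions for a new episode */ }
    public override void CollectObservations() { /* pass observations to the Brain */ }
    public override void AgentAction(float[] vectorAction, string textAction) { /* act and reward */ }
}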

ML-Agents

Major changes from version 0.10.0

__If you are new to ML-Agents, you can skip this section.__ I had been using v0.8.x and v0.9.x and was puzzled at first because *Brain Parameters* no longer exists; if you only follow this article, though, you should be fine.

- *Broadcast Hub* has been abolished.
- *Brain Scriptable Objects* have been abolished ⇒ replaced by the *Behavior Parameters* component.
- Major setup changes to *Visual Observation*.
- The gRPC definitions have been renewed.
- Online BC training has been abolished.

Execution environment

  • Windows 10
  • Unity 2019.1.4f1
  • ML-Agents Beta 0.11.0
  • Python 3.6(Anaconda)

Preparation

Please install the following first.

- Unity (any version from 2017.4 onward should be fine)
- The ml-agents repository (download *ml-agents-master* from GitHub)
- Anaconda (Python 3.6)

Project creation

    1. Launch Unity and create a project called Roller Ball.
    2. Open *File > Build Settings... > Player Settings... > Other Settings > Configuration* and make sure that *Scripting Runtime Version* is *.NET 4.x Equivalent* and that *Api Compatibility Level* is *.NET 4.x*. YHN.png
    3. Load the ML-Agents assets into your project. They are located in the downloaded ml-agents-master\UnitySDK\Assets; drag and drop the ML-Agents folder into your project window. ooop.png

Stage creation

Creating a floor

- Create a plane with *3D Object > Plane*.
- Name the created *Plane* "Floor".
- Set Floor's *Transform* to:
  • Position = (0, 0, 0)
  • Rotation = (0, 0, 0)
  • Scale = (1, 1, 1)
- Play with the *Element* entries under *Inspector > Materials* to make it look however you like. ppppp.png

Creating a box (Target)

- Create a cube with *3D Object > Cube*.
- Name the created *Cube* "Target".
- Set Target's *Transform* to:
  • Position = (3, 0.5, 3)
  • Rotation = (0, 0, 0)
  • Scale = (1, 1, 1)
- As with Floor, change the appearance to your liking. box.png

Creating a soccer ball (Agent)

- Create a sphere with *3D Object > Sphere*.
- Name the created *Sphere* "RollerAgent".
- Set RollerAgent's *Transform* to:
  • Position = (0, 0.5, 0)
  • Rotation = (0, 0, 0)
  • Scale = (1, 1, 1)
- As before, change the appearance to your liking. If you want it to look like a ball, the CheckerSquare material works well.
- Add a *Rigidbody* via *Add Component*. kkkkk.png

Creating an empty object (Academy)

- Create an empty *GameObject* with *Create Empty*.
- Name the created *GameObject* `Academy`. oooiiiii.png

Next, we will write the contents in C#.

Implementation of Academy (Implement an Academy)

- With `Academy` selected in the *Hierarchy* window, use *Add Component > New Script* to create a script named `RollerAcademy.cs`.
- Rewrite the contents of `RollerAcademy.cs` as follows. You can erase the auto-generated contents.

RollerAcademy.cs


using MLAgents;
public class RollerAcademy : Academy { }

With this description, basic functionality such as the "observe, decide, act" cycle (omitted here) is inherited from the *Academy* class by the *RollerAcademy* class, so two lines are enough.
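For reference, larger environments can customize the Academy by overriding its virtual hooks. A minimal sketch follows; the hook names are my recollection of the v0.11 API, so treat them as an assumption and verify against the Academy class in your ML-Agents package. None of this is needed for this tutorial.

using MLAgents;

public class RollerAcademy : Academy
{
    // NOTE: method names assumed for v0.11; check your package's Academy class.
    public override void InitializeAcademy() { /* one-time environment setup */ }
    public override void AcademyReset() { /* environment-wide reset logic */ }
    public override void AcademyStep() { /* per-step environment logic */ }
}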

Implementation of Agent (Implement an Agent)

Select RollerAgent in the *Hierarchy* window and create a script named `RollerAgent.cs` with *Add Component > New Script*.

Inheriting the *Agent* base class

Rewrite the contents of RollerAgent.cs as follows.

RollerAgent.cs


using MLAgents;
public class RollerAgent : Agent{ }

As with *Academy*, it imports the *MLAgents* namespace and specifies *Agent* as the base class to inherit from.

This is the basic procedure for incorporating __ML-Agents into Unity__. Next, we will add the mechanism that makes the ball charge toward the box through reinforcement learning.

Initialization and Resetting

Rewrite the contents of RollerAgent.cs as follows.

RollerAgent.cs


using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    void Start(){
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;
    public override void AgentReset()
    {
        if (this.transform.position.y < 0)
        {
            // Reset angular velocity and velocity
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            // Return the agent to its initial position
            this.transform.position = new Vector3(0, 0.5f, 0);
        }
        // Relocate the target
        Target.position = new Vector3(Random.value * 8 - 4, 0.5f,
                                      Random.value * 8 - 4);
    }

}

Here, we process:

- __relocating the Target__ for the next episode when the RollerAgent reaches the box (Target)
- __returning the RollerAgent__ to its starting position when it falls off the floor (Floor)

Rigidbody is a component used by Unity's physics simulation; here it is used to drive the agent. The values of *Position, Rotation, Scale* are recorded in the Transform. By declaring Target as public, we can pass the *Target*'s Transform in from the *Inspector*.

Observing the Environment

Add the following method inside the RollerAgent class in RollerAgent.cs.

public override void CollectObservations()
{
    // Positions of the target and the agent
    AddVectorObs(Target.position);
    AddVectorObs(this.transform.position);

    // Agent velocity
    AddVectorObs(rBody.velocity.x);
    AddVectorObs(rBody.velocity.z);
}

Here, we __collect the observed data into a feature vector__.

The 3D coordinates of the *Target* and the *Agent*, plus the agent's *x* and *z* velocities, form an 8-dimensional vector in total that is passed to the neural network. ~~"8 dimensions" just sounds cool~~
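To see where the 8 comes from, here is the layout as a plain reading of the code above (this number must match the *Vector Observation Space Size* we set later):

// Observation vector layout (8 floats in total):
//   Target.position          -> 3 (x, y, z)
//   this.transform.position  -> 3 (x, y, z)
//   rBody.velocity.x         -> 1
//   rBody.velocity.z         -> 1
// Total: 8, matching Vector Observation Space Size = 8.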

Actions and Rewards

Add the following processing, the `AgentAction()` function, to `RollerAgent.cs`.

public float speed = 10f;
public override void AgentAction(float[] vectorAction, string textAction)
{
    // Action
    Vector3 controlSignal = Vector3.zero;
    controlSignal.x = vectorAction[0];
    controlSignal.z = vectorAction[1];
    rBody.AddForce(controlSignal * speed);

    // Reward
    // Distance from the ball (agent) to the box (target)
    float distanceToTarget = Vector3.Distance(this.transform.position,
                                              Target.position);

    // When the box (target) is reached
    if (distanceToTarget < 1.42f)
    {
        // Grant the reward and finish the episode
        SetReward(1.0f);
        Done();
    }

    // If the agent falls off the floor
    if (this.transform.position.y < 0)
    {
        Done();
    }
}

Here, the learning algorithm processes the __"action"__, which reads the two continuous values as forces applied in the X and Z directions to move the agent, and the __"reward"__, which is granted when the agent reaches the box safely and withheld when it falls.

The `AddForce` function applies a physical force to an object that has a *Rigidbody* component in order to move it. Only when the computed distance to the target falls below the threshold for judging arrival is the reward granted and the episode reset.

To learn well in more complicated situations, it is often effective to hand out punishments (negative rewards) as well as rewards. ~~(In v0.5.x the reward was -1 when the agent fell off the floor, but apparently that was judged unnecessary in the latest version)~~
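As an illustration only (not part of this tutorial's final script), a penalty on falling could look like the sketch below; the value -1.0f is borrowed from the old v0.5.x behavior mentioned above and is an assumption, not something the current official example uses.

// If the agent falls off the floor, punish it and end the episode.
if (this.transform.position.y < 0)
{
    SetReward(-1.0f);  // assumed penalty value, for illustration only
    Done();
}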

In summary, RollerAgent.cs looks like this:

RollerAgent.cs


using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    void Start(){
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;
    public override void AgentReset()
    {
        if (this.transform.position.y < 0)
        {
            // Reset angular velocity and velocity
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            // Return the agent to its initial position
            this.transform.position = new Vector3(0, 0.5f, 0);
        }
        // Relocate the target
        Target.position = new Vector3(Random.value * 8 - 4, 0.5f,
                                      Random.value * 8 - 4);
    }

    public override void CollectObservations()
    {
        // Positions of the target and the agent
        AddVectorObs(Target.position);
        AddVectorObs(this.transform.position);

        // Agent velocity
        AddVectorObs(rBody.velocity.x);
        AddVectorObs(rBody.velocity.z);
    }

    public float speed = 10f;
    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Action
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * speed);

        // Reward
        // Distance from the ball (agent) to the box (target)
        float distanceToTarget = Vector3.Distance(this.transform.position,
                                                  Target.position);

        // When the box (target) is reached
        if (distanceToTarget < 1.42f)
        {
            // Grant the reward and finish the episode
            SetReward(1.0f);
            Done();
        }

        // If the agent falls off the floor
        if (this.transform.position.y < 0)
        {
            Done();
        }
    }
}

Finish on the Unity editor

- Select RollerAgent in the *Hierarchy* window and change two items on the `RollerAgent (Script)` component:
  • Decision Interval = 10
  • Target = Target (Transform)

sct.png

- Add *Behavior Parameters* with *Add Component* and change the settings as follows:

  • Behavior Name = RollerBallBrain
  • Vector Observation Space Size = 8
  • Vector Action Space Type = Continuous
  • Vector Action Space Size = 2

Be.png

Also, according to the [official documentation](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Create-New.md), training with the default parameters takes on the order of 300,000 steps. Our task is not that complicated, so let's rewrite a couple of parameters to bring the number of trials down to under 20,000 steps.

- Open trainer_config.yaml under *ml-agents-master-0.11 > config* with an editor (VS Code or Notepad) and rewrite the values of the following items.

aaaa.png

batch_size: 10
buffer_size: 100
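For reference, a minimal sketch of how these values can sit in a section keyed by the Behavior Name we set earlier; the exact section layout is an assumption on my part, and the two values above are what matter:

RollerBallBrain:
    batch_size: 10
    buffer_size: 100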

Now you are ready to train.

Manual test

We're almost there. Before reinforcement learning, let's check manually that the environment built so far works properly. Add the following method to the RollerAgent class in RollerAgent.cs.

public override float[] Heuristic()
{
    // Map keyboard input onto the two continuous actions
    var action = new float[2];
    action[0] = Input.GetAxis("Horizontal");
    action[1] = Input.GetAxis("Vertical");
    return action;
}

Horizontal accepts the horizontal input axis and Vertical accepts the vertical input axis (Input.GetAxis returns values between -1 and 1).

You can now move the agent with the "W", "A", "S", "D" or arrow keys.

Finally, in the RollerAgent's *Inspector*, tick the *Use Heuristic* checkbox under *Behavior Parameters*.

he.png

Press Play to run it. If you can confirm the agent responds to key input, the test is a success.

Learn with TensorFlow

Now, let's move on to the learning step.

Environment construction / library installation

First, launch Anaconda Prompt. You can find it right away by searching from the Start menu (Win key). an.png

conda create -n ml-agents python=3.6

Enter this to build a virtual environment. [^1] on.png

Proceed([y]/n)?

You will be asked whether to proceed with the installation, so enter y. Next,

activate ml-agents

Enter this to switch into the virtual environment. [^2] Make sure that (ml-agents) now appears at the start of the command line. aaaa.png

cd <ml-agents folder>

Move there. [^3]

pip install mlagents

This installs the library that ML-Agents uses (it takes a few minutes). The installation also pulls in dependencies such as TensorFlow and Jupyter.

After a while, if a screen like this appears, you are OK. wewe.png

cd <ml-agents folder>\ml-agents-envs

Move there.

pip install -e .

to install the package. konnna.png If the screen looks like this, you are OK. Next,

cd <ml-agents folder>\ml-agents

Move there.

pip install -e .

to install the package. www.png

This completes the preparation on the Python side.
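As a quick sanity check before moving on (my own habit, not a step from the original procedure), you can confirm the CLI was installed into the virtual environment:

mlagents-learn --help

If the help text prints, the Python side is ready.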

__:collision: [Note]: The TensorFlowSharp plugin is not used in v0.6.x and later.__ If you have been following older books, we recommend recreating a fresh virtual environment.

Up to ML-Agents v0.5.0, TensorFlowSharp was used to communicate with Python, but do not use it with the latest version. If you do, the following error occurs.

No model was present for the Brain 3DBallLearning.
UnityEngine.Debug:LogError(Object)
MLAgents.LearningBrain:DecideAction() (at Assets/ML-Agents/Scripts/LearningBrain.cs:191)
MLAgents.Brain:BrainDecideAction() (at Assets/ML-Agents/Scripts/Brain.cs:80)
MLAgents.Academy:EnvironmentStep() (at Assets/ML-Agents/Scripts/Academy.cs:601)
MLAgents.Academy:FixedUpdate() (at Assets/ML-Agents/Scripts/Academy.cs:627)

(Source)


Reinforcement Learning

Now, at last, the training begins. The dreamed-of AI experience is just around the corner. Let's do our best.

cd <ml-agents folder>

Enter this to move to the downloaded folder.

mlagents-learn config/trainer_config.yaml --run-id=firstRun --train

Run this. [^4] aaaaw.png At the bottom of the command line, __INFO:mlagents.envs:Start training by pressing the Play button in the Unity Editor.__ is displayed (go back to the Unity editor and press the Play button to start training).

Go back to the Unity editor, uncheck *Use Heuristic* in __*Behavior Parameters*__, and press the :arrow_forward: Play button.

When the ball starts chasing the box, training has started normally.

__If you do not press the Play button for a while, a timeout error occurs; if so, run the same command again.__

A log is written to the console every 1,000 steps. If you want to stop partway through, you can interrupt with Ctrl + C. (If you deliberately stop early, you can make a "weak AI".) おー!!!!.png

__Step__ is the number of training steps, __Mean Reward__ is the average reward earned, and __Std of Reward__ is the standard deviation of the reward (a measure of how much it varies).

After training, a RollerBallBrain.nn file is created under <ml-agents folder>\models\<id name~>.

hyhy.png

Learning reflection

Now let's try out the generated neural network model.

Copy the RollerBallBrain.nn file from earlier into the *Assets* folder of Unity's Project window. (The location can be anywhere inside the project.) wwqqqq.png

Then click the :radio_button: button at the far right of the *Model* item in the RollerAgent's *Inspector* and select the imported .nn file. (Be careful not to mix files up if another .nn file has the same name.)

Also, if *Use Heuristic* in *Behavior Parameters* is left checked, it will not work properly. __Be sure to uncheck it after the manual test.__ aaaqqqq.png

Now let's press: arrow_forward: Play.

__If the ball starts chasing the box on its own, you have succeeded.__ 30fps.gif

(Bonus) Observe the transition graph with TensorBoard

In the Anaconda Prompt, run the following from the <ml-agents folder> (the summaries folder is created in the directory where you ran mlagents-learn):

tensorboard --logdir=summaries --port=6006

If you open [localhost:6006](http://localhost:6006/) in your browser, you can watch the learning progress as graphs. ほほう、TensorBoardですか。(ニチャア・・・.png

Summary

- If you can write more serious C#, you will be able to fine-tune the algorithm yourself.
- In reinforcement learning, the AI's skill can be graded (weak, medium, strong, ...) by the number of training steps.
- Versions are renewed frequently, so __information goes stale quickly__.
- ~~Learning is far faster than a human's. The power of science is amazing!!~~

It is a convenient world where even a beginner can use ready-made assets to walk through simple machine learning in a day. How was it, actually trying it out? I hope it gives you an opportunity to become interested in machine learning.

If you notice any questionable phrasing or mistakes, I would appreciate it if you pointed them out. Also, if you found this article helpful, a "like" would be __encouraging__.

Thank you for reading.

reference

Below are the articles by those who came before me, which were a great help. I would like to take this opportunity to express my __gratitude__.

- Unity-Technologies official documentation (GitHub)
- ml-agents Migration Guide (GitHub)
- [Unity: How to use ML-Agents in September 2019 (ver0.9.0/0.9.1/0.9.2)](https://www.fast-system.jp/unity-ml-agents-version-0-9-0-howto/)
- [Unity] I tried a tutorial on reinforcement learning (ML-Agents v0.8.1)
- [Create a new learning environment with Unity's ML-Agents (0.6.0a version)](http://am1tanaka.hatenablog.com/entry/2019/01/18/212915#%E5%AD%A6%E7%BF%92%E5%8A%B9%E6%9E%9C%E3%82%92%E9%AB%98%E3%82%81%E3%82%8B%E3%81%8A%E3%81%BE%E3%81%91)

[^1]: You can change the "ml-agents" part to any name you like.
[^2]: Activate with the virtual environment name you set.
[^3]: The directory where *ml-agents-master* was downloaded in Preparation.
[^4]: You can change the "firstRun" part to any name you like.
