[PYTHON] I tried running machine learning (Object Detection) in TouchDesigner

Introduction

While studying machine learning, I wanted to run a model I had trained inside TouchDesigner, so I gave it a try.

When you actually run it, it looks like this! (The video uses a different model from the one in this article, but the idea is the same.)

https://www.youtube.com/watch?v=ycBvOQUzisU

The sample repository is here!

By the way, this article assumes Windows, but it also works on Mac!

About Python in TouchDesigner

TouchDesigner lets you use Python as an extension (internally it appears to run Python 3.7.2 in my environment).

You can use the Python interpreter from Dialogs > Textport and DATs.

Out of the box, the standard library, OpenCV, NumPy, and so on are available.

touch1.png

This time, on top of these, I'd like to add a library commonly used for machine learning and perform Object Detection inside TouchDesigner.

To do this, specify the path to the external library under "Add External Python to Search Path" in Edit > Preferences so that TouchDesigner can use it.

About Object Detection

I won't explain Object Detection in detail, but it detects objects such as people and things, as in the image below (drawing a box around each object to indicate what it is).

result.png

This time I'll use MobileNet v2-SSD, which runs even on mobile devices, in the hope of squeezing out more FPS by choosing as light an algorithm as possible!

As Python libraries (frameworks) for machine learning (mainly DNNs), there are TensorFlow / Keras, PyTorch, MXNet, ONNX, and so on.

I usually work with **TensorFlow / Keras** (and sometimes PyTorch), but this time I'll use ONNX. (To be exact, an inference environment called ONNX Runtime.)

The reason is that in my environment, when I use TensorFlow or PyTorch from Python inside TouchDesigner, TouchDesigner crashes without even throwing an error message.

I then tried ONNX, which I had never used before, and it worked, so that's what I'll use! (I don't know about MXNet...)

As for the model itself, I prepared one created with TensorFlow and exported to ONNX format. (It's included in the repository.)
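For reference, a TensorFlow model can commonly be exported to ONNX with the tf2onnx package. A minimal sketch (the paths, opset, and output name here are placeholders, not necessarily the exact command I used):

$pip install tf2onnx
$python -m tf2onnx.convert --saved-model path/to/saved_model --opset 11 --output ssdlite_mobilenetv2.onnx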

Work environment

1. Environment construction

First, let's set up a working environment and install ONNX Runtime!

1-1. Create a virtual environment with Anaconda

$conda create -n touchdesigner python=3.7.2

1-2. Install ONNX in the virtual environment

$source activate touchdesigner
$pip install onnxruntime==1.1.0
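Just to be sure, you can verify the installation from the activated environment:

$python -c "import onnxruntime; print(onnxruntime.__version__)"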

1-3. Add the virtual environment's path to TouchDesigner

Click Edit > Preferences to open the Preferences dialog.

Check "Add External Python to Search Path" as shown in the image below, and pass the path to the virtual environment of Anaconda in "Python 64-bit Module Path".

touch2.png

In my environment, the path is as follows.

C:/Users/T-Sumida/Anaconda3/envs/touchdesigner/Lib/site-packages

Please use this as a reference when setting the path in your own environment!
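As a quick sanity check (my habit, not a required step), you can open Dialogs > Textport and DATs and confirm that TouchDesigner can actually see the library:

# Run in the TouchDesigner Textport
import sys
print(sys.path)  # the site-packages path set above should appear here

import onnxruntime
print(onnxruntime.__version__)  # should print 1.1.0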

2. Preparing ONNX

Now let's check that ONNX works in TouchDesigner!

Launch the Python interpreter from Dialogs > Textport and DATs, paste the following code into it, and run it.

**Note: Please set the path to the model and the path of the image you want to test appropriately.**

import cv2
import numpy as np
import onnxruntime

#coco dataset class number: class name
coco_classes = {
    1: 'person',
    2: 'bicycle',
    3: 'car',
    4: 'motorcycle',
    5: 'airplane',
    6: 'bus',
    7: 'train',
    8: 'truck',
    9: 'boat',
    10: 'traffic light',
    11: 'fire hydrant',
    12: 'stop sign',
    13: 'parking meter',
    14: 'bench',
    15: 'bird',
    16: 'cat',
    17: 'dog',
    18: 'horse',
    19: 'sheep',
    20: 'cow',
    21: 'elephant',
    22: 'bear',
    23: 'zebra',
    24: 'giraffe',
    25: 'backpack',
    26: 'umbrella',
    27: 'handbag',
    28: 'tie',
    29: 'suitcase',
    30: 'frisbee',
    31: 'skis',
    32: 'snowboard',
    33: 'sports ball',
    34: 'kite',
    35: 'baseball bat',
    36: 'baseball glove',
    37: 'skateboard',
    38: 'surfboard',
    39: 'tennis racket',
    40: 'bottle',
    41: 'wine glass',
    42: 'cup',
    43: 'fork',
    44: 'knife',
    45: 'spoon',
    46: 'bowl',
    47: 'banana',
    48: 'apple',
    49: 'sandwich',
    50: 'orange',
    51: 'broccoli',
    52: 'carrot',
    53: 'hot dog',
    54: 'pizza',
    55: 'donut',
    56: 'cake',
    57: 'chair',
    58: 'couch',
    59: 'potted plant',
    60: 'bed',
    61: 'dining table',
    62: 'toilet',
    63: 'tv',
    64: 'laptop',
    65: 'mouse',
    66: 'remote',
    67: 'keyboard',
    68: 'cell phone',
    69: 'microwave',
    70: 'oven',
    71: 'toaster',
    72: 'sink',
    73: 'refrigerator',
    74: 'book',
    75: 'clock',
    76: 'vase',
    77: 'scissors',
    78: 'teddy bear',
    79: 'hair drier',
    80: 'toothbrush'
}

#Load the model
session = onnxruntime.InferenceSession("Specify the path to the ONNX model")

#Load image
img = cv2.imread("Specify the path of the image you want to try")

#OpenCV reads images as BGR, so convert to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# img.shape is (height, width, channels)
height, width = img.shape[0], img.shape[1]

# add a batch dimension: (1, height, width, 3), uint8
img_data = np.expand_dims(img, axis=0)

#Model inference preparation
input_name = session.get_inputs()[0].name   # 'image'
output_name_boxes = session.get_outputs()[0].name     # 'boxes'
output_name_classes = session.get_outputs()[1].name   # 'classes'
output_name_scores = session.get_outputs()[2].name    # 'scores'
output_name_num = session.get_outputs()[3].name       # 'number of detections'

#inference
outputs_index = session.run(
    [output_name_num, output_name_boxes, output_name_scores, output_name_classes],
    {input_name: img_data}
)

#Receive results
output_num = outputs_index[0] #Number of detected objects
output_boxes = outputs_index[1] #A box showing the location of the detected object
output_scores = outputs_index[2] #Prediction probability of detected object
output_classes = outputs_index[3] #Class number of detected object

#Prediction probability threshold
threshold = 0.6

#Convert inference results to images
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
for detection in range(0, int(output_num[0])):
    if output_scores[0][detection] > threshold:
        classes = output_classes[0][detection]
        boxes = output_boxes[0][detection]   # normalized [ymin, xmin, ymax, xmax]
        scores = output_scores[0][detection]
        top = boxes[0] * height
        left = boxes[1] * width
        bottom = boxes[2] * height
        right = boxes[3] * width

        # clip the box to the image boundaries
        top = max(0, top)
        left = max(0, left)
        bottom = min(height, bottom)
        right = min(width, right)
        img = cv2.rectangle(img, (int(left), int(top)), (int(right), int(bottom)), (0, 0, 255), 3)
        img = cv2.putText(
            img, "{}: {:.2f}".format(coco_classes[int(classes)], scores),
            (int(left), int(top)),
            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, cv2.LINE_AA
        )

cv2.imshow('img', img)
k = cv2.waitKey(0)

When you execute it, the following image will be drawn.

result.png

If this works fine, it's okay!
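By the way, if you use your own ONNX model, the input/output names and their order may differ from mine. You can inspect them with standard ONNX Runtime calls (assuming the `session` created above):

# List the model's inputs and outputs with their shapes and types
for i in session.get_inputs():
    print("input:", i.name, i.shape, i.type)
for o in session.get_outputs():
    print("output:", o.name, o.shape, o.type)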

3. Writing Python in TouchDesigner

Next, let's make it work in a TouchDesigner patch.

Referring to this article (thank you!), I'll run Object Detection on the input from a camera.

https://qiita.com/komakinex/items/5b84b88d537d393afc98

The project itself is very simple: it receives the input from "Video Device In" via "OP Execute" and draws the result with OpenCV functions.

The contents of "OP Execute" are as follows.

# me - this DAT.
# changeOp - the operator that has changed
#
# Make sure the corresponding toggle is enabled in the OP Execute DAT.

import cv2
import numpy as np
import onnxruntime

#Same as the test above, so omitted here (copy it from the code above when using this)
coco_classes = {
    1: 'person',
    2: 'bicycle',
    ...
}

session = onnxruntime.InferenceSession("C:/Users/TomoyukiSumida/Documents/Hatena/ObjetDetection4TD/ssdlite_mobilenetv2.onnx")

def onPreCook(changeOp):
	return

def onPostCook(changeOp):
	# grab the frame from the TOP as a numpy array (float32 RGBA, values 0.0-1.0)
	frame = changeOp.numpyArray(delayed=True)
	arr = frame[:, :, 0:3]   # drop the alpha channel
	arr = arr * 255          # scale 0.0-1.0 -> 0-255
	arr = arr.astype(np.uint8)
	arr = np.flipud(arr)     # TOP pixel data is stored bottom-up, so flip vertically
	height, width = arr.shape[0:2]
	image_data = np.expand_dims(arr, axis=0)

	#Model inference preparation
	input_name = session.get_inputs()[0].name   # 'image'
	output_name_boxes = session.get_outputs()[0].name     # 'boxes'
	output_name_classes = session.get_outputs()[1].name   # 'classes'
	output_name_scores = session.get_outputs()[2].name    # 'scores'
	output_name_num = session.get_outputs()[3].name       # 'number of detections'

	#inference
	outputs_index = session.run([output_name_num, output_name_boxes,
                                output_name_scores, output_name_classes],
                                {input_name: image_data})

	#Receive results
	output_num = outputs_index[0] #Number of detected objects
	output_boxes = outputs_index[1] #A box showing the location of the detected object
	output_scores = outputs_index[2] #Prediction probability of detected object
	output_classes = outputs_index[3] #Class number of detected object

	#Prediction probability threshold
	threshold = 0.6

	#Convert inference results to images
	for detection in range(0, int(output_num[0])):
		if output_scores[0][detection] > threshold:
			classes = output_classes[0][detection]
			boxes = output_boxes[0][detection]   # normalized [ymin, xmin, ymax, xmax]
			scores = output_scores[0][detection]
			top = boxes[0] * height
			left = boxes[1] * width
			bottom = boxes[2] * height
			right = boxes[3] * width

			# clip the box to the image boundaries
			top = max(0, top)
			left = max(0, left)
			bottom = min(height, bottom)
			right = min(width, right)
			arr = cv2.rectangle(arr, (int(left), int(top)), (int(right), int(bottom)), (0, 0, 255), 3)
			arr = cv2.putText(
				arr, "{}: {:.2f}".format(coco_classes[int(classes)], scores),
				(int(left), int(top)),
				cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, cv2.LINE_AA)
	arr = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)
	cv2.imshow('img', arr)
	return

def onDestroy():
	return

def onFlagChange(changeOp, flag):
	return

def onWireChange(changeOp):
	return

def onNameChange(changeOp):
	return

def onPathChange(changeOp):
	return

def onUIChange(changeOp):
	return

def onNumChildrenChange(changeOp):
	return

def onChildRename(changeOp):
	return

def onCurrentChildChange(changeOp):
	return

def onExtensionChange(changeOp, extension):
	return


Once this is written, set "Video Device In" in the Monitor OPs parameter of "OP Execute" and turn on Post Cook.

**It works!**

example.png

In conclusion

This time I ran Object Detection in TouchDesigner using ONNX. I had trouble because TensorFlow wouldn't work for some reason, but I'm glad I managed to build something that works.

However, running Object Detection (and machine learning in general) inside TouchDesigner has the following disadvantages:

- Since TouchDesigner is single-threaded, it caps the FPS of the whole project
- I have only confirmed that models converted to ONNX can be used (some other good algorithms are unavailable)

Therefore, it may be better to run a Python process behind TouchDesigner and send the information over with OSC (if TouchDesigner gains multithreading in the future, this may change).
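As a rough illustration of that approach, here is a minimal sketch using the python-osc package (the port number and OSC address are hypothetical; on the TouchDesigner side you would receive the values with an OSC In CHOP or DAT):

# External Python process: run detection here and send results to TouchDesigner via OSC.
# Assumes `pip install python-osc`; 127.0.0.1:9000 and "/detection" are placeholders.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)

# For each detected object, send class id, score, and normalized box
detections = [(1, 0.92, [0.1, 0.2, 0.5, 0.6])]  # dummy result for illustration
for class_id, score, box in detections:
    client.send_message("/detection", [int(class_id), float(score)] + box)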

I used my own ONNX model this time, but you can also download model files from onnx/models and use them. (I downloaded Mask R-CNN from there and confirmed that it works.)

In addition, high-speed inference using the GPU is also possible. (The model in this article runs fast enough even on a CPU.)

Instead of

$pip install onnxruntime==1.1.0

run

$pip install onnxruntime-gpu==1.1.0

and inference will be done on the GPU. It could be fun to try out all sorts of models.
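If you want to confirm which device ONNX Runtime ended up on, a quick check works (this uses the standard onnxruntime API):

import onnxruntime
# 'GPU' if onnxruntime-gpu is installed and a CUDA device is available, otherwise 'CPU'
print(onnxruntime.get_device())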

I haven't had much time for the expressive (visual) side yet, but I'd like to explore that area next!
