In the previous article, I summarized how to set up an environment for executing Python code from C++: Run Python in C++ on Visual Studio 2017. That article alone probably doesn't convey the real benefit of calling Python from C++. So this time, I would like to run YOLO v3, which detects objects with deep learning. Deep learning models can be used directly in C++, but the tooling is not as mature as Python's at the moment (sadly).
- OS: Windows 10 64bit
- CPU: Intel i3-8100
- GPU: NVIDIA GeForce GTX 1050 Ti
- Visual Studio 2017
- C++
- Python 3.7.3

(The specs are a bit higher than the environment in the previous article.)
The following two environment setups are prerequisites. Please complete both before working through this article.

- For how to build an environment that calls Python from C++, refer to: Run Python in C++ on Visual Studio 2017
- For how to build an environment that lets Python use the GPU, refer to: Setting up tensorflow-gpu
I will proceed in the following order. If you like, follow along. Readers coming from the previous article can skip some of the steps.
We will place the files directly under the C drive.
URL
git clone https://github.com/yusa0827/200121_Cplus2_with_Python
There are two ways to get the files: git clone as above, or downloading the ZIP from the repository page.
Looking inside the downloaded folder, there is a .sln file, which contains the sample program. Double-click it, or right-click ⇒ Open with ⇒ Microsoft Visual Studio 2017, and the sample program opens in Visual Studio. This project was built with the 2017 version. It can probably be used in 2019 as well, but within 2019 you may have to retarget it to the 2017 toolset.
Next, adjust the project settings and paths (Project → Properties):

1. Solution configuration and solution platform
   - Debug → Release
   - x86 → x64
2. C++ → General → Additional Include Directories
   - C:\boost_1_70_0
   - C:\Users\○○\AppData\Local\Programs\Python\Python37\include ← adjust to your own environment
3. C++ → Code Generation → Runtime Library
   - Multi-threaded (/MT)
4. Linker → General → Additional Library Directories
   - C:\boost_1_70_0\stage\lib\x64
   - C:\Users\○○\AppData\Local\Programs\Python\Python37\libs ← adjust to your own environment
The execution result is as follows.
If that doesn't work, check your environment path.
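Before moving on, it can help to confirm that the Boost.Python settings themselves are correct, independent of the sample. The following is a minimal smoke-test sketch of my own (not part of the cloned sample): it just evaluates an expression in the embedded interpreter. If it builds and prints 2, the include and library paths above are fine.

cpp
#define BOOST_PYTHON_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    // Start the embedded Python interpreter
    Py_Initialize();
    // Use __main__'s namespace so builtins are available to eval
    py::object main_ns = py::import("__main__").attr("__dict__");
    // Evaluate a trivial expression and convert the result to a C++ int
    int result = py::extract<int>(py::eval("1 + 1", main_ns));
    std::cout << "1 + 1 = " << result << std::endl;  // expect 2
    return 0;
}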
YOLO v3 is a deep-learning-based object detection method known for its excellent real-time performance. This time, I will use the widely used Keras port. Many people have explained how to use it, so a quick Google search will get you there in one shot, but I will describe the steps here anyway.
URL
git clone https://github.com/qqwweee/keras-yolo3.git
Move into that directory (`cd keras-yolo3`). The required Python modules are TensorFlow, Keras, Matplotlib, Pillow, and OpenCV (installed as opencv-python). If you haven't installed them yet, install them with pip (e.g. `pip install tensorflow-gpu keras matplotlib pillow opencv-python`).
Next, download the trained weights. You can also download them directly from the following URL without using wget. After downloading, put the file in the keras-yolo3 folder. File name: yolov3.weights (size: 237 MB).
URL
wget https://pjreddie.com/media/files/yolov3.weights
Convert the Darknet weights to the Keras format. Enter the following command at the command prompt.
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
The execution result is as follows.
:
:
conv2d_75 (Conv2D) (None, None, None, 2 65535 leaky_re_lu_72[0][0]
==================================================================================================
Total params: 62,001,757
Trainable params: 61,949,149
Non-trainable params: 52,608
__________________________________________________________________________________________________
None
Saved Keras model to model_data/yolo.h5
Read 62001757 of 62001757.0 from Darknet weights.
C:\demo_Cplus2_Py_YOLOv3\keras-yolo3>
This time we will use a webcam. You could use a video file instead, but I chose the camera because it was easier to verify.
Make a small edit to yolo.py, the main object-detection code, around line 173.
yolo.py
import cv2
vid = cv2.VideoCapture(video_path)
# ↓ change to the following
import cv2
#vid = cv2.VideoCapture(video_path)
vid = cv2.VideoCapture(0)
Passing 0 as the argument of VideoCapture selects the camera device. Edit the code and run YOLO v3.
Execution code.
cmd
python yolo_video.py
Execution result. Doraemon is apparently a sports ball.
I was able to confirm that YOLO v3 works. Next, I will work out how to call this Python code from C++.
Calling YOLO v3 from C++ requires a few tricks. One of them is to hold the object created from the YOLO class on the C++ side. Normally object detection is done entirely in Python, but if you don't keep such an object, TensorFlow has to be launched every time you detect an object, which causes a large delay; on my PC it takes about 15 seconds to launch TensorFlow. By creating the YOLO object in advance, you avoid paying that cost on every call, as the sketch below illustrates.
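Here is a minimal timing sketch of that idea. It assumes the two helper functions object_YOLOv3 and insert_object_YOLOv3 that yolo.py will define later in this article; std::chrono is only there to make the one-time cost visible.

cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <chrono>
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    Py_Initialize();
    py::object mod = py::import("yolo").attr("__dict__");
    auto make_yolo   = mod["object_YOLOv3"];        // initialization function
    auto detect_once = mod["insert_object_YOLOv3"]; // per-frame detection

    // One-time cost: constructing the YOLO object loads the Keras model
    // and starts TensorFlow (about 15 s on my PC).
    auto t0 = std::chrono::steady_clock::now();
    auto yolo = make_yolo(py::object());
    auto t1 = std::chrono::steady_clock::now();
    std::cout << "init took "
              << std::chrono::duration<double>(t1 - t0).count() << " s\n";

    // Per-frame cost: only inference runs inside the loop.
    while (true) {
        detect_once(yolo);
    }
}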
Modify the existing yolo.py.
Detect objects with a webcam.
Prepare the cloned Visual Studio project folder and the keras-yolo3 folder. Copy only the keras-yolo3 files required for object detection into the Visual Studio project folder. The layout is as follows; modify the files marked with 〇. The files required by YOLO v3 are placed directly under the C drive.
C drive ── model_data
│            ├── yolo.h5             ← from the model_data folder of keras-yolo3
│            ├── yolo_anchors.txt    ← from the model_data folder of keras-yolo3
│            ├── coco_classes.txt    ← from the model_data folder of keras-yolo3
│            └── FiraMono-Medium.otf ← from the font folder of keras-yolo3
│
└─ 200121_Cplus2_with_Python
     ├── test_Cplus2_with_Python
     │    ├── test_Cplus2_with_Python.cpp 〇
     │    ├── x64
     │    └── others
     ├── x64
     │    └── Release
     │         ├── test_Cplus2_with_Python.exe
     │         ├── yolo3      ← from the keras-yolo3 folder
     │         ├── yolo.py 〇 ← from the keras-yolo3 folder
     │         └── others
     ├── (others: .git .vs)
     └── test_Cplus2_with_Python.sln
There are two files marked 〇 in the structure above: the main C++ code and the main Python code. Modify each as follows.
This is a slightly modified version of the code from the previous article. It imports the Python .py file and gets handles to its functions and objects. The variable types are basically auto, left to C++ type deduction. Before entering the while loop, the YOLO object is created in advance; inside the loop, only Python's object detection function is executed. If you want to add your own processing, insert code at a suitable place.
test_Cplus2_with_Python.cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

// Define the namespace alias
namespace py = boost::python;

/* Run YOLO v3 from C++ */
int main()
{
    // Initialize the Python interpreter
    Py_Initialize();

    // Import the YOLO v3 py file (yolo.py)
    py::object YOLOv3 = py::import("yolo").attr("__dict__");

    // Get the "object_YOLOv3" function defined in yolo.py
    auto object_YOLOv3 = YOLOv3["object_YOLOv3"];

    // Define the variable passed to the object_YOLOv3 function
    py::object object_YOLOv3_init;

    // Create the YOLO object (one-time initialization)
    auto object_YOLOv3_maker = object_YOLOv3(object_YOLOv3_init);

    // Get the function for object detection
    auto insert_object_YOLOv3 = YOLOv3["insert_object_YOLOv3"];

    // Observed value
    double py_y;

    /* Real-time object detection with YOLO v3 */
    while (true) {
        // x-axis position of the center of the object detected by deep learning
        auto x_centor = insert_object_YOLOv3(object_YOLOv3_maker);

        // Convert the result to a type usable in C++
        py_y = py::extract<double>(x_centor);

        /*
        If you want to do other processing, add it here as appropriate
        */

        // Print the result
        std::cout << "py_y = " << py_y << std::endl;
    }
}
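One note from my side: if anything goes wrong inside yolo.py (a missing module, a wrong path to yolo.h5, etc.), Boost.Python throws a C++ exception and the exe dies without a message. Below is a hedged sketch (not part of the sample) of how you could wrap the calls so the Python traceback is printed; error_already_set and PyErr_Print are standard Boost.Python / CPython facilities.

cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    Py_Initialize();
    try {
        // Any failure inside yolo.py (missing module, wrong path to
        // yolo.h5, ...) surfaces here as error_already_set.
        py::object YOLOv3 = py::import("yolo").attr("__dict__");
        auto yolo = YOLOv3["object_YOLOv3"](py::object());
        while (true) {
            YOLOv3["insert_object_YOLOv3"](yolo);
        }
    }
    catch (const py::error_already_set&) {
        PyErr_Print();  // print the pending Python traceback to stderr
        return 1;
    }
    return 0;
}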
To grab images from the webcam, the object that opens the webcam is created in the YOLO class's initializer (__init__). In addition, the horizontal center of each detected object is calculated and returned. Two module-level functions are newly added: one for object initialization (1) and one for object detection (2).
yolo.py
# -*- coding: utf-8 -*-
"""
Class definition of YOLO_v3 style detection model on image and video
"""

import colorsys
import os
from timeit import default_timer as timer

import numpy as np
from keras import backend as K
from keras.models import load_model
from keras.layers import Input
from PIL import Image, ImageFont, ImageDraw

from yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
from yolo3.utils import letterbox_image
from keras.utils import multi_gpu_model

# Added
import cv2

# Added: limit TensorFlow's GPU memory usage
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))

# Added: global path definitions
model_path_ = 'C:/model_data/yolo.h5'
anchors_path_ = 'C:/model_data/yolo_anchors.txt'
classes_path_ = 'C:/model_data/coco_classes.txt'
font_path_ = 'C:/model_data/FiraMono-Medium.otf'


class YOLO(object):
    _defaults = {
        "model_path": model_path_,      # changed
        "anchors_path": anchors_path_,  # changed
        "classes_path": classes_path_,  # changed
        "score" : 0.3,
        "iou" : 0.45,
        "model_image_size" : (416, 416),
        "gpu_num" : 1,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults) # set up default values
        self.__dict__.update(kwargs) # and update with user overrides
        self.class_names = self._get_class()
        self.anchors = self._get_anchors()
        self.sess = K.get_session()
        self.boxes, self.scores, self.classes = self.generate()
        # Added: open the camera
        self.cap = cv2.VideoCapture(0)

    def _get_class(self):
        classes_path = os.path.expanduser(self.classes_path)
        with open(classes_path) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]
        return class_names

    def _get_anchors(self):
        anchors_path = os.path.expanduser(self.anchors_path)
        with open(anchors_path) as f:
            anchors = f.readline()
        anchors = [float(x) for x in anchors.split(',')]
        return np.array(anchors).reshape(-1, 2)

    def generate(self):
        model_path = os.path.expanduser(self.model_path)
        assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'

        # Load model, or construct model and load weights.
        num_anchors = len(self.anchors)
        num_classes = len(self.class_names)
        is_tiny_version = num_anchors==6 # default setting
        try:
            self.yolo_model = load_model(model_path, compile=False)
        except:
            self.yolo_model = tiny_yolo_body(Input(shape=(None,None,3)), num_anchors//2, num_classes) \
                if is_tiny_version else yolo_body(Input(shape=(None,None,3)), num_anchors//3, num_classes)
            self.yolo_model.load_weights(self.model_path) # make sure model, anchors and classes match
        else:
            assert self.yolo_model.layers[-1].output_shape[-1] == \
                num_anchors/len(self.yolo_model.output) * (num_classes + 5), \
                'Mismatch between model and given anchor and class sizes'

        print('{} model, anchors, and classes loaded.'.format(model_path))

        # Generate colors for drawing bounding boxes.
        hsv_tuples = [(x / len(self.class_names), 1., 1.)
                      for x in range(len(self.class_names))]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(
            map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
                self.colors))
        np.random.seed(10101)  # Fixed seed for consistent colors across runs.
        np.random.shuffle(self.colors)  # Shuffle colors to decorrelate adjacent classes.
        np.random.seed(None)  # Reset seed to default.

        # Generate output tensor targets for filtered bounding boxes.
        self.input_image_shape = K.placeholder(shape=(2, ))
        if self.gpu_num>=2:
            self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
        boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors,
                len(self.class_names), self.input_image_shape,
                score_threshold=self.score, iou_threshold=self.iou)
        return boxes, scores, classes

    # Modified for C++: also returns the x-axis position of the detected object
    def detect_image_for_Cplus2(self, image):
        if self.model_image_size != (None, None):
            assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
            assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
            boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
        else:
            new_image_size = (image.width - (image.width % 32),
                              image.height - (image.height % 32))
            boxed_image = letterbox_image(image, new_image_size)
        image_data = np.array(boxed_image, dtype='float32')

        image_data /= 255.
        image_data = np.expand_dims(image_data, 0)  # Add batch dimension.

        out_boxes, out_scores, out_classes = self.sess.run(
            [self.boxes, self.scores, self.classes],
            feed_dict={
                self.yolo_model.input: image_data,
                self.input_image_shape: [image.size[1], image.size[0]],
                K.learning_phase(): 0
            })

        font = ImageFont.truetype(font=font_path_,
                    size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = (image.size[0] + image.size[1]) // 300

        # x-axis position of the detected object
        self.x_centor = 0.0

        for i, c in reversed(list(enumerate(out_classes))):
            predicted_class = self.class_names[c]
            box = out_boxes[i]
            score = out_scores[i]

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            label_size = draw.textsize(label, font)

            # The box coordinates are obtained around here
            top, left, bottom, right = box
            top = max(0, np.floor(top + 0.5).astype('int32'))
            left = max(0, np.floor(left + 0.5).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
            right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
            print(label, (left, top), (right, bottom))

            # x-axis center: x_centor = (x1 + x2) / 2
            self.x_centor = ( left + right ) / 2.

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            # My kingdom for a good redistributable image drawing library.
            for i in range(thickness):
                draw.rectangle(
                    [left + i, top + i, right - i, bottom - i],
                    outline=self.colors[c])
            draw.rectangle(
                [tuple(text_origin), tuple(text_origin + label_size)],
                fill=self.colors[c])
            draw.text(text_origin, label, fill=(0, 0, 0), font=font)
            del draw

        return image, self.x_centor

    def close_session(self):
        self.sess.close()


# 1. Function for object initialization
def object_YOLOv3(object_YOLO):
    # Create an object from the YOLO class
    object_YOLO = YOLO()
    # Return the YOLO object to C++
    return object_YOLO


# 2. Function for object detection
def insert_object_YOLOv3(object_YOLO):
    # Grab an image from the camera
    # (ret is ignored here; the sample assumes the camera always delivers a frame)
    ret, frame = object_YOLO.cap.read()
    # Reverse the channel order (BGR -> RGB)
    frame = np.asarray(frame)[..., ::-1]
    # Convert from OpenCV to Pillow
    frame = Image.fromarray(frame)
    # Detect objects; returns the rendered image and the object's x-axis center
    r_image, x_centor = object_YOLO.detect_image_for_Cplus2(frame)
    # Display the image
    cv2.imshow("out", np.asarray(r_image)[..., ::-1])
    # Wait 1 ms so the window can render
    cv2.waitKey(1)
    # Return the x-axis center of the object to C++
    return x_centor
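One pitfall worth noting: py::import("yolo") only works if yolo.py is on Python's module search path. In the layout above it sits next to the exe in x64\Release, so launching the exe from that folder works; if you launch it from somewhere else, you can append the folder yourself. A minimal sketch of my own (the path is just this article's layout; PyRun_SimpleString is a standard CPython call), inserted right after Py_Initialize() in test_Cplus2_with_Python.cpp:

cpp
Py_Initialize();
// Make the folder containing yolo.py importable even when the exe is
// launched from another working directory (path = this article's layout).
PyRun_SimpleString("import sys; sys.path.append('C:/200121_Cplus2_with_Python/x64/Release')");
py::object YOLOv3 = py::import("yolo").attr("__dict__");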
Since we use the pretrained model, objects are detected as person and so on.
This is a sample program, but if you want to use your own trained model, you can do so just by changing the paths in the py file.
"When would I ever have to call Python from C++?" — I imagine some people wonder. Reasons include:

- You really want to embed deep learning processing in a C++-based device (motion control board, etc.)
- You're an engineer (an eccentric one) who just can't leave C++
- You're simply a weirdo

The last ones are a joke, of course.
I sincerely hope this is useful to anyone struggling with environment dependencies between programming languages.