In the previous article, I summarized how to set up an environment for executing Python code from C++: Run Python in C++ on Visual Studio 2017. That article alone probably doesn't convey the real benefit of calling Python from C++. So this time, I would like to run YOLO v3, which detects objects with deep learning. Deep learning models can be used directly in C++, but the tooling is not as mature as Python's at the moment (sadly).
- OS: Windows 10 64bit
- CPU: Intel i3-8100
- GPU: NVIDIA GeForce GTX 1050 Ti
- Visual Studio 2017
- C++
- Python 3.7.3

(The specs are a bit higher than the environment in the previous article.)
The following two environment setups are prerequisites. Please complete both before working through this article.

- For how to build an environment that calls Python from C++, refer to: Run Python in C++ on Visual Studio 2017
- For how to build an environment that lets Python use the GPU, refer to: Setting up tensorflow-gpu
I will proceed in the following order. If you like, follow along. Readers coming from the previous article can skip some of the steps.
We will place the files directly under the C drive.
URL
git clone https://github.com/yusa0827/200121_Cplus2_with_Python
There are two ways to get the files: git clone as above, or downloading the ZIP from the repository page.
Looking inside the downloaded folder, there is a .sln file, which contains the sample program. Double-click it, or right-click ⇒ Open with ⇒ Microsoft Visual Studio 2017, and the sample program opens in Visual Studio. This project was built with the 2017 version. It can probably be used in 2019 as well, but within 2019 you may have to retarget it to the 2017 toolset.
Next, adjust the project settings and paths (Project → Properties):

1. Solution configuration and solution platform
   - Debug → Release
   - x86 → x64
2. C++ → General → Additional Include Directories
   - C:\boost_1_70_0
   - C:\Users\○○\AppData\Local\Programs\Python\Python37\include ← adjust to your own environment
3. C++ → Code Generation → Runtime Library
   - Multi-threaded (/MT)
4. Linker → General → Additional Library Directories
   - C:\boost_1_70_0\stage\lib\x64
   - C:\Users\○○\AppData\Local\Programs\Python\Python37\libs ← adjust to your own environment
The execution result is as follows.
If that doesn't work, check your environment path.
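Before moving on, it can help to confirm that the Boost.Python settings themselves are correct, independent of the sample. The following is a minimal smoke-test sketch of my own (not part of the cloned sample): it just evaluates an expression in the embedded interpreter. If it builds and prints 2, the include and library paths above are fine.

cpp
#define BOOST_PYTHON_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    // Start the embedded Python interpreter
    Py_Initialize();
    // Use __main__'s namespace so builtins are available to eval
    py::object main_ns = py::import("__main__").attr("__dict__");
    // Evaluate a trivial expression and convert the result to a C++ int
    int result = py::extract<int>(py::eval("1 + 1", main_ns));
    std::cout << "1 + 1 = " << result << std::endl;  // expect 2
    return 0;
}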
YOLO v3 is a deep-learning-based object detection method known for its excellent real-time performance. This time, I will use the widely used Keras port. Many people have explained how to use it, so a quick Google search will get you there in one shot, but I will describe the steps here anyway.
URL
git clone https://github.com/qqwweee/keras-yolo3.git
Move into that directory (`cd keras-yolo3`). The required Python modules are TensorFlow, Keras, Matplotlib, Pillow, and OpenCV (installed as opencv-python). If you haven't installed them yet, install them with pip (e.g. `pip install tensorflow-gpu keras matplotlib pillow opencv-python`).
Next, download the trained weights. You can also download them directly from the following URL without using wget. After downloading, put the file in the keras-yolo3 folder. File name: yolov3.weights (size: 237 MB).
URL
wget https://pjreddie.com/media/files/yolov3.weights
Convert the Darknet weights to the Keras format. Enter the following command at the command prompt.
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
The execution result is as follows.
:
:
conv2d_75 (Conv2D) (None, None, None, 2 65535 leaky_re_lu_72[0][0]
==================================================================================================
Total params: 62,001,757
Trainable params: 61,949,149
Non-trainable params: 52,608
__________________________________________________________________________________________________
None
Saved Keras model to model_data/yolo.h5
Read 62001757 of 62001757.0 from Darknet weights.
C:\demo_Cplus2_Py_YOLOv3\keras-yolo3>
This time we will use a webcam. You could use a video file instead, but I chose the camera because it was easier to verify.
Make a small edit to yolo.py, the main object-detection code, around line 173.
yolo.py
import cv2
vid = cv2.VideoCapture(video_path)
# ↓ change to the following
import cv2
#vid = cv2.VideoCapture(video_path)
vid = cv2.VideoCapture(0)
Passing 0 as the argument of VideoCapture selects the camera device. Edit the code and run YOLO v3.
Execution code.
cmd
python yolo_video.py
Execution result. Doraemon is apparently a sports ball.
I was able to confirm that YOLO v3 works. Next, I will work out how to call this Python code from C++.
Calling YOLO v3 from C++ requires a few tricks. One of them is to hold the object created from the YOLO class on the C++ side. Normally object detection is done entirely in Python, but if you don't keep such an object, TensorFlow has to be launched every time you detect an object, which causes a large delay; on my PC it takes about 15 seconds to launch TensorFlow. By creating the YOLO object in advance, you avoid paying that cost on every call, as the sketch below illustrates.
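Here is a minimal timing sketch of that idea. It assumes the two helper functions object_YOLOv3 and insert_object_YOLOv3 that yolo.py will define later in this article; std::chrono is only there to make the one-time cost visible.

cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <chrono>
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    Py_Initialize();
    py::object mod = py::import("yolo").attr("__dict__");
    auto make_yolo   = mod["object_YOLOv3"];        // initialization function
    auto detect_once = mod["insert_object_YOLOv3"]; // per-frame detection

    // One-time cost: constructing the YOLO object loads the Keras model
    // and starts TensorFlow (about 15 s on my PC).
    auto t0 = std::chrono::steady_clock::now();
    auto yolo = make_yolo(py::object());
    auto t1 = std::chrono::steady_clock::now();
    std::cout << "init took "
              << std::chrono::duration<double>(t1 - t0).count() << " s\n";

    // Per-frame cost: only inference runs inside the loop.
    while (true) {
        detect_once(yolo);
    }
}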
Modify the existing yolo.py.
Detect objects with a webcam.
Prepare the cloned Visual Studio project folder and the keras-yolo3 folder. Copy only the keras-yolo3 files required for object detection into the Visual Studio project folder. The layout is as follows; modify the files marked with 〇. The files required by YOLO v3 are placed directly under the C drive.
C drive ── model_data
│            ├── yolo.h5             ← from the model_data folder of keras-yolo3
│            ├── yolo_anchors.txt    ← from the model_data folder of keras-yolo3
│            ├── coco_classes.txt    ← from the model_data folder of keras-yolo3
│            └── FiraMono-Medium.otf ← from the font folder of keras-yolo3
│
└─ 200121_Cplus2_with_Python
     ├── test_Cplus2_with_Python
     │    ├── test_Cplus2_with_Python.cpp 〇
     │    ├── x64
     │    └── others
     ├── x64
     │    └── Release
     │         ├── test_Cplus2_with_Python.exe
     │         ├── yolo3      ← from the keras-yolo3 folder
     │         ├── yolo.py 〇 ← from the keras-yolo3 folder
     │         └── others
     ├── (others: .git .vs)
     └── test_Cplus2_with_Python.sln
There are two files marked 〇 in the structure above: the main C++ code and the main Python code. Modify each as follows.
This is a slightly modified version of the code from the previous article. It imports the Python .py file and gets handles to its functions and objects. The variable types are basically auto, left to C++ type deduction. Before entering the while loop, the YOLO object is created in advance; inside the loop, only Python's object detection function is executed. If you want to add your own processing, insert code at a suitable place.
test_Cplus2_with_Python.cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

// Define the namespace alias
namespace py = boost::python;

/* Run YOLO v3 from C++ */
int main()
{
    // Initialize the Python interpreter
    Py_Initialize();

    // Import the YOLO v3 py file (yolo.py)
    py::object YOLOv3 = py::import("yolo").attr("__dict__");

    // Get the "object_YOLOv3" function defined in yolo.py
    auto object_YOLOv3 = YOLOv3["object_YOLOv3"];

    // Define the variable passed to the object_YOLOv3 function
    py::object object_YOLOv3_init;

    // Create the YOLO object (one-time initialization)
    auto object_YOLOv3_maker = object_YOLOv3(object_YOLOv3_init);

    // Get the function for object detection
    auto insert_object_YOLOv3 = YOLOv3["insert_object_YOLOv3"];

    // Observed value
    double py_y;

    /* Real-time object detection with YOLO v3 */
    while (true) {
        // x-axis position of the center of the object detected by deep learning
        auto x_centor = insert_object_YOLOv3(object_YOLOv3_maker);

        // Convert the result to a type usable in C++
        py_y = py::extract<double>(x_centor);

        /*
        If you want to do other processing, add it here as appropriate
        */

        // Print the result
        std::cout << "py_y = " << py_y << std::endl;
    }
}
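One note from my side: if anything goes wrong inside yolo.py (a missing module, a wrong path to yolo.h5, etc.), Boost.Python throws a C++ exception and the exe dies without a message. Below is a hedged sketch (not part of the sample) of how you could wrap the calls so the Python traceback is printed; error_already_set and PyErr_Print are standard Boost.Python / CPython facilities.

cpp
#define BOOST_PYTHON_STATIC_LIB
#define BOOST_NUMPY_STATIC_LIB
#include <iostream>
#include <boost/python.hpp>

namespace py = boost::python;

int main()
{
    Py_Initialize();
    try {
        // Any failure inside yolo.py (missing module, wrong path to
        // yolo.h5, ...) surfaces here as error_already_set.
        py::object YOLOv3 = py::import("yolo").attr("__dict__");
        auto yolo = YOLOv3["object_YOLOv3"](py::object());
        while (true) {
            YOLOv3["insert_object_YOLOv3"](yolo);
        }
    }
    catch (const py::error_already_set&) {
        PyErr_Print();  // print the pending Python traceback to stderr
        return 1;
    }
    return 0;
}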
To grab images from the webcam, the object that opens the webcam is created in the YOLO class's initializer (__init__). In addition, the horizontal center of each detected object is calculated and returned. Two module-level functions are newly added: one for object initialization (1) and one for object detection (2).
yolo.py
# -*- coding: utf-8 -*-
"""
Class definition of YOLO_v3 style detection model on image and video
"""

import colorsys
import os
from timeit import default_timer as timer

import numpy as np
from keras import backend as K
from keras.models import load_model
from keras.layers import Input
from PIL import Image, ImageFont, ImageDraw

from yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
from yolo3.utils import letterbox_image
from keras.utils import multi_gpu_model

# Added
import cv2

# Added: limit TensorFlow's GPU memory usage
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))

# Added: global path definitions
model_path_ = 'C:/model_data/yolo.h5'
anchors_path_ = 'C:/model_data/yolo_anchors.txt'
classes_path_ = 'C:/model_data/coco_classes.txt'
font_path_ = 'C:/model_data/FiraMono-Medium.otf'


class YOLO(object):
    _defaults = {
        "model_path": model_path_,      # changed
        "anchors_path": anchors_path_,  # changed
        "classes_path": classes_path_,  # changed
        "score" : 0.3,
        "iou" : 0.45,
        "model_image_size" : (416, 416),
        "gpu_num" : 1,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults) # set up default values
        self.__dict__.update(kwargs) # and update with user overrides
        self.class_names = self._get_class()
        self.anchors = self._get_anchors()
        self.sess = K.get_session()
        self.boxes, self.scores, self.classes = self.generate()
        # Added: open the camera
        self.cap = cv2.VideoCapture(0)

    def _get_class(self):
        classes_path = os.path.expanduser(self.classes_path)
        with open(classes_path) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]
        return class_names

    def _get_anchors(self):
        anchors_path = os.path.expanduser(self.anchors_path)
        with open(anchors_path) as f:
            anchors = f.readline()
        anchors = [float(x) for x in anchors.split(',')]
        return np.array(anchors).reshape(-1, 2)

    def generate(self):
        model_path = os.path.expanduser(self.model_path)
        assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'

        # Load model, or construct model and load weights.
        num_anchors = len(self.anchors)
        num_classes = len(self.class_names)
        is_tiny_version = num_anchors==6 # default setting
        try:
            self.yolo_model = load_model(model_path, compile=False)
        except:
            self.yolo_model = tiny_yolo_body(Input(shape=(None,None,3)), num_anchors//2, num_classes) \
                if is_tiny_version else yolo_body(Input(shape=(None,None,3)), num_anchors//3, num_classes)
            self.yolo_model.load_weights(self.model_path) # make sure model, anchors and classes match
        else:
            assert self.yolo_model.layers[-1].output_shape[-1] == \
                num_anchors/len(self.yolo_model.output) * (num_classes + 5), \
                'Mismatch between model and given anchor and class sizes'

        print('{} model, anchors, and classes loaded.'.format(model_path))

        # Generate colors for drawing bounding boxes.
        hsv_tuples = [(x / len(self.class_names), 1., 1.)
                      for x in range(len(self.class_names))]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(
            map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
                self.colors))
        np.random.seed(10101)  # Fixed seed for consistent colors across runs.
        np.random.shuffle(self.colors)  # Shuffle colors to decorrelate adjacent classes.
        np.random.seed(None)  # Reset seed to default.

        # Generate output tensor targets for filtered bounding boxes.
        self.input_image_shape = K.placeholder(shape=(2, ))
        if self.gpu_num>=2:
            self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
        boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors,
                len(self.class_names), self.input_image_shape,
                score_threshold=self.score, iou_threshold=self.iou)
        return boxes, scores, classes

    # Modified for C++: also returns the x-axis position of the detected object
    def detect_image_for_Cplus2(self, image):
        if self.model_image_size != (None, None):
            assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
            assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
            boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
        else:
            new_image_size = (image.width - (image.width % 32),
                              image.height - (image.height % 32))
            boxed_image = letterbox_image(image, new_image_size)
        image_data = np.array(boxed_image, dtype='float32')

        image_data /= 255.
        image_data = np.expand_dims(image_data, 0)  # Add batch dimension.

        out_boxes, out_scores, out_classes = self.sess.run(
            [self.boxes, self.scores, self.classes],
            feed_dict={
                self.yolo_model.input: image_data,
                self.input_image_shape: [image.size[1], image.size[0]],
                K.learning_phase(): 0
            })

        font = ImageFont.truetype(font=font_path_,
                    size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = (image.size[0] + image.size[1]) // 300

        # x-axis position of the detected object
        self.x_centor = 0.0

        for i, c in reversed(list(enumerate(out_classes))):
            predicted_class = self.class_names[c]
            box = out_boxes[i]
            score = out_scores[i]

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            label_size = draw.textsize(label, font)

            # The box coordinates are obtained around here
            top, left, bottom, right = box
            top = max(0, np.floor(top + 0.5).astype('int32'))
            left = max(0, np.floor(left + 0.5).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
            right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
            print(label, (left, top), (right, bottom))

            # x-axis center: x_centor = (x1 + x2) / 2
            self.x_centor = ( left + right ) / 2.

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            # My kingdom for a good redistributable image drawing library.
            for i in range(thickness):
                draw.rectangle(
                    [left + i, top + i, right - i, bottom - i],
                    outline=self.colors[c])
            draw.rectangle(
                [tuple(text_origin), tuple(text_origin + label_size)],
                fill=self.colors[c])
            draw.text(text_origin, label, fill=(0, 0, 0), font=font)
            del draw

        return image, self.x_centor

    def close_session(self):
        self.sess.close()


# 1. Function for object initialization
def object_YOLOv3(object_YOLO):
    # Create an object from the YOLO class
    object_YOLO = YOLO()
    # Return the YOLO object to C++
    return object_YOLO


# 2. Function for object detection
def insert_object_YOLOv3(object_YOLO):
    # Grab an image from the camera
    # (ret is ignored here; the sample assumes the camera always delivers a frame)
    ret, frame = object_YOLO.cap.read()
    # Reverse the channel order (BGR -> RGB)
    frame = np.asarray(frame)[..., ::-1]
    # Convert from OpenCV to Pillow
    frame = Image.fromarray(frame)
    # Detect objects; returns the rendered image and the object's x-axis center
    r_image, x_centor = object_YOLO.detect_image_for_Cplus2(frame)
    # Display the image
    cv2.imshow("out", np.asarray(r_image)[..., ::-1])
    # Wait 1 ms so the window can render
    cv2.waitKey(1)
    # Return the x-axis center of the object to C++
    return x_centor
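One pitfall worth noting: py::import("yolo") only works if yolo.py is on Python's module search path. In the layout above it sits next to the exe in x64\Release, so launching the exe from that folder works; if you launch it from somewhere else, you can append the folder yourself. A minimal sketch of my own (the path is just this article's layout; PyRun_SimpleString is a standard CPython call), inserted right after Py_Initialize() in test_Cplus2_with_Python.cpp:

cpp
Py_Initialize();
// Make the folder containing yolo.py importable even when the exe is
// launched from another working directory (path = this article's layout).
PyRun_SimpleString("import sys; sys.path.append('C:/200121_Cplus2_with_Python/x64/Release')");
py::object YOLOv3 = py::import("yolo").attr("__dict__");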
Since we use the pretrained model, objects are detected as person and so on.
This is a sample program, but if you want to use your own trained model, you can do so just by changing the paths in the py file.
"When would I ever have to call Python from C++?" — I imagine some people wonder. Reasons include:

- You really want to embed deep learning processing in a C++-based device (motion control board, etc.)
- You're an engineer (an eccentric one) who just can't leave C++
- You're simply a weirdo

The last ones are a joke, of course.
I sincerely hope this is useful to anyone struggling with environment dependencies between programming languages.