[PYTHON] Object recognition with OpenCV using traincascade

What are we doing?

I want to do something like this.

That is, if something you want to detect appears in a photo, find it and output its coordinates.

How is it done?

With OpenCV, we build multiple weak learners from correct labels and image features such as HOG features (computed from brightness gradients) or Haar-like features (computed from brightness differences between image regions), and combine them by boosting to make the decision. If Anaconda is installed through pyenv, OpenCV can be installed quickly with `conda install -c https://conda.binstar.org/jjhelmus opencv`. Searching for pyenv or Anaconda turns up plenty of installation guides.

Let's try it

- 0 Directory structure
- 1 Create correct answer data information for learning
- 2 Create incorrect data information for learning
- 3 Positive vector creation
- 4 Create a learner

0 Directory structure

Arbitrary directory/
　├data/
　│　├pos/               [Collected in step 1]
　│　│　├xxx001.jpg
　│　│　├xxx002.jpg
　│　│　├ ...
　│　│　└xxx100.jpg
　│　├neg/               [Collected in step 2]
　│　│　├yyy001.jpg
　│　│　├yyy002.jpg
　│　│　├ ...
　│　│　└yyy100.jpg
　│　└model/
　│　　 ├params.xml      [Created in step 4]
　│　　 ├stage0.xml      [Created in step 4]
　│　　 ├stage1.xml      [Created in step 4]
　│　　 ├stage2.xml      [Created in step 4]
　│　　 └stageX.xml      [Created in step 4]
　└src/
　 　├positive.dat       [Created in step 1]
　 　├negative.dat       [Created in step 2]
　 　├positive.vec       [Created in step 3]
　 　└create_cascade.sh  [Step 3-Used in 4]

1 Create correct answer data information for learning

Each positive (correct-answer) image needs the following information.

- Location of the original image
- Number of objects you want to detect in it
- x coordinate
- y coordinate
- Width
- Height

Save these in a single file, positive.dat, one image per line, with the fields separated by single-byte (ASCII) spaces.

`{positive.dat}`


/Path/To/positive/xxx001.jpg 1 200 50 50 50
/Path/To/positive/xxx002.jpg 2 150 30 40 36 230 300 55 60
・ ・ ・
/Path/To/positive/xxx100.jpg 1 150 30 40 36

If an image contains two objects you want to detect, the line has the following format: image path, 2, then the x coordinate, y coordinate, width, and height of the first object, followed by the x coordinate, y coordinate, width, and height of the second object.
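The line format above can be generated from a list of annotations. The following is a minimal sketch, assuming the boxes are already known as (x, y, width, height) tuples. The function name `format_positive_line` and the `annotations` dictionary are illustrative and not part of OpenCV.

```python
def format_positive_line(image_path, boxes):
    """Return one positive.dat line: path, object count, then x y w h per object."""
    fields = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        fields.extend(str(int(v)) for v in (x, y, w, h))
    return " ".join(fields)


# Hypothetical annotations: image path -> list of (x, y, width, height)
annotations = {
    "/Path/To/positive/xxx001.jpg": [(200, 50, 50, 50)],
    "/Path/To/positive/xxx002.jpg": [(150, 30, 40, 36), (230, 300, 55, 60)],
}

lines = [format_positive_line(path, boxes)
         for path, boxes in sorted(annotations.items())]
print("\n".join(lines))
```

Writing `lines` to positive.dat, one per line, gives the file described in this step. Note that paths containing spaces would not fit this space-separated format.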

Collecting the training images is a matter of sheer persistence... There do seem to be helper tools for it, though.

2 Create incorrect data information for learning

Negative (incorrect) data is much simpler than positive data. Just list the paths of images that do not contain the object you want to recognize in a single file, negative.dat, and you're done.

`{negative.dat}`


/Path/To/negative/yyy001.jpg
/Path/To/negative/yyy002.jpg
・ ・ ・
/Path/To/negative/yyy100.jpg

It is not a problem if the numbers of positive and negative images differ. If possible, include negative images of various sizes, not only images the same size as the positive detection frame. This is because, at detection time, the cascade scans the picture with rectangles of many different sizes.
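Since negative.dat is just a list of paths, it can be produced with a few lines of Python. This is a small sketch using only the standard library. The function names and the set of file extensions are assumptions for illustration.

```python
import os


def list_negative_images(neg_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted absolute paths of the image files directly inside neg_dir."""
    paths = []
    for name in sorted(os.listdir(neg_dir)):
        if name.lower().endswith(extensions):
            paths.append(os.path.abspath(os.path.join(neg_dir, name)))
    return paths


def write_negative_dat(neg_dir, out_path):
    """Write one image path per line to out_path and return the number of lines."""
    paths = list_negative_images(neg_dir)
    with open(out_path, "w") as f:
        for p in paths:
            f.write(p + "\n")
    return len(paths)


# Example call for the directory layout in step 0 (run from src/):
# write_negative_dat("../data/neg", "negative.dat")
```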

3 Positive vector creation

`{create_cascade.sh}`


# positive.vec creation command
opencv_createsamples -info positive.dat -vec positive.vec -num 100 -w 50 -h 50

- -info: Specify the dat file created in step 1.
- -vec: Specify the output vector file name
- -num: Number of lines in positive.dat (number of positive images)
- -w: Width
- -h: Height

In my case, an error occurred when some images were smaller than the size given by -w and -h. If you get stuck on an error here, check the sizes of the positive images.
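One way to look for such images before running the command is to scan positive.dat for small entries. The sketch below checks the annotated box sizes rather than the image files themselves, which is an assumption on my part: an image smaller than -w and -h can only contain boxes that are also smaller. The function name `find_small_boxes` is hypothetical.

```python
def find_small_boxes(dat_lines, min_w, min_h):
    """Return (path, (x, y, w, h)) for every annotated box smaller than min_w x min_h."""
    small = []
    for line in dat_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        path, count = fields[0], int(fields[1])
        values = [int(v) for v in fields[2:2 + 4 * count]]
        if len(values) != 4 * count:
            raise ValueError("wrong number of coordinates for " + path)
        for i in range(count):
            x, y, w, h = values[4 * i:4 * i + 4]
            if w < min_w or h < min_h:
                small.append((path, (x, y, w, h)))
    return small


# Sample lines in the positive.dat format from step 1
sample_lines = [
    "/Path/To/positive/xxx001.jpg 1 200 50 50 50",
    "/Path/To/positive/xxx002.jpg 2 150 30 40 36 230 300 55 60",
]
for path, box in find_small_boxes(sample_lines, 50, 50):
    print(path, box)
```

For a real check, pass the lines read from positive.dat together with the values you plan to give to -w and -h.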

4 Create a learner

`{create_cascade.sh}`


#model creation command
opencv_traincascade -data /Path/To/model -vec /Path/To/src/positive.vec -bg /Path/To/src/negative.dat -numPos 100 -numNeg 100 -featureType HOG -maxFalseAlarmRate 0.1 -w 50 -h 50 -minHitRate 0.97 -numStages 17

- -data: Specify where the model files are stored
- -vec: Specify the location of positive.vec
- -bg: Specify the location of negative.dat
- -numPos: Specify the number of positives
- -numNeg: Specify the number of negatives
- -featureType: HOG uses HOG features, LBP uses LBP features, and Haar-like features are used if not specified.
- -maxFalseAlarmRate: Maximum false positive rate allowed at each training stage
- -w: Width
- -h: Height
- -minHitRate: Minimum detection rate allowed at each training stage
- -numStages: Number of stages to create

For numPos, it is best to specify about 80 to 90% of the number of samples actually written to the vec file. Otherwise you will occasionally see the error shown below. I'm not certain, but there seems to be an internal mechanism that stops using a positive image for later stages once it has been misclassified during training. Because samples are rejected this way, the vec file apparently runs out of data to train with, and the error occurs. Also, use the same w and h values that were specified when positive.vec was created.

`{opencv_traincascade_error.log}`


OpenCV Error: Bad argument (Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
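To pick a numPos value that avoids this error, the 80 to 90% guideline can be written down directly. The sketch below also includes a rule of thumb that is often quoted in OpenCV community discussions: the vec file should hold at least numPos + (numStages - 1) * (1 - minHitRate) * numPos samples, plus any samples that are rejected as background. That formula does not come from this article or from the official documentation as far as I know, so treat it as an assumption. Both function names are hypothetical.

```python
def author_numpos_range(vec_count):
    """80% and 90% of the vec sample count, following the guideline in this article."""
    return (vec_count * 8 // 10, vec_count * 9 // 10)


def rule_of_thumb_numpos(vec_count, num_stages, min_hit_rate, rejected=0):
    """Largest numPos satisfying the community rule of thumb (an assumption, see text).

    vec_count >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + rejected
    """
    per_positive = 1.0 + (num_stages - 1) * (1.0 - min_hit_rate)
    return int((vec_count - rejected) / per_positive)


print(author_numpos_range(100))
print(rule_of_thumb_numpos(100, 17, 0.97))
```

The rule of thumb is more conservative than the 80 to 90% guideline when many stages are trained, so if the error still appears at 80%, lowering numPos further or adding positive samples is worth trying.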

With this, many stageX.xml files are created in /Path/To/model while output like the following is printed.

`{opencv_traincascade.log}`


PARAMETERS:
cascadeDirName: ../../../model
vecFileName: positive.vec
bgFileName: negative.dat
numPos: 150
numNeg: 200
numStages: 17
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
stageType: BOOST
featureType: HOG
sampleWidth: 50
sampleHeight: 50
boostType: GAB
minHitRate: 0.97
maxFalseAlarmRate: 0.1
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100

===== TRAINING 0-stage =====
<BEGIN
POS count : consumed   150 : 150
NEG count : acceptanceRatio    200 : 1
Precalculation time: 0
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1|        1|        1|
+----+---------+---------+
|   2|        1|    0.285|
+----+---------+---------+
|   3|        1|    0.285|
+----+---------+---------+
|   4| 0.993333|    0.135|
+----+---------+---------+
|   5|        1|    0.145|
+----+---------+---------+
|   6|        1|    0.095|
+----+---------+---------+

In the table, N is the number of weak classifiers added so far in the current stage, HR is the hit rate at the stage threshold (v_hitrate / numpos), and FA is the false alarm rate at the stage threshold (v_falsealarm / numneg).
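In the log above, stage 0 ends at the row where FA first drops to maxFalseAlarmRate (0.1) or below while HR stays at or above minHitRate (0.97). The sketch below reproduces that check on the table rows and also estimates the rates of the whole cascade as the per-stage rates raised to the power of numStages, which is the rough estimate given in the OpenCV cascade training documentation. The function name `stage_stop_row` is hypothetical, and this is only an illustration of the stopping condition as I understand it, not OpenCV's actual code.

```python
def stage_stop_row(rows, min_hit_rate, max_false_alarm_rate):
    """Return the first N whose HR and FA both satisfy the per-stage targets, else None."""
    for n, hr, fa in rows:
        if hr >= min_hit_rate and fa <= max_false_alarm_rate:
            return n
    return None


# (N, HR, FA) rows copied from the stage-0 table above
stage0 = [
    (1, 1.0, 1.0),
    (2, 1.0, 0.285),
    (3, 1.0, 0.285),
    (4, 0.993333, 0.135),
    (5, 1.0, 0.145),
    (6, 1.0, 0.095),
]
print(stage_stop_row(stage0, 0.97, 0.1))

# Rough whole-cascade estimates for the parameters used in step 4
num_stages = 17
overall_hit_rate = 0.97 ** num_stages
overall_false_alarm = 0.1 ** num_stages
print(overall_hit_rate, overall_false_alarm)
```

The estimate shows why minHitRate has to be set close to 1: a small per-stage loss compounds over 17 stages.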

Finally, a file called cascade.xml is created, which is used for prediction.

5 Discrimination using a learner

Detection code written in Python looks like this.

import cv2

# Load the trained classifier (cascade.xml)
Cascade = cv2.CascadeClassifier('../model/2/cascade.xml')
# Load the image to run detection on
img = cv2.imread('image.jpg', cv2.IMREAD_COLOR)
if img is None:
    raise IOError('could not read image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Arguments: image, scaleFactor=1.1, minNeighbors=3
point = Cascade.detectMultiScale(gray, 1.1, 3)

if len(point) > 0:
    for x, y, w, h in point:
        # Draw a red frame (BGR order) around each detection
        cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)), (0, 0, 255), thickness=2)
else:
    print("no detect")

# Save the image with the detection frames drawn on it
cv2.imwrite('detected.jpg', img)

The saved detected.jpg is the input image with a frame drawn around each detection. If you want to save only the part of the image inside the frame, the following should do it.

# Crop the first detection (point must contain at least one rectangle)
x, y, w, h = point[0]
img = img[y:y + h, x:x + w]
cv2.imwrite('out.jpg', img)

point, the return value of Cascade.detectMultiScale, holds x, y, w, h for each detected rectangle.

- x: x coordinate of the rectangle's upper-left corner
- y: y coordinate of the rectangle's upper-left corner
- w: Width
- h: Height

Therefore, you can crop a detection with img[y:y + h, x:x + w].
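The same slicing extends to every detection, not just the first one. The sketch below uses a blank NumPy array and made-up rectangles in place of a real image and real detectMultiScale output, so it runs without a trained cascade. The function name `crop_detections` is hypothetical.

```python
import numpy as np


def crop_detections(img, rects):
    """Return one cropped sub-image per (x, y, w, h) rectangle."""
    crops = []
    for x, y, w, h in rects:
        # Rows are indexed by y and columns by x, hence img[y:y + h, x:x + w]
        crops.append(img[y:y + h, x:x + w].copy())
    return crops


# Stand-in data: a blank 100x200 BGR image and two hypothetical detections
img = np.zeros((100, 200, 3), dtype=np.uint8)
rects = np.array([[10, 20, 30, 40], [50, 0, 60, 25]])

for i, crop in enumerate(crop_detections(img, rects)):
    print(i, crop.shape)
```

With a real image, each crop could then be saved with cv2.imwrite under its own file name.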