I want to do something like this: given something you want to detect in a photo, have the program report its coordinates.
With OpenCV, you create multiple weak learners from HOG features (features built from brightness gradients) or Haar-like features (features built from brightness differences between image regions) together with correct labels, and combine them by boosting to make the decision.
OpenCV can be installed quickly into an Anaconda environment set up with pyenv:
conda install -c https://conda.binstar.org/jjhelmus opencv
If you search for pyenv or anaconda, you will find many guides for setting them up.
- 0 Directory structure
- 1 Create the positive (correct answer) data information for training
- 2 Create the negative (incorrect) data information for training
- 3 Create the positive vector
- 4 Create the classifier (learner)
Arbitrary directory/
├data/
│ ├pos/ [Collected in step 1]
│ │ ├xxx001.jpg
│ │ ├xxx002.jpg
│ │ ├ ...
│ │ └xxx100.jpg
│ ├neg/ [Collected in step 2]
│ │ ├yyy001.jpg
│ │ ├yyy002.jpg
│ │ ├ ...
│ │ └yyy100.jpg
│ └model/
│   ├params.xml [Created in step 4]
│   ├stage0.xml [Created in step 4]
│   ├stage1.xml [Created in step 4]
│   ├stage2.xml [Created in step 4]
│   └stageX.xml [Created in step 4]
└src/
  ├positive.dat [Created in step 1]
  ├negative.dat [Created in step 2]
  ├positive.vec [Created in step 3]
  └create_cascade.sh [Used in steps 3 and 4]
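As a minimal sketch, the directories in the layout above could be created from Python like this. The function name make_layout and the root argument are placeholders of my own, not anything provided by OpenCV.

```python
import os

def make_layout(root):
    """Create data/pos, data/neg, data/model and src under root."""
    subdirs = [
        os.path.join("data", "pos"),
        os.path.join("data", "neg"),
        os.path.join("data", "model"),
        "src",
    ]
    created = []
    for sub in subdirs:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling make_layout("/some/working/dir") once before step 1 should be enough; the images and .dat files still have to be put in place by hand.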
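The positive (correct answer) data needs the following information for each image.
- Location of the original image
- Number of objects you want to detect in the image
- x coordinate (top-left corner of the object)
- y coordinate (top-left corner of the object)
- Width
- Height
Save this in a single file [positive.dat], with the fields separated by half-width (single-byte) spaces.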
{positive.dat}
/Path/To/positive/xxx001.jpg 1 200 50 50 50
/Path/To/positive/xxx002.jpg 2 150 30 40 36 230 300 55 60
・ ・ ・
/Path/To/positive/xxx100.jpg 1 150 30 40 36
If an image contains two objects you want to identify, the format is as follows.
Image path 2 [1st x coordinate] [1st y coordinate] [1st width] [1st height] [2nd x coordinate] [2nd y coordinate] [2nd width] [2nd height]
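If the bounding boxes are already available in a script, a line in this format can be assembled mechanically. The following is only a sketch: format_info_line is a hypothetical helper of my own, and it assumes each box is given as (x, y, width, height) in pixels.

```python
def format_info_line(image_path, boxes):
    """Build one positive.dat line from an image path and (x, y, w, h) boxes."""
    fields = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        fields.extend(str(int(v)) for v in (x, y, w, h))
    # Fields are separated by single half-width spaces.
    return " ".join(fields)

line = format_info_line(
    "/Path/To/positive/xxx002.jpg",
    [(150, 30, 40, 36), (230, 300, 55, 60)],
)
print(line)
```

With these inputs the printed line is the same as the xxx002.jpg line in the positive.dat example.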
Collecting the training images comes down to sheer effort... There do seem to be auxiliary tools for it, though.
Negative (incorrect) data does not need to be as complicated as the positive data. List the paths of images that do not contain what you want to recognize in a single file [negative.dat] and you're done.
{negative.dat}
/Path/To/negative/yyy001.jpg
/Path/To/negative/yyy002.jpg
・ ・ ・
/Path/To/negative/yyy100.jpg
It is fine if the numbers of positive and negative images differ. If possible, include negative images of various sizes, not only images the same size as the frame you want to detect in the positives. This is because, at detection time, the cascade scans the picture with rectangles of various sizes to decide what is in it.
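Since negative.dat is just a list of paths, it can be generated rather than typed. Here is a rough sketch; write_negative_list and the set of image extensions are my own assumptions, so adjust them to your files.

```python
import os

# Assumed set of image extensions; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")

def write_negative_list(neg_dir, out_path):
    """Write the absolute path of every image in neg_dir to out_path, one per line."""
    names = sorted(
        name for name in os.listdir(neg_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    paths = [os.path.abspath(os.path.join(neg_dir, name)) for name in names]
    with open(out_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return paths
```

For the layout used here, something like write_negative_list("data/neg", "src/negative.dat") would be the call. Absolute paths are written so that the list does not depend on the directory the training command is run from.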
{create_cascade.sh}
# Command to create positive.vec (-w and -h must match the values passed to opencv_traincascade)
opencv_createsamples -info positive.dat -vec positive.vec -num 100 -w 50 -h 50
- info: Specify the dat file created in step 1.
- vec: Specify the output vector file name.
- num: Number of positive samples to generate (here, the number of lines in positive.dat, one per positive image).
- w: Sample width in pixels.
- h: Sample height in pixels.
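When some lines of positive.dat list more than one object, the number of annotated objects is larger than the number of lines, so it can help to count them before choosing -num. The sketch below assumes that each annotated object becomes one sample and that image paths contain no spaces; count_samples is a name I made up for illustration.

```python
def count_samples(info_lines):
    """Sum the object counts (second field) over all positive.dat lines."""
    total = 0
    for line in info_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        count = int(fields[1])
        # Each object needs four numbers: x, y, width, height.
        if len(fields) != 2 + 4 * count:
            raise ValueError("field count does not match object count: %r" % line)
        total += count
    return total

example = [
    "/Path/To/positive/xxx001.jpg 1 200 50 50 50",
    "/Path/To/positive/xxx002.jpg 2 150 30 40 36 230 300 55 60",
    "/Path/To/positive/xxx100.jpg 1 150 30 40 36",
]
print(count_samples(example))  # 4 objects across 3 lines
```

To check a real file, pass open("positive.dat") instead of the example list.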
{create_cascade.sh}
#model creation command
opencv_traincascade -data /Path/To/model -vec /Path/To/src/positive.vec -bg /Path/To/src/negative.dat -numPos 100 -numNeg 100 -featureType HOG -maxFalseAlarmRate 0.1 -w 50 -h 50 -minHitRate 0.97 -numStages 17
- data: Specify the storage location of the model files.
- vec: Specify the location of positive.vec.
- bg: Specify the location of negative.dat.
- numPos: Specify the number of positives.
- numNeg: Specify the number of negatives.
- featureType: HOG for HOG features, LBP for LBP features; Haar-like features are used if not specified.
- maxFalseAlarmRate: Allowable false positive rate at each training stage.
- w: Width.
- h: Height.
- minHitRate: Minimum detection rate allowed at each training stage.
- numStages: Number of stages to create.
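opencv_traincascade expects the same -w and -h that were used when the vec file was created, so one way to keep the two commands in step is to build both from a single width and height. This is only a sketch: build_commands is a hypothetical helper, and the paths and option values are simply the ones used in this article.

```python
def build_commands(width, height, num_samples=100, num_pos=100, num_neg=100):
    """Return (createsamples_args, traincascade_args) sharing one sample size."""
    size = ["-w", str(width), "-h", str(height)]
    create = [
        "opencv_createsamples", "-info", "positive.dat",
        "-vec", "positive.vec", "-num", str(num_samples),
    ] + size
    train = [
        "opencv_traincascade", "-data", "/Path/To/model",
        "-vec", "/Path/To/src/positive.vec",
        "-bg", "/Path/To/src/negative.dat",
        "-numPos", str(num_pos), "-numNeg", str(num_neg),
        "-featureType", "HOG", "-maxFalseAlarmRate", "0.1",
        "-minHitRate", "0.97", "-numStages", "17",
    ] + size
    return create, train

create_args, train_args = build_commands(50, 50)
print(" ".join(create_args))
print(" ".join(train_args))
```

To actually run them from Python, the two lists could be passed to subprocess.run(args, check=True) in that order.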
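If numPos is too large for the number of samples in the vec file, training may stop partway with an error like the following. Lowering -numPos is the usual fix.
{opencv_traincascade_error.log}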
OpenCV Error: Bad argument (Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
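A rule of thumb that is often quoted for this error (I have not verified it against the OpenCV source, so treat it as an assumption) is that the vec file should hold at least numPos + (numStages - 1) * (1 - minHitRate) * numPos samples, plus however many samples get skipped along the way. The sketch below turns that inequality around to estimate a safe numPos; max_num_pos and its skipped argument are hypothetical names.

```python
import math

def max_num_pos(vec_count, num_stages, min_hit_rate, skipped=0):
    """Largest numPos satisfying
    vec_count >= numPos + (num_stages - 1) * (1 - min_hit_rate) * numPos + skipped.
    """
    per_pos = 1.0 + (num_stages - 1) * (1.0 - min_hit_rate)
    return int(math.floor((vec_count - skipped) / per_pos))

# 100 samples in the vec file, 17 stages, minHitRate 0.97
print(max_num_pos(100, 17, 0.97))  # 67
```

Under this estimate, a vec file with 100 samples would call for -numPos of roughly 67 or less rather than 100.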
When training runs, many stageX.xml files are created in /Path/To/model while output like the following is printed. (The log below is from a run with numPos 150 and numNeg 200.)
{opencv_traincascade.log}
PARAMETERS:
cascadeDirName: ../../../model
vecFileName: positive.vec
bgFileName: negative.dat
numPos: 150
numNeg: 200
numStages: 17
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
stageType: BOOST
featureType: HOG
sampleWidth: 50
sampleHeight: 50
boostType: GAB
minHitRate: 0.97
maxFalseAlarmRate: 0.1
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 150 : 150
NEG count : acceptanceRatio 200 : 1
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 0.285|
+----+---------+---------+
| 3| 1| 0.285|
+----+---------+---------+
| 4| 0.993333| 0.135|
+----+---------+---------+
| 5| 1| 0.145|
+----+---------+---------+
| 6| 1| 0.095|
+----+---------+---------+
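In the table, N is the number of weak classifiers added to the stage so far, HR is the hit rate based on the stage threshold (v_hitrate / numpos), and FA is the false alarm rate based on the stage threshold (v_falsealarm / numneg). A stage ends once FA falls to maxFalseAlarmRate or below while HR stays at minHitRate or above; in the log above this happens at N = 6, where FA is 0.095.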
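Because minHitRate and maxFalseAlarmRate apply to each stage, a rough estimate for the whole cascade is to raise them to the power of the number of stages. This is a back-of-the-envelope sketch that assumes every stage just meets its targets and that the stages behave independently; overall_rates is a name I chose for illustration.

```python
def overall_rates(min_hit_rate, max_false_alarm_rate, num_stages):
    """Estimate whole-cascade rates from the per-stage targets."""
    hit = min_hit_rate ** num_stages
    false_alarm = max_false_alarm_rate ** num_stages
    return hit, false_alarm

# Values used in this article: minHitRate 0.97, maxFalseAlarmRate 0.1, 17 stages
hit, fa = overall_rates(0.97, 0.1, 17)
print("overall hit rate about %.3f, overall false alarm rate about %.1e" % (hit, fa))
```

With these settings the estimated overall hit rate comes out at roughly 0.6, which is one reason to keep minHitRate high when using many stages.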
Finally, a file called cascade.xml is created, which is used for prediction.
Written in Python, the detection code looks like this.
import cv2

# Specify the trained classifier (cascade.xml)
Cascade = cv2.CascadeClassifier('../model/2/cascade.xml')
# Specify the image to run detection on
img = cv2.imread('image.jpg', cv2.IMREAD_COLOR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Arguments: image, scaleFactor=1.1, minNeighbors=3
point = Cascade.detectMultiScale(gray, 1.1, 3)
if len(point) > 0:
    # Draw a red frame around every detection
    for x, y, w, h in point:
        cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)), (0, 0, 255), thickness=2)
else:
    print("no detect")
cv2.imwrite('detected.jpg', img)
The saved detected.jpg is the image with a frame drawn around each detection. If you want to save only the region inside the first frame, the following should work.
img = img[point[0][1]:point[0][1]+point[0][3], point[0][0]:point[0][0]+point[0][2]]
cv2.imwrite('out.jpg', img)
point, the return value of Cascade.detectMultiScale, holds x, y, w, h for each detection.
- x: x coordinate of the rectangle's top-left corner
- y: y coordinate of the rectangle's top-left corner
- w: Width
- h: Height
Therefore, you can crop a detection by specifying img[y:y+h, x:x+w].
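To crop every detection rather than only the first, the same indexing can be wrapped in a small helper. This is a sketch; rect_to_slices is a hypothetical name, and the only thing it relies on is the (x, y, w, h) order described above.

```python
def rect_to_slices(rect):
    """Convert one (x, y, w, h) detection into (row_slice, col_slice)."""
    x, y, w, h = (int(v) for v in rect)
    # Rows are indexed by y, columns by x.
    return slice(y, y + h), slice(x, x + w)

rows, cols = rect_to_slices((200, 50, 50, 50))
print(rows, cols)

# Usage with the detection result (assumes img and point from the code above):
# for i, rect in enumerate(point):
#     rows, cols = rect_to_slices(rect)
#     cv2.imwrite('out_%d.jpg' % i, img[rows, cols])
```

Note that if the frames have already been drawn on img, the crops will include the frame lines, so crop from an unmodified copy if that matters.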