[PYTHON] An introduction to OpenCV for machine learning

In order to perform machine learning, you often want to cut out only a specific object (area) from an image and recognize it or create learning data. In this article, I would like to introduce how to use OpenCV, which has so many functions, focusing on the functions used for such machine learning. Specifically, we will focus on the following modules.

image CVPR 2015 Tutorials

The basic cutting procedure is as follows. In the following, I will explain according to this process.

In addition, miniconda is used to build the OpenCV environment. If you install this and type the following command, the environment construction is already completed.

(* The name of the virtual environment does not have to be cv_env. Also, if it is Mac / Linux, activate will drop, so some support is required. For details, click here](http://qiita.com/icoxfog417/items / 950b8af9100b64c0d8f9))

The code introduced this time is published in the following repository. I hope you can refer to it as needed.

icoxfog417/cv_tutorial

Preprocessing

When detecting an object, it is convenient to have a "clear outline" and "continuous".

image

Effective methods for this are "threshold processing" and "filtering (blurring)". This section will focus on these two. In addition, since image processing is usually grayscaled in advance, that is also mentioned.

Grayscale

Since color information is rarely needed in image processing, it is very often grayscaled in advance. However, please note that RGB information is often required for final use in machine learning, so when cropping from an image, the color must be used first.

Grayscale color images with OpenCV is very easy. Just specify cv2.COLOR_BGR2GRAY in cv2.cvtColor.

import cv2


def to_grayscale(path):
    img = cv2.imread(path)
    grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return grayed

As the name of cv2.COLOR_BGR2GRAY suggests, the color information of the image loaded by cv2.imread is loaded in the order of BGR (blue-green-red). The variable that read the image is a matrix (of numpy), but if you check its size, it is as follows.

img = cv2.imread(IMAGE_PATH)
img.shape
>>> (348, 800, 3)

This means that the loaded image is represented by a 348x800x3 matrix. The image looks like the figure below.

image

In addition, matplotlib, which is often used to display images, expects images to come in in RGB. Therefore, if you put the image read by OpenCV into matplotlib as it is, it will be as follows (the original image is on the left, the one read by OpenCV is displayed by matplotlib as it is).

image Image Source

Therefore, when displaying with matplotlib, it is necessary to change the order of colors as follows.

def to_matplotlib_format(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

Threshold processing

Threshold processing is image processing based on whether or not a certain threshold is exceeded. For example, it is a process such as setting all areas where the brightness does not reach a certain value to 0. This makes it possible to drop the background and emphasize the outline, and process it as shown in the figure below (the left side is the original image, the right side is the one with threshold processing).

image

Threshold processing in OpenCV is executed by cv2.threshold can. The main parameters here are the threshold thresh, the upper limit of the value maxval, and the threshold processing type type.

The table below summarizes the types of type for threshold treatment and how the threshold (thresh) / upper limit (maxValue) is used at that time.

Threshold Type over thresh :arrow_up_small: under thresh :arrow_down_small:
THRESH_BINARY maxValue 0
THRESH_BINARY_INV 0 maxValue
THRESH_TRUNC threshold (as is)
THRESH_TOZERO (as is) 0
THRESH_TOZERO_INV 0 (as is)

(as is) means that the value of the original image is used as is. If you want to know more about threshold processing, the following materials will be helpful.

OpenCV Threshold ( Python , C++ )

In the bird image used this time, in addition to removing the blue background, the boundary (bright) of the bird's wings is made clear.

And finally, these two processing results are merged.

def binary_threshold(path):
    img = cv2.imread(path)
    grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    under_thresh = 105
    upper_thresh = 145
    maxValue = 255
    th, drop_back = cv2.threshold(grayed, under_thresh, maxValue, cv2.THRESH_BINARY)
    th, clarify_born = cv2.threshold(grayed, upper_thresh, maxValue, cv2.THRESH_BINARY_INV)
    merged = np.minimum(drop_back, clarify_born)
    return merged

If you are not sure what the value of thresh should be, you can check the brightness with a paint tool. On Windows, you can check it with the standard paint tool dropper.

image

If you use ʻadaptiveThreshold`, it will determine an appropriate threshold while looking at the surrounding pixels, so you may want to try it once. Please refer to the following document for details.

Image Thresholding

Threshold processing by color

Specific colors using cv2.inRange It is also possible to extract the part of. In the following, the blue part of the background is detected and masked.

image

def mask_blue(path):
    img = cv2.imread(path)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    blue_min = np.array([100, 170, 200], np.uint8)
    blue_max = np.array([120, 180, 255], np.uint8)
    
    blue_region = cv2.inRange(hsv, blue_min, blue_max)
    white = np.full(img.shape, 255, dtype=img.dtype)
    background = cv2.bitwise_and(white, white, mask=blue_region)  # detected blue area becomes white
    
    inv_mask = cv2.bitwise_not(blue_region)  # make mask for not-blue area
    extracted = cv2.bitwise_and(img, img, mask=inv_mask)
    
    masked = cv2.add(extracted, background)
    
    return masked

The blue_region obtained by cv2.inRange is the specified color region. Note that blue_region is represented in grayscale, with higher values (255 = closer to white) where it is found. In addition, when using cv2.inRange, it is necessary to change the image to HSV representation, and it is also necessary to specify the color range accordingly. What is HSV expression? It is easy to understand by looking at the figure below.

image HSL and HSV

However, the value of HSV specified by OpenCV is a little quirky, so it is quite difficult to estimate the value from the paint tool as described above.

General range of values OpenCV
H 0 - 360 0 - 180
S 0 - 100 0 - 255
V 0 - 100 0 - 255

Therefore, it is faster to actually look at the values in the matrix if the specification does not seem to work very well. You can cut out the matrix value (color value) of the specified area with the feeling of ʻimg [10:20, 10:20] `, so if you check it, you can specify it pinpointly (in fact, this time the paint value is It didn't help at all, so I specified it this way).

After that, the image is created by adding background, which makes the area of blue_region all white, and ʻextracted, which extracts the area other than blue_region. bitwise_and / bitwise_not` is a useful function for doing this kind of masking.

The above is the explanation of threshold processing.

Smoothing

If the outline of the image is not clear or the background is dark, the outline may not be removed or the background may remain even if threshold processing is applied. In the example below, the gravel at your feet remains fine and the outline is jagged.

image Piping plover chick with band at two weeks

In such cases, it is a good idea to use a filter for smoothing. To put it simply, filtering is a process that blurs an image, but by blurring an image, it is possible to detect only "points that are clearly visible even if they are blurred" and ignore points that disappear if they are blurred. I can do it. The following is an example of applying a Gaussian filter using Gaussian Blur (left figure) and then performing threshold processing (right figure).

def blur(img):
    filtered = cv2.GaussianBlur(img, (11, 11), 0)
    return filtered

image

The fine details of the image have been lost, but you can see that the characteristic parts remain together and the noise that was often in the background has disappeared. Please refer to the following OpenCV official documentation for filters other than Gaussian Blur.

Smoothing Images

Another technique used to smooth images is morphology. This is a method of removing noise and emphasizing contours by using image expansion / contraction processing. The following is an image of a typical method in morphology.

image Morphology

The theoretical details are omitted here, but you can think of it as a type of filter. OpenCV has cv2.dilate, cv2.erode which can perform the above morphology processing, and convenient cv2.morphologyEx which can apply Opening / Closing continuously. This time, I will use cv2.morphologyEx to perform smoothing.

def morph(img):
    kernel = np.ones((3, 3),np.uint8)
    opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel, iterations=2)
    return opened

image

This time, probably because the background color is dark, it became difficult to thicken the area with CLOSE, so I tried processing in the direction of making the distance between the areas as large as possible by OPEN. However, since noise still remains, it is processed in combination with a filter.

def morph_and_blur(img):
    kernel = np.ones((3, 3),np.uint8)
    m = cv2.GaussianBlur(img, (3, 3), 0)
    m = cv2.morphologyEx(m, cv2.MORPH_OPEN, kernel, iterations=2)
    m = cv2.GaussianBlur(m, (5, 5), 0)
    return m

image

It's more like information remains than simply filtering. Also, you can see that the areas that were previously connected are now firmly independent by applying Opening. The following is detailed about morphology processing, so please refer to it.

In fact, when the background is dark, it is quite difficult to clarify the area from there if it is grayscale. Therefore, if the background has a color that you can recognize, it is better to mask it with a color and then process it.

The above is the explanation of preprocessing. From here, I would like to finally detect the object from the image after preprocessing.

Object detection

Contour detection

Up to this point, I think that the object to be recognized has been clarified by preprocessing, so we will use it to detect the contour.

image

In OpenCV, you can easily detect contours by using cv2.findContours.

def detect_contour(path, min_size):
    contoured = cv2.imread(path)
    forcrop = cv2.imread(path)

    # make binary image
    birds = binary_threshold_for_birds(path)
    birds = cv2.bitwise_not(birds)
    
    # detect contour
    im2, contours, hierarchy = cv2.findContours(birds, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    crops = []
    # draw contour
    for c in contours:
        if cv2.contourArea(c) < min_size:
            continue
        
        # rectangle area
        x, y, w, h = cv2.boundingRect(c)
        x, y, w, h = padding_position(x, y, w, h, 5)
        
        # crop the image
        cropped = forcrop[y:(y + h), x:(x + w)]
        cropped = resize_image(cropped, (210, 210))
        crops.append(cropped)
        
        # draw contour
        cv2.drawContours(contoured, c, -1, (0, 0, 255), 3)  # contour
        cv2.rectangle(contoured, (x, y), (x + w, y + h), (0, 255, 0), 3)  #rectangle contour
                
    return contoured, crops


def padding_position(x, y, w, h, p):
    return x - p, y - p, w + p * 2, h + p * 2

binary_threshold_for_birds is a function for threshold processing of the bird image used this time (= preprocessing). Since the image output with this is the white background image introduced earlier, this is inverted and used for area detection. It is difficult to understand, but in the case of black and white, "white" has a higher value (255), so when performing contour detection, it is necessary to give an image whose contour is drawn in white as input.

All you have to do now is run cv2.findContours. The contours detected by this can be easily drawn on the image with cv2.drawContours. You can also use cv2.boundingRect to get the coordinates of a rectangle that fits the contour. However, this is a close battle, so this time I'm using padding_position to give some room to the surroundings. For cv2.findContours, please refer to the official document as well.

Contours : Getting Started

In addition, when the user is not automatic, it is possible to detect the object in the enclosed area by using a technique called graph cut. I won't go into details, but I think it's useful when creating tools for annotation, so if you are interested, please refer to the following.

Interactive Foreground Extraction using GrabCut Algorithm

Approximation of contour

OpenCV provides several functions that approximate the detected contour. For example, ʻapproxPolyDP` linearly approximates the detected contour, and if the contour is straight, it is recommended to use this to cut out. Below is the contour detected by the red dotted line, and the green line is a straight line approximation.

image Image source

def various_contours(path):
    color = cv2.imread(path)
    grayed = cv2.cvtColor(color, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(grayed, 218, 255, cv2.THRESH_BINARY)
    inv = cv2.bitwise_not(binary)    
    _, contours, _ = cv2.findContours(inv, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    for c in contours:
        if cv2.contourArea(c) < 90:
            continue
        
        epsilon = 0.01 * cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, epsilon, True)
        cv2.drawContours(color, c, -1, (0, 0, 255), 3)
        cv2.drawContours(color, [approx], -1, (0, 255, 0), 3)
        
    plt.imshow(cv2.cvtColor(color, cv2.COLOR_BGR2RGB))
    
various_contours(IMG_FOR_CONTOUR)

cv2.arcLength is the length of the contour, which is used to calculate the minimum straight line length ʻepsilon`. Now you can adjust how fine the straight line is. For other functions, please refer to the tutorial below for how to use and explain them.

Contour Features

Cutting out the detection area

Now that we know the area, in order to apply it to a machine learning model, we need to cut it out to the size that the model expects. To do this, I created a function called resize_image this time.

def resize_image(img, size):
    # size is enough to img
    img_size = img.shape[:2]
    if img_size[0] > size[1] or img_size[1] > size[0]:
        raise Exception("img is larger than size")
    
    # centering
    row = (size[1] - img_size[0]) // 2
    col = (size[0] - img_size[1]) // 2
    resized = np.zeros(list(size) + [img.shape[2]], dtype=np.uint8)
    resized[row:(row + img.shape[0]), col:(col + img.shape[1])] = img
        
    # filling
    mask = np.full(size, 255, dtype=np.uint8)
    mask[row:(row + img.shape[0]), col:(col + img.shape[1])] = 0
    filled = cv2.inpaint(resized, mask, 3, cv2.INPAINT_TELEA)    
    
    return filled

This function is formed by the following steps.

OpenCV also has a resize function, but if you use it, the cropped image will be forcibly adjusted to the specified size, and the image will be distorted. Therefore, this time, we take a method of preparing a canvas of a size that fits the cut out image, placing the cut out image in the center, and filling the surroundings. Cv2.inpaint used to fill in the blanks is originally a function to restore defects in the image. However, this time I am using it to fill the surroundings.

The image actually cut out is as follows. I think that it is almost exactly complemented, but the color of the beak on the second piece has grown a little. In these cases, you need to adjust the padding so that it is filled with just the background color.

image

Image alignment

The position on the image where the object is moved is an important point in recognition. CNN, which is used recently, will do it well even if it is slightly misaligned due to convolution, but if you correct it, the accuracy will be greatly improved. Therefore, here we will explain the position correction of the image after cropping.

The figure below is an example of aligning the images. The first row is the base image, and the second and subsequent rows are corrected to align with the first row image (the left side is before correction, the right side is after correction).

image Image Source: image 1, [image 2](http://www.publicdomainpictures.net/view- image.php? image = 51893 & picture = & jazyk = JP), image 3

After applying the correction, I think that the positions of the birds are almost the same. This is corrected using findTransformECC referring to the following site.

Image Alignment (ECC) in OpenCV ( C++ / Python )

def align(base_img, target_img, warp_mode=cv2.MOTION_TRANSLATION, number_of_iterations=5000, termination_eps=1e-10):
    base_gray = cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)
    target_gray = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
    
    # prepare transformation matrix
    if warp_mode == cv2.MOTION_HOMOGRAPHY:
        warp_matrix = np.eye(3, 3, dtype=np.float32)
    else :
        warp_matrix = np.eye(2, 3, dtype=np.float32)
    
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, number_of_iterations,  termination_eps)
    sz = base_img.shape
    
    # estimate transformation
    try:
        (cc, warp_matrix) = cv2.findTransformECC(base_gray, target_gray, warp_matrix, warp_mode, criteria)

        # execute transform
        if warp_mode == cv2.MOTION_HOMOGRAPHY :
            # Use warpPerspective for Homography 
            aligned = cv2.warpPerspective(target_img, warp_matrix, (sz[1], sz[0]), flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
        else :
            # Use warpAffine for Translation, Euclidean and Affine
            aligned = cv2.warpAffine(target_img, warp_matrix, (sz[1],sz[0]), flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
    
        return aligned
    except Exception as ex:
        print("can not align the image")
        return target_img

findTransformECC is simply a function that looks for similarities between two images and puts the estimation result of what kind of movement was done in warp_matrix. Originally, it is for analyzing what kind of movement occurred on continuous frames such as movies, so the positions may not be aligned unless the images are quite the same. The above also seem to be casually aligned, but it was difficult to choose a photo that could be correlated (the exception is that if there is no correlation, it will not converge). .. ..

If there are feature points (eyes, nose, mouth, etc.) that are common to all images such as the face, conversion can be applied based on the position of each feature point. ʻEstimateRigidTransform` can be used for this.

def face_align(base, base_position, target, target_position):
    sz = base.shape
    fsize = min(len(base_position), len(target_position))  # adjust feature size
    tform = cv2.estimateRigidTransform(target_position[:fsize], base_position[:fsize], False)
    aligned = cv2.warpAffine(target, tform, (sz[1], sz[0]))
    return aligned

Since there are cases where the eyes cannot be detected depending on the photo, the conversion is performed according to the smaller detected feature amount in the above (however, please note that in this case, the order in which the feature amount is inserted must be aligned. ).

image Image source

After the conversion, you can see that the positions of the faces are aligned. There is also a detailed introduction to face alignment below, so please refer to it.

Average Face : OpenCV ( C++ / Python ) Tutorial

Object recognition

In the above, contour detection was performed by itself, but OpenCV has a trained model for objects that often detect objects such as faces and bodies, and object recognition can be performed by using this. This trained model file is called Cascade Classifier, and you can also create your own. There are some that are open to the public, so if you are interested, please refer to them as they are summarized below.

Cascade Classifier is in (virtual environment folder) \ Library \ etc \ haarcascades when installed with pip (for Windows / miniconda. I think it depends on the environment). You may want to experiment to see if there is something that seems to suit your purpose. This time, I would like to follow the official tutorial below to detect faces.

Face Detection using Haar Cascades

The result of the actual detection is as follows. The detection of the right eye has failed probably because it is hidden by the hair. .. ..

image Image source

The code is almost as per the tutorial. Please note that the location of the Cascade File depends on the environment as described above (if the path is not successful, you will get an error such as ʻerror: (-215)! Empty () in function`).

def face_detection(path):
    face_cascade = cv2.CascadeClassifier(CASCADE_DIR + "/haarcascade_frontalface_default.xml")
    eye_cascade = cv2.CascadeClassifier(CASCADE_DIR + "/haarcascade_eye.xml")
    
    img = cv2.imread(path)
    grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    faces = face_cascade.detectMultiScale(grayed, 1.3, 5)
    for (x, y, w, h) in faces:
        img = cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        roi_gray = grayed[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]
        eyes = eye_cascade.detectMultiScale(roi_gray)
        for (ex, ey, ew, eh) in eyes:
            cv2.rectangle(roi_color, (ex,ey), (ex + ew, ey + eh), (0, 255, 0), 2)

    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

face_detection(IMG_FACE)

The basic flow is to read the file with cv2.CascadeClassifier, create Classifier, and detect with detectMultiScale. This time, only grayscale is used, but I think that more reliable detection can be achieved by performing the above-mentioned threshold processing. For cutting out the detected part, refer to "Cut out the detection area" in the previous section.

If the face is tilted, it will not be detected properly. There are two ways to do this: the eye / mouth is detected first to determine the tilt, or the image is simply rotated gradually to attempt detection. The trade-off is that the former is less computationally complicated but cumbersome, and the latter is easier but more computationally expensive. The following describes in detail the method of detecting while rotating the image, so please refer to it.

Preparing for learning

Up to this point, you have been able to cut out an image of the target object from the image. All you have to do now is put the collected images into a machine learning model. However, various preprocessing is required when inputting images to the learning model. This point is summarized below.

[Implementing Convolutional Neural Network / Preprocessing data](http://qiita.com/icoxfog417/items/5aa1b3f87bb294f84bac#%E3%83%87%E3%83%BC%E3%82%BF%E3%81] % AE% E5% 89% 8D% E5% 87% A6% E7% 90% 86)

To extract the points, the following processing is required.

That is all for the explanation. Please use OpenCV well and let them learn various things.

References

Recommended Posts

An introduction to OpenCV for machine learning
An introduction to Python for machine learning
An introduction to machine learning
An introduction to machine learning for bot developers
Introduction to machine learning
Super introduction to machine learning
[For beginners] Introduction to vectorization in machine learning
An introduction to machine learning from a simple perceptron
Introduction to machine learning Note writing
An introduction to Mercurial for non-engineers
Introduction to Machine Learning Library SHOGUN
An introduction to Python for non-engineers
Introduction to Machine Learning: How Models Work
Introduction to ClearML-Easy to manage machine learning experiments-
An introduction to Python for C programmers
Before the introduction to machine learning. ~ Technology required for machine learning other than machine learning ~
Python learning memo for machine learning by Chainer Chapter 10 Introduction to Cupy
Python learning memo for machine learning by Chainer Chapter 9 Introduction to scikit-learn
Introduction to OpenCV (python)-(2)
[Python] Easy introduction to machine learning with python (SVM)
[Super Introduction to Machine Learning] Learn Pytorch tutorials
An introduction to object-oriented programming for beginners by beginners
An introduction to statistical modeling for data analysis
[Super Introduction to Machine Learning] Learn Pytorch tutorials
An introduction to voice analysis for music apps
Take the free "Introduction to Python for Machine Learning" online until 4/27 application
Introduction to Deep Learning (1) --Chainer is explained in an easy-to-understand manner for beginners-
Data set for machine learning
[Learning memorandum] Introduction to vim
Japanese preprocessing for machine learning
Build an interactive environment for machine learning in Python
An introduction to private TensorFlow
Preparing to start "Python machine learning programming" (for macOS)
Introduction to Deep Learning ~ Learning Rules ~
An introduction to Python Programming
How to Introduce IPython (Python2) to Mac OS X-Preparation for Introduction to Machine Learning Theory-
An introduction to Bayesian optimization
Deep Reinforcement Learning 1 Introduction to Reinforcement Learning
Introduction to Python For, While
Introduction to Deep Learning ~ Backpropagation ~
Python learning notes for machine learning with Chainer Chapters 11 and 12 Introduction to Pandas Matplotlib
Everything for beginners to be able to do machine learning
I tried to build an environment for machine learning with Python (Mac OS X)
Rebuilding an environment for machine learning with Miniconda (Windows version)
Build an environment for machine learning using Python on MacOSX
Introduction to Machine Learning with scikit-learn-From data acquisition to parameter optimization
Made icrawler easier to use for machine learning data collection
For those who want to start machine learning with TensorFlow2
How to use machine learning for work? 03_Python coding procedure
Machine learning to learn with Nogizaka46 and Keyakizaka46 Part 1 Introduction
[Introduction to Reinforcement Learning] Reinforcement learning to try moving for the time being
Introduction to Deep Learning ~ Function Approximation ~
Introduction to Deep Learning ~ Coding Preparation ~
<For beginners> python library <For machine learning>
Machine learning meeting information for HRTech
[Recommended tagging for machine learning # 4] Machine learning script ...?
Introduction to Deep Learning ~ Dropout Edition ~
Introduction to Deep Learning ~ Forward Propagation ~
Introduction to Deep Learning ~ CNN Experiment ~
Introduction to image analysis opencv python
How to collect machine learning data