This article exists largely thanks to OpenCV's Python support: I tried to recognize a red object in the image from a video camera, and it worked well enough to write up. Since OpenCV also has C/C++ bindings you could use those instead, but with Python the amount of code is surprisingly small, which is impressive. So here is the write-up.
"Algorithm" sounds fancy, but this one is not that lofty. Let me walk through the mechanism for detecting red objects.
RGB is the standard way of expressing colors: expressing the brightness of each of R, G, and B in 0 to 255 gives 24-bit color. So, can we use RGB to judge "red"? It is actually quite difficult. For example, R = 255, G = 0, B = 0 is red no matter who looks at it. But what about R = 10, G = 0, B = 0? That feels more like black than red. In other words, judging hue directly from RGB is hard.
Therefore, we use a representation called the HSV color space instead. (For details, see [Wikipedia](https://ja.wikipedia.org/wiki/HSV色空間).)
Now let's start writing the function that converts to the HSV color space. Importing OpenCV and numpy is boilerplate.
```python
import cv2
import numpy as np

def find_rect_of_target_color(image):
    # convert from BGR to HSV, with hue scaled to the full 0-255 range
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV_FULL)
    h = hsv[:, :, 0]  # hue
    s = hsv[:, :, 1]  # saturation
```
Up to this point we have extracted the H and S components of the image. H is Hue: it wraps around a circle in 360 degrees, red → green → blue → back to red. As I wrote at the beginning, judging hue from RGB is hard, but once converted to HSV you can judge it very easily just by looking at H. In OpenCV, H, S, and V are each held in 256 steps (when COLOR_BGR2HSV_FULL is specified), so the original 360-degree hue is rounded down to 256 steps. Keep that in mind as we proceed.
"Red" in hue terms spans roughly 280° to 28°, including the purplish range (it wraps around past 0°). Rescaled to 256 steps, that is roughly 200 to 20. So the red range is where H < 20 or H > 200.
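The rescaling is just proportional arithmetic. Here is a quick sketch of the conversion (the exact rounding OpenCV applies internally may differ by ±1, so treat the edges as approximate):

```python
# map a hue in degrees (0-360) onto OpenCV's 0-255 full-range hue scale
def deg_to_cv_hue(deg):
    return round(deg * 255 / 360)

print(deg_to_cv_hue(280))  # 198, lower edge on the purplish side
print(deg_to_cv_hue(28))   # 20, upper edge on the orange side
```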
Also pay attention to the color's vividness. Even if the hue is red, a pale enough color approaches white or gray, so we need to check the S (Saturation) component at the same time. It also ranges 0 to 255, and the larger the value, the more "vivid" the color. Since we are judging red here, let's add the condition S > 128.
If you write these together in numpy style, it will look like the following.
```python
    # mask: 255 where the pixel counts as "red", 0 elsewhere
    mask = np.zeros(h.shape, dtype=np.uint8)
    mask[((h < 20) | (h > 200)) & (s > 128)] = 255
```
Now we have mask data indicating the red parts (255 for red, 0 for non-red). But that's not the end: this mask only tells us which individual dots in the image are reddish. Since the goal is to detect a red object, we want to recognize a blob of dots of a certain size; as it stands we just have a point cloud with no cohesion. To group these points into blobs, let's first find the "contours". Once you have a contour, you know the blob it encloses.
```python
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
```
The combination of OpenCV and Python is great: you can write that in just one line. Now let's look at the contents of `contours`. In fact, in the process of extracting the contours, the points have already been grouped into blobs.
```python
    rects = []
    for contour in contours:
        approx = cv2.convexHull(contour)
        rect = cv2.boundingRect(approx)
        rects.append(rect)
```
I'll say it again: the combination of OpenCV and Python is great. It's really concise. `cv2.convexHull()` is a function that computes a convex shape enclosing a jagged blob. Each contour holds the contour's point data, but being a contour it can have a very complicated shape, like a rias coastline. `cv2.convexHull()` turns it into a 2D polygon that wraps around it like a bag. The return value `approx` is an array of (X, Y) points. Then `cv2.boundingRect()` computes the rectangle that this bag-shaped polygon fits inside, returned as (x, y, width, height).
At this point the point-cloud information has been reduced to a list of rectangles, one per blob. Now let's connect the pieces written above, in order.
```python
import cv2
import numpy as np

def find_rect_of_target_color(image):
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV_FULL)
    h = hsv[:, :, 0]
    s = hsv[:, :, 1]
    mask = np.zeros(h.shape, dtype=np.uint8)
    mask[((h < 20) | (h > 200)) & (s > 128)] = 255
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    rects = []
    for contour in contours:
        approx = cv2.convexHull(contour)
        rect = cv2.boundingRect(approx)
        rects.append(np.array(rect))
    return rects
```
That's all. With just this, we have a function that takes an image and returns an array of rectangles locating its red blobs. Now that it's written, let's verify that it works, processing the video-camera image in real time.
```python
if __name__ == "__main__":
    capture = cv2.VideoCapture(0)  # open the default camera
    while cv2.waitKey(30) < 0:
        _, frame = capture.read()
        rects = find_rect_of_target_color(frame)
        for rect in rects:
            cv2.rectangle(frame, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]),
                          (0, 0, 255), thickness=2)
        cv2.imshow('red', frame)
    capture.release()
    cv2.destroyAllWindows()
```
How was it? Did you get the result you expected? My guess is that it was rather disappointing: rectangles large and small all get recognized as "red objects". Treating noise-sized specks as "objects" is nonsense, so let's thin them out by size. My recommendation here is to keep only the largest rectangle as "the red object". When you tried the detection, you probably held a red object up in front of the camera; that is the thing you wanted to detect, and usually the thing you want to detect is the largest blob in the frame.
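Picking the largest is just `max` with an area key. A tiny standalone illustration with made-up rectangles:

```python
# (x, y, width, height) rectangles; areas are 100, 5000, and 24
rects = [(0, 0, 10, 10), (30, 40, 100, 50), (5, 5, 4, 6)]
biggest = max(rects, key=lambda r: r[2] * r[3])
print(biggest)  # (30, 40, 100, 50)
```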
Change the code as follows to pick the rectangle with the largest area from the list.
```python
if __name__ == "__main__":
    capture = cv2.VideoCapture(0)  # open the default camera
    while cv2.waitKey(30) < 0:
        _, frame = capture.read()
        rects = find_rect_of_target_color(frame)
        if len(rects) > 0:
            rect = max(rects, key=(lambda x: x[2] * x[3]))
            cv2.rectangle(frame, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]),
                          (0, 0, 255), thickness=2)
        cv2.imshow('red', frame)
    capture.release()
    cv2.destroyAllWindows()
```
That's it for this time. The combination of Python + OpenCV really is convenient, and I'm impressed all over again.