Color images often carry more information than a given image-processing task needs. A common first step is therefore to convert the image to grayscale or to binarize it, which makes features such as characters and edges easier to extract. In this article, we binarize an image using OpenCV from Python.
Binarization is the process of converting an image into two levels, white and black. A threshold is chosen in advance; pixels whose value is larger than the threshold become white, and all other pixels become black. Reference: What is binarization? Weblio Dictionary
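The rule described above can be sketched in a few lines of NumPy. This is only an illustration on a tiny made-up array; the pixel values and the threshold of 100 are assumptions chosen for the example, not values from the sample image.

```python
import numpy as np

# A tiny 2x3 "image" with made-up pixel values in the range 0-255.
pixels = np.array([[10, 100, 101],
                   [200, 99, 255]], dtype=np.uint8)

threshold = 100  # illustrative value

# Pixels strictly above the threshold become white (255), all others black (0).
binary = np.where(pixels > threshold, 255, 0).astype(np.uint8)
print(binary)
# [[  0   0 255]
#  [255   0 255]]
```

Note that the pixel equal to the threshold (100) ends up black, because the comparison is "larger than", not "larger than or equal to".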
The code runs on Google Colaboratory. The Python version is shown below.
import platform
print("python " + platform.python_version())
# python 3.6.9
Now let's write the code. First, import OpenCV.
import cv2
In addition, import the following to display the image in Colaboratory.
from google.colab.patches import cv2_imshow
Prepare a sample image as well. This time, we will use a free image from Pixabay.
Now, let's display the prepared sample image.
img = cv2.imread(path) #path specifies where the image is placed
cv2_imshow(img)
Grayscale is a way of representing an image using only shades of gray, ranging from black (the weakest intensity) to white (the strongest). Reference: [Wikipedia](https://ja.wikipedia.org/wiki/%E3%82%B0%E3%83%AC%E3%83%BC%E3%82%B9%E3%82%B1%E3%83%BC%E3%83%AB)
The image can be loaded in grayscale and displayed as follows. The second argument 0 passed to cv2.imread has the same value as cv2.IMREAD_GRAYSCALE.
img_gray = cv2.imread(path, 0)
cv2_imshow(img_gray)
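To give a feel for what a grayscale conversion does, here is a minimal NumPy sketch that mixes the three color channels with the commonly used luma weights (0.299 R + 0.587 G + 0.114 B). This is an illustration of the idea, not what cv2.imread runs internally; the result of loading with flag 0 may differ slightly depending on the image codec. The array `bgr` is made-up data.

```python
import numpy as np

# A made-up 1x3 image in OpenCV's BGR channel order:
# one pure blue, one pure green, and one pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Luma weights for the B, G, R channels, in that order.
weights = np.array([0.114, 0.587, 0.299])

# Weighted sum over the channel axis, rounded back to 8-bit values.
gray = np.rint(bgr.astype(np.float64) @ weights).astype(np.uint8)
print(gray)
# [[ 29 150  76]]
```

Green contributes the most to perceived brightness and blue the least, which is why the three pure colors map to quite different gray levels.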
As described above, binarization converts an image into only two values, white and black. This differs from grayscale, which represents intermediate levels between black and white. A threshold is chosen; pixels whose value is larger than the threshold become white, and all other pixels become black.
Now let's display a binarized image. Binarization requires a **grayscale image** as input. As a trial, let's set the threshold to 100.
threshold = 100
ret, img_th = cv2.threshold(img_gray, threshold, 255, cv2.THRESH_BINARY)
print(ret)
# 100.0
cv2_imshow(img_th)
cv2.threshold returns two values. The first is the threshold that was used and the second is the binarized image. Since we specified 100 as the threshold, 100.0 is returned as the first value.
Now let's display the image binarized with several thresholds.
_, img1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY)
_, img2 = cv2.threshold(img_gray, 50, 255, cv2.THRESH_BINARY)
_, img3 = cv2.threshold(img_gray, 100, 255, cv2.THRESH_BINARY)
_, img4 = cv2.threshold(img_gray, 150, 255, cv2.THRESH_BINARY)
_, img5 = cv2.threshold(img_gray, 200, 255, cv2.THRESH_BINARY)
_, img6 = cv2.threshold(img_gray, 250, 255, cv2.THRESH_BINARY)
imgs_1 = cv2.hconcat([img1, img2, img3])
imgs_2 = cv2.hconcat([img4, img5, img6])
imgs = cv2.vconcat([imgs_1, imgs_2])
cv2_imshow(imgs)
From left to right, the thresholds are 0, 50, 100 (top row) and 150, 200, 250 (bottom row).
Above, we tried several thresholds and inspected the output. Finding an appropriate threshold this way takes repeated trial and error. Otsu's binarization (also called "Otsu's method") addresses this problem by choosing the threshold from the image's histogram (brightness distribution). Roughly speaking, **it picks a good threshold automatically**.
An image binarized with Otsu's method can be displayed as follows.
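# THRESH_OTSU is combined with THRESH_BINARY; the threshold argument (0) is ignored.
ret, img_otsu = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(ret)
# 126.0
cv2_imshow(img_otsu)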
The first return value of cv2.threshold is the threshold. This time, the threshold set by Otsu's binarization is 126.
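For readers curious how a threshold can be derived from a histogram, here is a minimal NumPy sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the variance between the "dark" and "bright" classes. This is an illustrative implementation, not OpenCV's code, and the function name `otsu_threshold` and the sample data are assumptions made for this example.

```python
import numpy as np


def otsu_threshold(gray):
    """Illustrative Otsu threshold for a uint8 array (not OpenCV's implementation)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = prob[: t + 1].sum()   # weight of the class "pixel <= t"
        w1 = prob[t + 1 :].sum()   # weight of the class "pixel > t"
        if w0 == 0.0 or w1 == 0.0:
            continue  # one class is empty, so this split is meaningless
        mu0 = (levels[: t + 1] * prob[: t + 1]).sum() / w0
        mu1 = (levels[t + 1 :] * prob[t + 1 :]).sum() / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Made-up bimodal data: a dark cluster (40-60) and a bright cluster (190-210).
dark = np.repeat(np.arange(40, 61), 10)
bright = np.repeat(np.arange(190, 211), 10)
sample = np.concatenate([dark, bright]).astype(np.uint8)

t = otsu_threshold(sample)
print(t)  # a value that falls between the two clusters
```

Because the two clusters are well separated, any threshold in the gap between them splits the data equally well; on a real image the histogram is usually less clean and the maximum is more sharply defined.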
Both simple binarization and Otsu's binarization apply a single threshold to the entire image. Adaptive thresholding, in contrast, binarizes the image while varying the threshold from place to place. Some regions of an image may be especially dark or bright, so a single threshold for the whole image is not always appropriate. Adaptive thresholding is effective in such cases.
An image binarized with adaptive thresholding can be displayed as follows.
img_adap = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 1)
cv2_imshow(img_adap)
Here, the 3rd and 4th arguments of cv2.adaptiveThreshold specify how the threshold is computed and applied. In this case, cv2.ADAPTIVE_THRESH_MEAN_C (the threshold is based on the mean of the neighboring pixels) and cv2.THRESH_BINARY are used. For details, please refer to the official documentation.
The fifth argument, 3, is the block size, that is, the size of the neighborhood used to compute the threshold for each pixel. It must be an odd number greater than 1. The sixth argument, 1, is a subtraction constant, which is subtracted from the computed mean to give the final threshold.
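The roles of the block size and the subtraction constant may be easier to see in a small NumPy sketch of mean-based adaptive thresholding. This is an illustration under stated assumptions, not OpenCV's implementation: the function name `adaptive_mean_threshold`, the edge-replicating border handling, and the test image are all chosen for this example.

```python
import numpy as np


def adaptive_mean_threshold(gray, block_size=3, c=1):
    """Illustrative mean-based adaptive thresholding (not OpenCV's code)."""
    if block_size <= 1 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number greater than 1")
    r = block_size // 2
    # Repeat edge pixels so every pixel has a full neighborhood
    # (a border-handling assumption made for this sketch).
    padded = np.pad(gray.astype(np.float64), r, mode="edge")
    h, w = gray.shape
    local_sum = np.zeros((h, w), dtype=np.float64)
    # Sum the block_size x block_size neighborhood by adding shifted copies.
    for dy in range(block_size):
        for dx in range(block_size):
            local_sum += padded[dy:dy + h, dx:dx + w]
    # Per-pixel threshold: neighborhood mean minus the subtraction constant.
    local_thresh = local_sum / (block_size ** 2) - c
    return np.where(gray > local_thresh, 255, 0).astype(np.uint8)


# Made-up image: dark left half, bright right half, one darker spot in each.
img = np.full((5, 10), 20, dtype=np.uint8)
img[:, 5:] = 200
img[2, 2] = 5      # dark spot on the dark side
img[2, 7] = 150    # dark spot on the bright side

out = adaptive_mean_threshold(img, block_size=3, c=1)
print(out)
```

Both spots come out black even though one of them (150) is far brighter than the entire left half of the image. A single global threshold could not isolate both: any value that separates the spot at 150 from its surroundings would turn the whole left half black.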
As a trial, let's display images binarized with various block sizes.
img1 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 1)
img2 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 5, 1)
img3 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 7, 1)
img4 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 1)
img5 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 1)
img6 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 13, 1)
imgs_1 = cv2.hconcat([img1, img2, img3])
imgs_2 = cv2.hconcat([img4, img5, img6])
imgs = cv2.vconcat([imgs_1, imgs_2])
cv2_imshow(imgs)
From the upper left, the block sizes are 3, 5, 7 (upper), 9, 11, 13 (lower).
In this article, we binarized an image using OpenCV from Python.
If you want to extract characters, edges, and similar features from an image, give binarization a try.
For more details on binarization, please refer to the following.