Hi, I'm Ramu. We will implement Otsu's binarization (discriminant analysis method), which is a method to automatically determine the threshold value used for binarization.
Binarization converts an image into a two-color image containing only black and white. Once a threshold is chosen, pixels whose value is below the threshold are set to black (0) and pixels whose value is at or above it are set to white (255). That much was covered in the previous article on binarization. This time, we look at a method for determining the threshold automatically.
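As a quick reminder of the fixed-threshold case, here is a minimal sketch using NumPy only. The helper name `binarize` and the tiny sample array are made up for illustration; the black/white convention follows the code later in this article (below the threshold becomes 0, everything else becomes 255).

```python
import numpy as np

def binarize(gray, th):
    """Pixels below th become 0 (black); the rest become 255 (white)."""
    return np.where(gray < th, 0, 255).astype(np.uint8)

# Hypothetical 2x3 grayscale "image" used only for this illustration
gray = np.array([[10, 127, 128], [200, 50, 255]], dtype=np.uint8)
mono = binarize(gray, 128)
print(mono)
```

The only open question is how to pick `th`, which is what Otsu's method answers.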
Otsu's binarization splits the pixels into two classes according to a threshold. The threshold that maximizes the degree of separation between the two classes is the one used for binarization. The degree of separation is computed from the following quantities.
Separation: $X = \dfrac{\sigma_b^2}{\sigma_w^2}$
Within-class variance: $\sigma_w^2 = \omega_0 \sigma_0^2 + \omega_1 \sigma_1^2$
Between-class variance: $\sigma_b^2 = \dfrac{\omega_0 \omega_1}{(\omega_0 + \omega_1)^2} (M_0 - M_1)^2$
Number of pixels belonging to classes 0, 1: $\omega_0, \omega_1$
Variance of pixel values belonging to classes 0, 1: $\sigma_0^2, \sigma_1^2$
Average pixel values belonging to classes 0, 1: $M_0, M_1$
Average pixel value of the entire image: $M$
Total pixel values belonging to classes 0, 1: $P_0, P_1$
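To make the quantities above concrete, the following sketch evaluates the degree of separation for a single threshold on a handful of pixel values. The helper name `separation` and the sample values are assumptions for illustration, not part of the original implementation.

```python
import numpy as np

def separation(pixels, th):
    """Degree of separation X for one threshold (illustrative helper)."""
    pixels = np.asarray(pixels, dtype=np.float64)
    # Class 0: values below th, class 1: values at or above th
    g0, g1 = pixels[pixels < th], pixels[pixels >= th]
    w0, w1 = g0.size, g1.size
    if w0 == 0 or w1 == 0:
        return 0.0  # one class is empty, so there is nothing to separate
    # Within-class variance: w0*s0^2 + w1*s1^2
    sw_2 = w0 * g0.var() + w1 * g1.var()
    # Between-class variance: w0*w1/(w0+w1)^2 * (M0-M1)^2
    sb_2 = (w0 * w1) / (w0 + w1) ** 2 * (g0.mean() - g1.mean()) ** 2
    if sw_2 == 0:
        return float("inf")  # both classes are perfectly uniform
    return sb_2 / sw_2

# Hypothetical pixel values: a dark cluster and a bright cluster
pixels = [10, 12, 14, 200, 202, 204]
x_good = separation(pixels, 100)  # splits the two clusters cleanly
x_bad = separation(pixels, 13)    # cuts through the dark cluster
print(x_good, x_bad)
```

A threshold that falls between the two clusters gives a far larger separation than one that cuts through a cluster, which is the property the search exploits.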
In summary, for thresholds from 0 to 255, compute the degree of separation 256 times and choose the threshold that maximizes it.
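The 256-way search can also be sketched from a histogram instead of re-slicing the image for every threshold. This is an alternative sketch, not the implementation used in this article: the function name `otsu_threshold` is hypothetical, and it maximizes the between-class variance only. That picks the same threshold as maximizing the separation, because the within-class and between-class variances add up to the fixed total variance of the image, so the ratio grows exactly when the between-class variance grows.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold (0..255) that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    # w0[th], p0[th]: pixel count and pixel-value total of class 0 (values < th)
    w0 = np.concatenate(([0.0], np.cumsum(hist)[:-1]))
    p0 = np.concatenate(([0.0], np.cumsum(hist * levels)[:-1]))
    # Class 1 gets whatever is left
    w1 = hist.sum() - w0
    p1 = (hist * levels).sum() - p0
    # Thresholds that leave a class empty have no defined mean
    valid = (w0 > 0) & (w1 > 0)
    m0 = np.divide(p0, w0, out=np.zeros(256), where=valid)
    m1 = np.divide(p1, w1, out=np.zeros(256), where=valid)
    sb_2 = w0 * w1 / (w0 + w1) ** 2 * (m0 - m1) ** 2
    sb_2[~valid] = 0
    # argmax returns the first maximum, i.e. the smallest best threshold
    return int(np.argmax(sb_2))

# Hypothetical sample data: a dark cluster and a bright cluster
sample = np.array([10, 12, 14, 200, 202, 204], dtype=np.uint8)
t = otsu_threshold(sample)
print(t)
```

Any threshold between the two clusters separates this sample equally well, so the function reports the smallest such value.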
otsuBinarization.py
import numpy as np
import cv2
import matplotlib.pyplot as plt
plt.gray()
def otsuBinarization(img):
  #Image copy
  dst = img.copy()
  #Grayscale
  gray = cv2.cvtColor(dst, cv2.COLOR_BGR2GRAY)
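  #gray.shape is (rows, columns), i.e. (height, width)
  h,w = gray.shape
  #Best separation found so far and the threshold that produced it
  Max = 0
  t = 0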
  #Average pixel value of the entire image
  M = np.mean(gray)
  #Applies to all 256 threshold values
  for th in range(256):
    #Classification
    g0,g1 = gray[gray<th],gray[gray>=th]
    #Number of pixels
    w0,w1 = len(g0),len(g1)
    #Skip thresholds that leave a class empty (mean and variance are undefined)
    if (w0 == 0 or w1 == 0):
      continue
    #Pixel value variance
    s0_2,s1_2 = g0.var(),g1.var()
    #Pixel value average
    m0,m1 = g0.mean(),g1.mean()
    #Pixel value total
    p0,p1 = g0.sum(),g1.sum()
    #Within-class variance
    sw_2 = w0*s0_2 + w1*s1_2
    #Between-class variance
    sb_2 = ((w0*w1) / ((w0+w1)*(w0+w1))) * ((m0-m1)*(m0-m1))
    #Separation (sw_2 is 0 only when both classes are perfectly uniform)
    if (sw_2 != 0):
      X = sb_2 / sw_2
    else:
      X = float('inf')
    if (Max < X):
      Max = X
      t = th
  #Binarization: below t becomes black (0), the rest white (255)
  gray = np.where(gray < t, 0, 255).astype(np.uint8)
  return gray
#Image reading
img = cv2.imread('image.jpg')
#Binarization of Otsu
mono = otsuBinarization(img)
#Save image
cv2.imwrite('result.jpg', mono)
#Image display
plt.imshow(mono)
plt.show()
 
 
The left image is the input, the center image is the output with the threshold manually set to 128, and the right image is the output of this implementation. Even with an automatically determined threshold, the binarized image looks natural. As an aside, my implementation never actually uses the average pixel value $M$ of the entire image.
If you have any questions, please feel free to contact me. imori_imori's GitHub has the official answer, so please check that as well. Also, I am a beginner at Python, so please be kind and leave a comment if you spot any mistakes.