Problems and countermeasures for Otsu's binarization overflow in Python

Introduction

Otsu's binarization is an algorithm that automatically determines the threshold when binarizing an image. It is already implemented in libraries such as OpenCV, but I found an implementation written in Python and tried it out. I ran into the problem in the title, although it was probably caused by my particular execution environment.

Code

The code itself was not written by me; it is quoted from someone else's blog [1].

ots.py


import numpy as np

#Binarization of Otsu (gray argument is a grayscale image)
def threshold_otsu(gray, min_value=0, max_value=255):

    #Histogram calculation
    hist = [np.sum(gray == i) for i in range(256)]

    s_max = (0,-10)

    for th in range(256):
        
        #Calculate the number of pixels in class 1 and class 2
        n1 = sum(hist[:th])
        n2 = sum(hist[th:])
        
        #Calculate the average of class 1 and class 2 pixel values
        if n1 == 0 : mu1 = 0
        else : mu1 = sum([i * hist[i] for i in range(0,th)]) / n1   
        if n2 == 0 : mu2 = 0
        else : mu2 = sum([i * hist[i] for i in range(th, 256)]) / n2

        #Calculate the numerator of interclass variance
        s = n1 * n2 * (mu1 - mu2) ** 2

        #Record the numerator and threshold of interclass variance when the numerator of interclass variance is maximum
        if s > s_max[1]:
            s_max = (th, s)
    
    #Get the threshold when the interclass variance is maximum
    t = s_max[0]

    #Binarization processing with the calculated threshold
    gray[gray < t] = min_value
    gray[gray >= t] = max_value

    return gray

Problem

When I binarized the photo below, I got an overflow warning.

img5.jpg

RuntimeWarning: overflow encountered in long_scalars
  s = n1 * n2 * ((mu1 - mu2) ** 2)

In that run the threshold came out as 137, far from the 78 obtained with OpenCV's Otsu binarization.

Cause

Calling type(n1) and type(n2) returned <class 'numpy.int32'>. The likely cause is therefore that an intermediate result such as n1 * n2 exceeds the range of the int32 type (-2147483648 to 2147483647) [[2]](https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html).
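To make the wraparound concrete, here is a minimal sketch. The pixel counts are hypothetical values chosen only so that their product exceeds the int32 maximum; they are not taken from the image in this article.

```python
import warnings

import numpy as np

# Hypothetical pixel counts (not from the article's image); any pair whose
# product exceeds 2**31 - 1 shows the same effect.
n1 = np.int32(100_000)
n2 = np.int32(100_000)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the overflow RuntimeWarning
    wrapped = n1 * n2                # evaluated in int32, so it wraps around

exact = int(n1) * int(n2)            # Python ints have arbitrary precision
int32_max = np.iinfo(np.int32).max

print(type(wrapped), wrapped)
print(exact, exact > int32_max)
```

The wrapped value no longer reflects the true product, so comparing s across thresholds can pick the wrong maximum.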

Countermeasures

I came up with two countermeasures. The first is to change the type of n1 and n2 to float. Adding the single line shown below changes the type of hist to float64, and as a result n1 and n2 become floats as well.

python


#Histogram calculation
hist = [np.sum(gray == i) for i in range(256)]
hist = np.array(hist, dtype=np.float64)
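As a rough illustration of the first countermeasure, the sketch below computes only the threshold from a float64 histogram. The function name otsu_threshold_float, the use of np.bincount, and the synthetic two-level image are my own assumptions for the example; they are not part of the quoted code.

```python
import numpy as np


def otsu_threshold_float(gray):
    """Return an Otsu threshold for a uint8 image (hypothetical helper)."""
    # float64 histogram, so n1 * n2 cannot overflow an int32
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)

    best_th, best_s = 0, -1.0
    for th in range(256):
        n1 = hist[:th].sum()
        n2 = hist[th:].sum()
        if n1 == 0 or n2 == 0:
            continue  # one class is empty, nothing to compare
        mu1 = (levels[:th] * hist[:th]).sum() / n1
        mu2 = (levels[th:] * hist[th:]).sum() / n2
        s = n1 * n2 * (mu1 - mu2) ** 2  # numerator of interclass variance
        if s > best_s:
            best_th, best_s = th, s
    return best_th


# Synthetic image: half the pixels are 50, half are 200 (160000 pixels,
# so n1 * n2 would exceed the int32 range if computed in int32).
demo = np.full((400, 400), 50, dtype=np.uint8)
demo[:, 200:] = 200
t = otsu_threshold_float(demo)
print(t)
```

Unlike the quoted function, this sketch does not modify the input image; it only returns the threshold.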

The second is to use logarithms. Since only the relative magnitude of s matters, not its actual value, we can take the log of both sides and expand the right-hand side into a sum (the result of taking a log is a float). Because log is monotonically increasing, the ordering of the candidates does not change as long as the arguments are positive.

python


import math
...

#Calculate the numerator of interclass variance
#s = float(n1 * n2 * ((mu1 - mu2) ** 2))
        
#Skip thresholds where a class is empty or the means coincide (log(0) is undefined)
if n1 == 0 or n2 == 0 or mu1 == mu2:
    continue
s = math.log(n1) + math.log(n2) + 2 * math.log(abs(mu1 - mu2))
...
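The following small check illustrates why the log form selects the same threshold as the direct product. The candidate tuples (n1, n2, mu1, mu2) are made-up numbers for illustration, and log_score is a hypothetical helper name.

```python
import math


def log_score(n1, n2, mu1, mu2):
    """log(n1 * n2 * (mu1 - mu2)**2), written as a sum of logs."""
    if n1 == 0 or n2 == 0 or mu1 == mu2:
        return float("-inf")  # treat degenerate splits as the worst score
    return math.log(n1) + math.log(n2) + 2 * math.log(abs(mu1 - mu2))


# Made-up candidates: (n1, n2, mu1, mu2)
candidates = [
    (10, 90, 5.0, 120.0),
    (50, 50, 40.0, 180.0),
    (90, 10, 100.0, 250.0),
]

direct = [n1 * n2 * (mu1 - mu2) ** 2 for n1, n2, mu1, mu2 in candidates]
logged = [log_score(*c) for c in candidates]

best_direct = max(range(len(candidates)), key=direct.__getitem__)
best_logged = max(range(len(candidates)), key=logged.__getitem__)
print(best_direct, best_logged)
```

Both rankings agree because log is strictly increasing on positive numbers, while the summed logs stay small even when the product itself would be huge.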

With these changes the problem was resolved. The picture below is the result of running the program. otsu.jpg

The threshold was now 79, and I confirmed that the program works correctly. (The difference of one from OpenCV's 78 comes from the boundary convention: OpenCV sets pixels at or below the threshold to the minimum value, whereas this program does so for pixels strictly below the threshold.) Of the two countermeasures, the first is by far the easier, but the second seemed interesting, so I included it as well.
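As a quick sanity check of the boundary convention, the sketch below compares a strict comparison at 79 with an inclusive comparison at 78 over every possible 8-bit value. It uses only NumPy and does not call OpenCV; the array gray here is a stand-in, not the article's image.

```python
import numpy as np

# Stand-in data: every possible 8-bit pixel value exactly once.
gray = np.arange(256, dtype=np.uint8)

mask_strict = gray < 79      # this program's rule with t = 79
mask_inclusive = gray <= 78  # the at-or-below rule with threshold 78

same = np.array_equal(mask_strict, mask_inclusive)
print(same)
```

For integer pixel values the two rules select the same pixels, which is why thresholds of 79 and 78 describe the same binarization.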

Conclusion

The cause of the problem appears to have been a value exceeding the range of its data type. This was a useful lesson, because I often struggle with data types when doing image processing in Python.

Thank you for reading to the end. If you have any comments or suggestions, please do not hesitate to let me know.

Reference URL

[1] https://algorithm.joho.info/programming/python/opencv-otsu-thresholding-py/
[2] https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html
