[PYTHON] Image binarization using linear discriminant analysis

Goal of this article

Understand Fisher's linear discriminant analysis and be able to binarize images using it.

Environment to use

Software name version
Python 3.4 or 3.5
Pillow 3.1
Numpy 1.10
Matplotlib 1.5.0

Fisher's linear discriminant analysis

Overview

Objective function

Find the parameter $ w $ that maximizes the function below.

\begin{equation}
J(w) = \frac{w^{T}S_{B}w}{w^{T}S_{W}w}
\end{equation}

In-class covariance matrix

Each class when $ N $ data is classified into $ k $ classes so that the number of data in each class is $ N_ {i} (i = 0, ..., k-1) $ The mean of the covariance matrix of.

\begin{equation}
S_{W}=\sum_{i=0}^{k-1}\frac{N_{i}}{N}S_{i}
\end{equation}

Interclass covariance matrix

A covariance matrix of the mean vectors for each class.

\begin{equation}
S_{B}=\sum_{i=0}^{k-1}\frac{N_{i}}{N}(\mu_{i}-\mu)^{T}(\mu_{i}-\mu)
\end{equation}

Relationship between covariance matrix of all data, intraclass covariance matrix, and interclass covariance matrix

The sum of the intraclass covariance matrix ($ S_ {W} ) and the interclass covariance matrix ( S_ {W} ) is equal to the covariance matrix ( S $) for all the data. Hereinafter, the covariance matrix of all the data will be referred to as a total covariance matrix.

\begin{align}
S_{W} + S_{B} &=\sum_{i=0}^{k-1}\frac{N_{i}}{N}S_{i}+\sum_{i=0}^{k-1}\frac{N_{i}}{N}(\mu_{i}-\mu)^{T}(\mu_{i}-\mu)\\
&=\frac{1}{N}\sum_{i=0}^{k-1}(N_{i}(\sum_{j \in C_{i}}\frac{x_{j}^{T}x_{j}}{N_{i}}-\mu_{i}^{T}\mu_{i})+N_{i}\mu_{i}^{T}\mu_{i})-\mu^{T}\mu\\
&=\frac{1}{N}\sum_{i=0}^{k-1}(\sum_{j \in C_{i}}x_{j}^{T}x_{j})-\mu^{T}\mu\\
&=\frac{1}{N}\sum_{l=0}^{N-1}x_{l}^{T}x_{l}-\mu^{T}\mu\\
&=S
\end{align} 

Determining parameters by Lagrangian

Even if you replace $ w $ in the objective function $ J (w) = \ frac {w ^ {T} S_ {B} w} {w ^ {T} S_ {W} w} $ with $ \ alpha w $ Since the value does not change, it can be replaced with $ w ^ {T} S_ {W} w = 1 $, and the problem of finding the parameter $ w $ that maximizes $ J (w) $ is $ w ^ {T. We come down to the problem of finding $ w $ that maximizes $ w ^ {T} S_ {B} w $ under the constraint of} S_ {W} w = 1 $.

\begin{equation}
argmax_{w}(w^{T}S_{B}w),
\hspace{3mm}subject\hspace{3mm}to\hspace{3mm}
w^{T}S_{W}w=1
\end{equation}

Convert this to a minimization problem.

\begin{equation}
argmin_{w}(-\frac{1}{2}w^{T}S_{B}w),
\hspace{3mm}subject\hspace{3mm}to\hspace{3mm}
w^{T}S_{W}w=1
\end{equation}

Then, the Lagrangian becomes as follows.

\begin{equation}
L=-\frac{1}{2}w^{T}S_{B}w+\frac{1}{2}\lambda (w^{T}S_{W}w - 1)
\end{equation}

Differentiate $ L $ with $ w $ and set it as $ 0 $ to get the following equation. This is called the generalized eigenvalue problem.

\begin{equation}
S_{B}w=\lambda S_{W}w
\end{equation}

Assuming $ S_ {W} $ is regular, multiply both sides by its inverse matrix $ S_ {W} ^ {-1} $ from the left to get the usual eigenvalue problem.

\begin{equation}
S_{W}^{-1}S_{B}w=\lambda w
\end{equation}

The eigenvector corresponding to the largest of the eigenvalues obtained as a result of solving the generalized eigenvalue problem is adopted as $ w $. This selection criterion is obtained by considering the dual problem of this optimization problem.

Image binarization using Fisher's discriminant analysis

Overview

Objective function

The objective function $ J (w) $ of Fisher's discriminant in the previous chapter is the following equation using the relationship of intraclass variance, interclass variance, and total variance, noting that the grayscale image data is one-dimensional. To get. Note that the parameter $ w $ is one-dimensional.

\begin{equation}
J(w) = \frac{\sigma_{B}}{\sigma_{W}}=\frac{\sigma_{B}}{\sigma -\sigma_{B}}
\end{equation}

Threshold determination

Since the total variance $ \ sigma $ is constant, we can determine the threshold $ t $ that makes up $ \ sigma_ {B} $ that maximizes $ J (w) $.

$ \ sigma_ {B} $ can be calculated by the following equation.

\begin{equation}
\sigma_{B}=\frac{N_{0}}{N}(\mu_{0}-\mu)^2 + \frac{N_{1}}{N}(\mu_{1}-\mu)^{2}
\end{equation}

If the minimum and maximum pixel values of a grayscale image are $ x_ {min} $ and $ x_ {max} $, respectively, then $ x_ {min} \ leq t \ leq x_ {max} $. Calculate $ \ sigma_ {B} $ for all such $ t $, find the one that maximizes $ J (w) $, and use it as the threshold.

program

# This file is part of "Junya's self learning project about digital image processing."
#
# "Junya's self learning project about digital image processing"
# is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# "Junya's self learning project about digital image processing"
# is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Foobar.  If not, see <http://www.gnu.org/licenses/>.
#
# (c) Junya Kaneko <[email protected]>

from PIL import Image
import numpy as np
from matplotlib import pyplot as plt


def discriminant_analysis(image):
    _image = image.reshape(image.shape[0] * image.shape[1])
    ave = np.average(_image)
    t = j0 = 0
    for _t in range(np.min(_image) + 1, np.max(_image) + 1):
        blacks = _image[_image < _t]
        whites = _image[_image >= _t]
        sigma_b = len(blacks) * np.power(np.average(blacks) - ave, 2) /len(_image) + len(whites) * np.power((np.average(whites) - ave), 2) / len(_image)
        j1 = sigma_b / (np.var(_image) - sigma_b)
        if j0 < j1:
            t = _t
            j0 = j1
    return t


if __name__ == '__main__':
    image = np.array(Image.open('img/test.jpg').convert('L'), dtype=np.uint8)
    bin_image = np.array(image)
    t = discriminant_analysis(image)

    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            bin_image[i, j] = 255 if image[i, j] >= t else 0

    plt.subplot(1, 2, 1)
    plt.title('Original')
    plt.imshow(image, cmap='Greys_r')

    plt.subplot(1, 2, 2)
    plt.title('Binary')
    plt.imshow(bin_image, cmap='Greys_r')

    plt.show()

The source code can be found on Github.

The execution result is as shown in the figure below. The original image is [Dictionary Wiki](http://ja.characters.wikia.com/wiki/%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB: % E4% B8% 89% E4% BD% 93% E7% BF% 92% E5% AD% 97% E3% 83% BB% E6% A5% B7_-_% E6% 88% 91.jpg) used.

References

Recommended Posts

Image binarization using linear discriminant analysis
Machine Learning: Supervised --Linear Discriminant Analysis
Image processing with Python 100 knocks # 4 Binarization of Otsu (discriminant analysis method)
Data analysis using Python 0
Image segmentation using U-net
Orthologous analysis using OrthoFinder
Japanese morphological analysis using Janome
Linear Predictive Analysis (LPC) (= formant analysis)
Cloud image prediction using convLSTM
Linear regression method using Numpy
Data analysis using python pandas