Overview

I implemented Chapter 14, "Image Noise Removal", of the sparse modeling book.

The following methods were compared:

Wavelet shrinkage
Overlapping patch-based DCT shrinkage
Overlapping patch-based DCT shrinkage with learned shrinkage curves
Overlapping patch-based DCT shrinkage with globally learned shrinkage curves
OMP noise removal with redundant DCT dictionary
OMP noise removal with K-SVD dictionary
NL-means
BM3D

Notebooks: ch14-01.ipynb, ch14-02.ipynb, ch14-03.ipynb, ch14-04.ipynb, ch14-05.ipynb

Results

The numbers are the [peak signal-to-noise ratio (PSNR)](https://ja.wikipedia.org/wiki/peak signal to noise ratio) in dB.

Although K-SVD takes a lot of computation time, NL-means and BM3D performed better, which made me laugh. Maybe it's an implementation issue ...

Method

Test image

Gaussian noise with σ = 20 was added to the Barbara image to create the test image. The noise was then removed with each method.
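The test setup can be sketched as follows. This is a minimal illustration, not the notebook code: the function names `add_gaussian_noise` and `psnr` are hypothetical, a synthetic gradient stands in for the Barbara image, and a peak value of 255 is assumed for the PSNR.

```python
import numpy as np

def add_gaussian_noise(image, sigma, seed=0):
    """Return the image plus zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 is an assumption for 8-bit images)."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical stand-in for the Barbara image: a 64x64 horizontal gradient.
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))
noisy = add_gaussian_noise(clean, sigma=20.0)
print(f"PSNR of the noisy image: {psnr(clean, noisy):.2f} dB")
```

With σ = 20 and a peak of 255, the noisy image should sit near 10·log10(255² / 20²) ≈ 22.1 dB, which is the baseline each method has to beat.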

Wavelet shrinkage

The image was wavelet-transformed and the coefficients were hard-thresholded. Performance varied with the threshold.
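As an illustration of wavelet hard thresholding, here is a small NumPy-only sketch. The text does not say which wavelet or how many levels were used, so a single-level orthonormal Haar transform is assumed here; the function names and the threshold of 3σ are also illustrative choices.

```python
import numpy as np

def haar2d(x):
    """One-level orthonormal 2D Haar transform (both image sides must be even)."""
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2.0)   # horizontal averages
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2.0)   # horizontal differences
    rows = np.hstack([lo, hi])
    lo = (rows[0::2, :] + rows[1::2, :]) / np.sqrt(2.0)   # vertical averages
    hi = (rows[0::2, :] - rows[1::2, :]) / np.sqrt(2.0)   # vertical differences
    return np.vstack([lo, hi])

def ihaar2d(c):
    """Inverse of haar2d."""
    h, w = c.shape
    lo, hi = c[: h // 2, :], c[h // 2:, :]
    rows = np.empty_like(c)
    rows[0::2, :] = (lo + hi) / np.sqrt(2.0)
    rows[1::2, :] = (lo - hi) / np.sqrt(2.0)
    lo, hi = rows[:, : w // 2], rows[:, w // 2:]
    x = np.empty_like(c)
    x[:, 0::2] = (lo + hi) / np.sqrt(2.0)
    x[:, 1::2] = (lo - hi) / np.sqrt(2.0)
    return x

def wavelet_hard_threshold(noisy, threshold):
    """Zero every detail coefficient whose magnitude is below the threshold."""
    c = haar2d(noisy.astype(np.float64))
    h, w = c.shape
    keep = np.abs(c) >= threshold
    keep[: h // 2, : w // 2] = True   # leave the approximation band untouched
    return ihaar2d(c * keep)

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))   # synthetic stand-in image
noisy = clean + rng.normal(0.0, 20.0, clean.shape)
denoised = wavelet_hard_threshold(noisy, threshold=60.0)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

Sweeping `threshold` over a range of values and recording the PSNR is one way to observe the threshold dependence described above.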

Overlapping patch-based DCT shrinkage

8x8 patches were extracted from the image. Each patch was DCT-transformed and hard-thresholded, and the overlapping patches were then averaged. Performance varied with the threshold.
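The patch-based procedure can be sketched in NumPy as below. It assumes an orthonormal DCT-II, a stride of one pixel between patches, and thresholding of every coefficient including DC; none of these details are stated in the text, and the function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n, so that coefficients = C @ signal."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def patch_dct_denoise(noisy, threshold, patch=8):
    """Hard-threshold the 2D DCT of every overlapping patch, then average overlaps."""
    c = dct_matrix(patch)
    h, w = noisy.shape
    acc = np.zeros((h, w))      # sum of denoised patches per pixel
    count = np.zeros((h, w))    # number of patches covering each pixel
    for y in range(h - patch + 1):
        for x in range(w - patch + 1):
            block = noisy[y:y + patch, x:x + patch]
            coef = c @ block @ c.T                       # forward 2D DCT
            coef = coef * (np.abs(coef) >= threshold)    # hard thresholding
            acc[y:y + patch, x:x + patch] += c.T @ coef @ c   # inverse 2D DCT
            count[y:y + patch, x:x + patch] += 1.0
    return acc / count

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 255.0, 32), (32, 1))   # synthetic stand-in image
noisy = clean + rng.normal(0.0, 20.0, clean.shape)
denoised = patch_dct_denoise(noisy, threshold=60.0)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

Because every pixel is covered by up to 64 patches, the averaging step alone suppresses a good part of the residual noise left by the per-patch thresholding.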

Shrinkage curve learning

Thresholding can be regarded as a curve that maps input values to output values. Using polynomial fitting, the optimal shrinkage curve was learned from pairs of noisy and noise-free patches.

$F_{local}(S) = \sum_{k=1}^{M}||p_{k}^{0}-AS\{A^{T}p_{k}\}||^{2}_{2}$

$S$ is the thresholding operation, $A^{T}$ is the DCT, $p_{k}^{0}$ is the noise-free patch, $p_{k}$ is the corresponding noisy patch, and $M$ is the total number of training patches. The parameters of $S$ (the polynomial coefficients) that minimize $F_{local}$ were computed by the least squares method.

A separate shrinkage curve is learned for each DCT coefficient. The patch size was $6 \times 6$. Since the non-redundant DCT is used, there are also $6 \times 6$ coefficients after the DCT, so the number of shrinkage curves is 36.
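One way to carry out this fit is sketched below. Because the non-redundant DCT is orthonormal, $||p^{0} - AS\{A^{T}p\}||^2 = ||A^{T}p^{0} - S\{A^{T}p\}||^2$, so the problem splits into 36 independent one-dimensional polynomial fits. The polynomial degree, the function names, and the synthetic training data are assumptions made for illustration only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def learn_shrinkage_curves(clean_patches, noisy_patches, patch=6, degree=5):
    """Fit one polynomial per DCT coefficient; inputs have shape (M, patch*patch)."""
    analysis = np.kron(dct_matrix(patch), dct_matrix(patch))   # A^T for row-major patches
    u = noisy_patches @ analysis.T    # DCT coefficients of the noisy patches
    v = clean_patches @ analysis.T    # DCT coefficients of the clean patches
    # Least-squares polynomial fit v ~ S_j(u), independently for each coefficient j.
    return np.array([np.polyfit(u[:, j], v[:, j], degree) for j in range(u.shape[1])])

def apply_shrinkage_curves(noisy_patches, curves, patch=6):
    """Compute A S{A^T p} for every noisy patch p."""
    analysis = np.kron(dct_matrix(patch), dct_matrix(patch))
    u = noisy_patches @ analysis.T
    shrunk = np.column_stack(
        [np.polyval(curves[j], u[:, j]) for j in range(u.shape[1])]
    )
    return shrunk @ analysis          # back to the pixel domain

# Synthetic training pairs (hypothetical): DCT coefficients with decaying energy.
rng = np.random.default_rng(0)
analysis = np.kron(dct_matrix(6), dct_matrix(6))
scales = 1.0 / (1.0 + np.arange(36))
clean_patches = (rng.laplace(0.0, 1.0, (2000, 36)) * scales) @ analysis
noisy_patches = clean_patches + rng.normal(0.0, 20.0 / 128.0, clean_patches.shape)

curves = learn_shrinkage_curves(clean_patches, noisy_patches)
restored = apply_shrinkage_curves(noisy_patches, curves)
mse_noisy = np.mean((noisy_patches - clean_patches) ** 2)
mse_restored = np.mean((restored - clean_patches) ** 2)
print(curves.shape, mse_noisy, mse_restored)
```

Each row of `curves` holds the polynomial coefficients of one curve (highest degree first, as returned by `np.polyfit`), so plotting `np.polyval(curves[j], t)` over a range of `t` shows the learned curve for coefficient `j`.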

Training data for shrinkage curve learning

Patches were extracted from a $200 \times 200$ region of lena and used as training data. Pixel values were normalized by subtracting 127 and dividing by 128.
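The data preparation might look like the sketch below. A random 8-bit array stands in for the lena crop, and the helper names are hypothetical; only the region size, the $6 \times 6$ patch size, and the normalization constants come from the text.

```python
import numpy as np

def normalize(pixels):
    """Map 8-bit pixel values to roughly [-1, 1] by subtracting 127 and dividing by 128."""
    return (pixels.astype(np.float64) - 127.0) / 128.0

def extract_patches(image, patch=6):
    """Return every overlapping patch, flattened row-major, with shape (num_patches, patch*patch)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (patch, patch))
    return windows.reshape(-1, patch * patch)

# Hypothetical stand-in for the 200x200 crop of lena.
rng = np.random.default_rng(0)
region = rng.integers(0, 256, size=(200, 200), dtype=np.uint8)

patches = extract_patches(normalize(region))
print(patches.shape)   # one row per patch position
```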

Result

Each cell shows the shrinkage curve for one DCT coefficient.

Global shrinkage curve learning

$F_{global}(S) = ||y_{0} - \frac{1}{n}\sum_{k=1}^{M}R_{k}^{T}AS\{A^{T}p_{k}\}||^{2}_{2}$

The parameters of the shrinkage curve that minimize $F_{global}$ were computed. $R_{k}$ is the operator that extracts the $k$-th patch from the image. The slope of the learned shrinkage curve is almost 0, but it seems to be usable for the time being ... Since the DC component becomes 0, the result is rescaled in post-processing. (My implementation may be wrong ...)

OMP noise removal with redundant DCT dictionary

8x8 patches were extracted from the image. Each patch was sparse-coded by OMP over the redundant DCT dictionary, and the overlapping patches were averaged. Finally, a weighted average with the noisy image was taken.

The number of non-zero elements in the sparse representation obtained by OMP was $k_0 = 4$. The OMP tolerance was $\epsilon = 8^2 \times 20^2 \times 1.15$. The weighted average used a weight of 0.5 for the noisy image and a weight of 1 for the denoised image.

Redundant DCT dictionary: an $8 \times 8$ patch is represented with $16 \times 16$ components.
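The dictionary and the sparse coding step can be sketched as follows. The dictionary construction (sampled cosines, mean removal for the non-constant atoms, Kronecker product) is a commonly used recipe and is an assumption here, as are the function names; the values $k_0 = 4$ and $\epsilon = 8^2 \times 20^2 \times 1.15$ are taken from the text, with $\epsilon$ interpreted as a bound on the squared residual norm.

```python
import numpy as np

def redundant_dct_dictionary(patch=8, atoms_1d=16):
    """Overcomplete 2D DCT dictionary of shape (patch**2, atoms_1d**2) with unit-norm atoms."""
    i = np.arange(patch)[:, None]
    k = np.arange(atoms_1d)[None, :]
    d1 = np.cos(np.pi * i * k / atoms_1d)                 # 1D atoms as columns
    d1[:, 1:] -= d1[:, 1:].mean(axis=0, keepdims=True)    # remove mean except for DC atom
    d1 /= np.linalg.norm(d1, axis=0, keepdims=True)
    return np.kron(d1, d1)                                # separable 2D atoms

def omp(dictionary, signal, k0, eps):
    """Orthogonal matching pursuit: stop at k0 atoms or when ||residual||^2 <= eps."""
    residual = signal.astype(np.float64).copy()
    support = []
    sol = np.zeros(0)
    while len(support) < k0 and residual @ residual > eps:
        j = int(np.argmax(np.abs(dictionary.T @ residual)))   # best-matching atom
        if j in support:
            break
        support.append(j)
        # Re-fit all selected coefficients by least squares.
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    coef = np.zeros(dictionary.shape[1])
    if support:
        coef[support] = sol
    return coef

rng = np.random.default_rng(0)
dictionary = redundant_dct_dictionary()
# Hypothetical test patch: two atoms plus Gaussian noise with sigma = 20.
clean_patch = 600.0 * dictionary[:, 3] + 400.0 * dictionary[:, 40]
noisy_patch = clean_patch + rng.normal(0.0, 20.0, 64)

eps = 8 ** 2 * 20 ** 2 * 1.15
coef = omp(dictionary, noisy_patch, k0=4, eps=eps)
denoised_patch = dictionary @ coef
print(np.count_nonzero(coef))
```

In a full pipeline, `denoised_patch` would be computed for every overlapping patch and accumulated in the same way as in the patch-based DCT method before the weighted average with the noisy image.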

OMP noise removal with K-SVD dictionary

Patches were extracted from the noisy image, and a dictionary was learned from them by K-SVD. Using the learned dictionary, noise removal was performed in the same way as with the redundant DCT dictionary.
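For reference, a compact sketch of the K-SVD iteration is given below: sparse coding with OMP, followed by an atom-by-atom update using a rank-one SVD of the residual. It is a simplified illustration on synthetic data, not the notebook implementation; initialization from random training signals, the fixed iteration count, and all names are assumptions.

```python
import numpy as np

def omp(dictionary, signal, k0):
    """Orthogonal matching pursuit with at most k0 atoms."""
    residual = signal.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k0):
        j = int(np.argmax(np.abs(dictionary.T @ residual)))
        if j in support:
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    coef = np.zeros(dictionary.shape[1])
    if support:
        coef[support] = sol
    return coef

def ksvd(signals, n_atoms, k0, n_iter=5, seed=0):
    """signals has shape (dim, N). Returns the dictionary (dim, n_atoms) and codes (n_atoms, N)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(signals.shape[1], n_atoms, replace=False)
    d = signals[:, idx].astype(np.float64)            # initialize atoms from the data
    d /= np.maximum(np.linalg.norm(d, axis=0, keepdims=True), 1e-12)
    codes = np.zeros((n_atoms, signals.shape[1]))
    for _ in range(n_iter):
        # Sparse coding stage.
        codes = np.column_stack([omp(d, s, k0) for s in signals.T])
        # Dictionary update stage, one atom at a time.
        for j in range(n_atoms):
            users = np.nonzero(codes[j])[0]           # signals that use atom j
            if users.size == 0:
                continue
            codes[j, users] = 0.0
            err = signals[:, users] - d @ codes[:, users]   # residual without atom j
            u, s, vt = np.linalg.svd(err, full_matrices=False)
            d[:, j] = u[:, 0]                         # best rank-one fit: new atom
            codes[j, users] = s[0] * vt[0]            # and its coefficients
    return d, codes

# Synthetic data: each signal is a combination of 3 atoms of a hidden dictionary.
rng = np.random.default_rng(1)
true_dict = rng.normal(size=(16, 24))
true_dict /= np.linalg.norm(true_dict, axis=0, keepdims=True)
signals = np.zeros((16, 200))
for n in range(200):
    atoms = rng.choice(24, 3, replace=False)
    signals[:, n] = true_dict[:, atoms] @ rng.normal(size=3)

learned, codes = ksvd(signals, n_atoms=24, k0=3, n_iter=3)
rel_err = np.linalg.norm(signals - learned @ codes) / np.linalg.norm(signals)
print(rel_err)
```

The repeated OMP pass over all patches in every iteration is where most of the computation time goes, which is consistent with the long run time noted for K-SVD.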

K-SVD dictionary

NL-means

This is the famous NL-means by Buades et al. From the perspective of dictionary learning, NL-means can be seen as an extreme form of dictionary learning, with a different dictionary for each pixel.

Consider a search window centered on the pixel of interest. Think of the set of patches centered on each pixel in the search window as a dictionary. The coefficient of each atom is calculated from the squared error between that atom and the patch centered on the pixel of interest (the patch of interest). The result is an approximate representation of the patch of interest.
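The per-pixel view can be sketched as below. The text only says that the weights are based on the squared error; the exponential kernel `exp(-d² / h²)`, the filtering parameter `h`, the window sizes, and the reflective padding are assumptions chosen for illustration.

```python
import numpy as np

def nl_means_pixel(image, y, x, patch=3, search=7, h=20.0):
    """Estimate pixel (y, x) as a weighted average of the pixels in its search window."""
    r, s = patch // 2, search // 2
    padded = np.pad(image.astype(np.float64), r + s, mode="reflect")
    cy, cx = y + r + s, x + r + s                      # position in the padded image
    ref = padded[cy - r:cy + r + 1, cx - r:cx + r + 1]  # patch of interest
    weights, values = [], []
    for dy in range(-s, s + 1):
        for dx in range(-s, s + 1):
            qy, qx = cy + dy, cx + dx
            atom = padded[qy - r:qy + r + 1, qx - r:qx + r + 1]   # one "dictionary atom"
            dist2 = np.mean((ref - atom) ** 2)          # mean squared error to the reference
            weights.append(np.exp(-dist2 / h ** 2))     # similar patches get larger weights
            values.append(padded[qy, qx])               # center pixel of the atom
    weights = np.array(weights)
    return float(weights @ np.array(values) / weights.sum())

rng = np.random.default_rng(0)
clean = np.full((16, 16), 100.0)                        # synthetic flat image
noisy = clean + rng.normal(0.0, 20.0, clean.shape)
denoised = np.array(
    [[nl_means_pixel(noisy, y, x) for x in range(16)] for y in range(16)]
)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

The normalized weights play the role of the coefficients over the local dictionary, and a new dictionary is implicitly built for every pixel.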

From this point of view, both dictionary learning and NL-means can be improved.

BM3D

This is the famous BM3D by Dabov et al. It is regarded as one of the strongest methods against Gaussian noise. BM3D can also be seen from the perspective of dictionary learning.

By block matching (BM), patches similar to the patch of interest are collected within the search window and stacked to form a 3D patch. The 3D patch is transformed (wavelet, DCT, etc.), and the noise is removed by hard thresholding and Wiener shrinkage (collaborative filtering).
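The block matching and hard thresholding steps for a single reference patch can be sketched as follows. This is a heavily simplified illustration: it uses a separable orthonormal DCT along all three axes, omits the Wiener stage and the aggregation of overlapping estimates, and all names and parameter values are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def collaborative_hard_threshold(image, y, x, patch=8, search=16,
                                 n_similar=8, threshold=54.0):
    """Group patches similar to the one at (y, x) and hard-threshold their 3D DCT."""
    img = image.astype(np.float64)
    h, w = img.shape
    ref = img[y:y + patch, x:x + patch]
    # Block matching: squared distance to every candidate in the search window.
    candidates = []
    for qy in range(max(0, y - search), min(h - patch, y + search) + 1):
        for qx in range(max(0, x - search), min(w - patch, x + search) + 1):
            block = img[qy:qy + patch, qx:qx + patch]
            candidates.append((np.sum((ref - block) ** 2), qy, qx))
    candidates.sort(key=lambda t: t[0])
    positions = [(qy, qx) for _, qy, qx in candidates[:n_similar]]
    stack = np.stack([img[qy:qy + patch, qx:qx + patch] for qy, qx in positions])
    # Separable 3D transform: DCT across the stack axis and both patch axes.
    c3, c2 = dct_matrix(len(positions)), dct_matrix(patch)
    coef = np.einsum("ka,ib,jc,abc->kij", c3, c2, c2, stack)
    coef[np.abs(coef) < threshold] = 0.0               # hard thresholding
    filtered = np.einsum("ka,ib,jc,kij->abc", c3, c2, c2, coef)   # inverse transform
    return filtered, positions

rng = np.random.default_rng(0)
clean = np.full((48, 48), 100.0)                        # synthetic flat image
noisy = clean + rng.normal(0.0, 20.0, clean.shape)
filtered, positions = collaborative_hard_threshold(noisy, y=20, x=20)
stack_noisy = np.stack([noisy[qy:qy + 8, qx:qx + 8] for qy, qx in positions])
mse_noisy = np.mean((stack_noisy - 100.0) ** 2)
mse_filtered = np.mean((filtered - 100.0) ** 2)
print(mse_noisy, mse_filtered)
```

Because similar patches are stacked, the transform along the third axis concentrates the shared structure into few coefficients, which is what makes the joint thresholding more effective than thresholding each patch separately.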

This idea leads to structured dictionary learning and to combinations of clustering and dictionary learning.

Summary

Relative to their computation time, NL-means and BM3D performed better than OMP noise removal with a K-SVD dictionary.
Dictionary learning and sparse coding are interesting in theory.
Building on the per-pixel local dictionaries of NL-means and the collaborative filtering of BM3D, combinations of clustering and dictionary learning are being studied.
Computation time remains a problem, but will applying the ideas of NL-means and BM3D to dictionary learning and sparse coding surpass BM3D in the future?
