Implemented Chapter 14, "Image Noise Removal", of the sparse modeling book, and compared the following methods.

Notebooks: ch14-01.ipynb, ch14-02.ipynb, ch14-03.ipynb, ch14-04.ipynb, ch14-05.ipynb
The numbers are [peak signal-to-noise ratio (PSNR)](https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio) [dB].
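For reference, PSNR against the clean image can be computed as below; a minimal sketch assuming 8-bit images (peak value 255).

```python
import numpy as np

def psnr(clean, noisy, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((np.asarray(clean, float) - np.asarray(noisy, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```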
Although K-SVD takes a lot of computation time, NL-means and BM3D were still stronger. Maybe my implementation is at fault...
Gaussian noise with σ = 20 was added to Barbara to produce the test image, and the noise was then removed with each method.
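For example (the file name `barbara.png` and the reader are assumptions, not the notebook's actual setup):

```python
import numpy as np
from imageio.v2 import imread  # any image reader works here

img = imread("barbara.png").astype(np.float64)       # grayscale Barbara, values in [0, 255]
rng = np.random.default_rng(0)
noisy = img + 20.0 * rng.standard_normal(img.shape)  # additive Gaussian noise, sigma = 20
```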
A wavelet transform was applied and the coefficients were hard-thresholded. Performance varied with the threshold.
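A minimal sketch of this step using PyWavelets; the wavelet (`db4`), decomposition level, and threshold here are illustrative assumptions, not the notebook's settings.

```python
import pywt

def wavelet_hard_threshold(noisy, wavelet="db4", level=3, thresh=60.0):
    # Decompose, hard-threshold the detail coefficients, reconstruct.
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)
    denoised = [coeffs[0]]  # keep the approximation band untouched
    for details in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, thresh, mode="hard")
                              for d in details))
    return pywt.waverec2(denoised, wavelet)
```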
8 × 8 patches were extracted from the image, DCT-transformed, and hard-thresholded. Overlapping patches were averaged. Performance varied with the threshold.
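A minimal sketch of this step; the threshold value is an assumption.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_patch_denoise(noisy, patch=8, thresh=60.0):
    """Hard-threshold each overlapping patch in the DCT domain, then average."""
    h, w = noisy.shape
    acc = np.zeros_like(noisy)
    cnt = np.zeros_like(noisy)
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            c = dctn(noisy[i:i + patch, j:j + patch], norm="ortho")
            c[np.abs(c) < thresh] = 0.0  # hard threshold (the large DC coefficient survives)
            acc[i:i + patch, j:j + patch] += idctn(c, norm="ortho")
            cnt[i:i + patch, j:j + patch] += 1.0
    return acc / cnt  # average the overlapping reconstructions
```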
Hard thresholding can be regarded as a curve relating an input value to an output value (a shrinkage curve). By polynomial fitting, the optimal shrinkage curve was learned from pairs of patches with and without noise.
$$F_{\mathrm{local}} = \sum_{k=1}^{M} \left\| A\, S\!\left(A^{T} p_{k}\right) - p_{k}^{0} \right\|_{2}^{2}$$

Here $S$ is the elementwise shrinkage (thresholding) operator, $A^{T}$ is the DCT transform (and $A$ its inverse), $p_{k}$ is a noisy patch, $p_{k}^{0}$ is the corresponding noise-free patch, and $M$ is the total number of training pairs. The parameters of $S$ (the polynomial coefficients) that minimize $F_{\mathrm{local}}$ were computed by the least squares method.
A shrinkage curve is learned for each DCT coefficient. The patch size was $6 \times 6$; since the non-redundant DCT is used, the number of coefficients after the DCT is also $6 \times 6$, giving 36 shrinkage curves.
Patches were extracted from a $200 \times 200$ region of lena and used as training data, standardized by subtracting 127 and dividing by 128.
Each cell in the figure shows the learned shrinkage curve for the corresponding DCT coefficient.
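A minimal sketch of the fit; the polynomial basis and degree are chosen only for illustration, and the fit is done in the transform domain, which is equivalent to $F_{\mathrm{local}}$ for an orthonormal DCT.

```python
import numpy as np
from scipy.fft import dctn

def fit_shrinkage_curves(noisy_patches, clean_patches, degree=7):
    """Per DCT coefficient, fit polynomial coefficients c minimizing
    sum_k (poly_c(noisy_coef_k) - clean_coef_k)^2 by linear least squares."""
    Xn = np.stack([dctn(p, norm="ortho").ravel() for p in noisy_patches])  # (M, 36)
    Xc = np.stack([dctn(p, norm="ortho").ravel() for p in clean_patches])  # (M, 36)
    curves = []
    for j in range(Xn.shape[1]):
        V = np.vander(Xn[:, j], degree + 1)  # polynomial design matrix
        c, *_ = np.linalg.lstsq(V, Xc[:, j], rcond=None)
        curves.append(c)
    return np.array(curves)  # one polynomial (shrinkage curve) per DCT coefficient
```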
The shrinkage-curve parameters can also be trained globally, by minimizing

$$F_{\mathrm{global}} = \left\| x^{0} - \Bigl(\sum_{k} R_{k}^{T} R_{k}\Bigr)^{-1} \sum_{k} R_{k}^{T} A\, S\!\left(A^{T} R_{k} y\right) \right\|_{2}^{2},$$

where $R_{k}$ is the operator that extracts the $k$-th patch from the image, $y$ is the noisy image, and $x^{0}$ the noise-free one. The slope of the learned shrinkage curves is almost 0, but the result seems usable for the time being... Since the DC component becomes 0, it is rescaled in post-processing. (The implementation may be off...)
8 × 8 patches were extracted from the image, represented over a redundant DCT dictionary by sparse coding with OMP, and the overlapping reconstructions were averaged. The result was then combined with the noisy image by a weighted average.
The number of non-zero elements in the sparse representation obtained by OMP was capped at $k_0 = 4$, with an OMP tolerance on the squared residual of $\epsilon = 8^2 \times 20^2 \times 1.15$ (patch dimension $\times\ \sigma^2\ \times$ a gain factor). The weighted average used a weight of 0.5 for the noisy image and a weight of 1 for the averaged denoised patches.
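A from-scratch OMP sketch matching these stopping rules (at most $k_0$ atoms, stop once the squared residual falls below $\epsilon$):

```python
import numpy as np

def omp(D, x, k0=4, eps2=8**2 * 20**2 * 1.15):
    """Greedy orthogonal matching pursuit over the columns of D."""
    support, coef = [], np.zeros(0)
    residual = x.astype(np.float64).copy()
    while len(support) < k0 and residual @ residual > eps2:
        support.append(int(np.argmax(np.abs(D.T @ residual))))   # best-matching atom
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None) # re-fit on the support
        residual = x - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha
```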
Redundant DCT dictionary: each $8 \times 8$ patch is represented over $16 \times 16 = 256$ atoms, i.e. a $64 \times 256$ overcomplete dictionary.
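One standard construction is a separable overcomplete DCT, sketched below; the mean-removal and normalization details are assumptions.

```python
import numpy as np

def overcomplete_dct(patch=8, atoms=16):
    """Separable overcomplete DCT dictionary of shape (patch^2, atoms^2)."""
    D1 = np.cos(np.pi / atoms * np.outer(np.arange(patch), np.arange(atoms)))
    D1[:, 1:] -= D1[:, 1:].mean(axis=0)  # remove the mean of the non-DC atoms
    D1 /= np.linalg.norm(D1, axis=0)     # unit-norm columns
    return np.kron(D1, D1)               # 2-D atoms via Kronecker product: (64, 256)
```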
Patches were extracted from the noisy image and a dictionary was learned with K-SVD. Denoising then proceeded with the learned dictionary in the same way as above; one K-SVD iteration is sketched below the figure.
K-SVD dictionary
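A sketch of one K-SVD iteration, using the `omp` sketch above for the sparse-coding step and an SVD-based update for each atom:

```python
import numpy as np

def ksvd_step(D, X, k0=4, eps2=8**2 * 20**2 * 1.15):
    """One K-SVD iteration. X holds one vectorized patch per column."""
    A = np.stack([omp(D, X[:, i], k0, eps2) for i in range(X.shape[1])], axis=1)
    for j in range(D.shape[1]):
        users = np.nonzero(A[j])[0]        # patches that use atom j
        if users.size == 0:
            continue
        A[j, users] = 0.0
        E = X[:, users] - D @ A[:, users]  # residual with atom j removed
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, j] = U[:, 0]                  # rank-1 update: new atom
        A[j, users] = s[0] * Vt[0]         # and its coefficients
    return D, A
```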
NL-means: the famous method of Buades et al. From the perspective of dictionary learning, NL-means can be seen as an extreme form of dictionary learning, with a different dictionary for every pixel.
Consider a search window centered on the pixel of interest, and regard the set of patches centered on each pixel in the window as a dictionary. The coefficient of each atom is computed from its squared error against the patch centered on the pixel of interest (the patch of interest), which yields an approximation of the patch of interest.
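A per-pixel sketch under these definitions; the patch size, window size, and filtering parameter `h` are illustrative, and border handling is omitted.

```python
import numpy as np

def nl_means_pixel(noisy, i, j, patch=7, window=21, h=20.0):
    """Denoise one pixel: weighted average over the search window, with weights
    from patch similarity (the 'dictionary coefficients' described above).
    Assumes (i, j) lies far enough from the image border."""
    r, s = patch // 2, window // 2
    ref = noisy[i - r:i + r + 1, j - r:j + r + 1]
    num = den = 0.0
    for y in range(i - s, i + s + 1):
        for x in range(j - s, j + s + 1):
            cand = noisy[y - r:y + r + 1, x - r:x + r + 1]
            w = np.exp(-np.sum((ref - cand) ** 2) / (h * h * patch * patch))
            num += w * noisy[y, x]
            den += w
    return num / den
```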
Seen this way, dictionary learning and NL-means each suggest ways to improve the other.
BM3D: the famous method of Dabov et al., and the strongest against Gaussian noise. BM3D can also be viewed from a dictionary-learning perspective.
Block matching (BM) collects patches similar to the patch of interest within a search window and stacks them into a 3-D group. The group is transformed (wavelet, DCT, etc.), denoised by hard thresholding or Wiener shrinkage, and inverse-transformed (collaborative filtering).
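A heavily simplified sketch of the first (hard-thresholding) stage for a single reference patch; real BM3D uses a separable 2-D + 1-D transform, weighted aggregation, and a second Wiener stage, all of which are collapsed into one 3-D DCT here.

```python
import numpy as np
from scipy.fft import dctn, idctn

def bm3d_hard_stage(noisy, i, j, patch=8, window=16, n_similar=16, thresh=75.0):
    """Block-match around (i, j), stack the most similar patches into a 3-D
    group, hard-threshold in a 3-D DCT domain, and return the denoised group."""
    ref = noisy[i:i + patch, j:j + patch]
    cands = []
    for y in range(max(i - window, 0), min(i + window, noisy.shape[0] - patch)):
        for x in range(max(j - window, 0), min(j + window, noisy.shape[1] - patch)):
            blk = noisy[y:y + patch, x:x + patch]
            cands.append((np.sum((ref - blk) ** 2), blk))
    cands.sort(key=lambda t: t[0])                       # most similar first
    group = np.stack([b for _, b in cands[:n_similar]])  # (n_similar, patch, patch)
    c = dctn(group, norm="ortho")                        # 3-D transform of the group
    c[np.abs(c) < thresh] = 0.0                          # collaborative hard threshold
    return idctn(c, norm="ortho")                        # denoised 3-D group
```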
This perspective leads to structured dictionary learning and to combining clustering with dictionary learning.