[PYTHON] Low-rank approximation of images by HOSVD and HOOI

Introduction

Decomposing a tensor into a small core tensor and a set of factor matrices is called Tucker decomposition. Among the algorithms for computing it, HOSVD (higher-order singular value decomposition) and HOOI (higher-order orthogonal iteration) can be used to perform low-rank approximation of images.

The source code is available at https://github.com/kaityo256/hooi_sample.

Tucker decomposition of a tensor

For example, suppose you have a third-order tensor $X$ whose legs have dimensions $R_1, R_2, R_3$. Tucker decomposition expresses it as a third-order core tensor $\chi$ whose legs have dimensions $r_1, r_2, r_3$, together with matrices $A_i$ ($i = 1, 2, 3$) of $r_i$ rows and $R_i$ columns. Contracting $A_i$ with the $i$-th leg of the core tensor changes that leg's dimension from $r_i$ to $R_i$; that is, $A_i$ is the reconstruction matrix that takes the core tensor back toward the original tensor. Conversely, contracting $A_i^\dagger$ with the $i$-th leg of the original tensor changes the dimension from $R_i$ to $r_i$; that is, $A_i^\dagger$ is the projection matrix. Each is illustrated below.

Kobito.yHAggJ.png

First, the core tensor $\chi$ is obtained by multiplying each leg of the original tensor $X$ by the corresponding projection matrix.

Kobito.w5cchh.png

Multiplying each leg of the core tensor $\chi$ by the corresponding reconstruction matrix gives the tensor $X'$, which is an approximation of the original tensor.

Kobito.gOMlXz.png

The original tensor contains $R_1 R_2 R_3$ elements, while the core tensor contains $r_1 r_2 r_3$ elements and each matrix $A_i$ contains $R_i r_i$ elements, so the number of elements after compression is $r_1 r_2 r_3 + R_1 r_1 + R_2 r_2 + R_3 r_3$.
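For example, compressing a $(100, 100, 100)$ tensor to a $(20, 20, 20)$ core reduces the element count from $10^6$ to $20^3 + 3 \times 100 \times 20 = 14000$, or 1.4% of the original.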

Tucker decomposition is the problem of finding, given the original tensor $X$ and the dimensions of the core tensor, the set of projection matrices $A_i$ that approximates the original tensor as accurately as possible. Let's try this with two methods, HOSVD and HOOI, using low-rank approximation of an image as the example.

HOSVD

In HOSVD (higher-order singular value decomposition), the legs of the tensor are first grouped into two: the leg of interest and all the others. This grouping turns the tensor into a matrix, so SVD (singular value decomposition) can be applied. Taking the top rows of the matrix that appears on the right side of the SVD gives the reconstruction matrix for the leg of interest. HOSVD is the procedure that repeats this for all legs.

Kobito.luJJ1i.png

In the figure above, we focus on the first leg of the three-legged tensor $X$ and apply SVD, treating the second and third legs together as one index, so that $X$ becomes a matrix of size $R_2 R_3 \times R_1$. The SVD then yields a square matrix of size $R_2 R_3 \times R_2 R_3$ on the left and one of size $R_1 \times R_1$ on the right; taking the top $r_1$ rows of the right matrix gives the reconstruction matrix $A_1$ for the first leg. $A_2$ and $A_3$ are obtained in the same way.
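As a minimal sketch of this procedure in NumPy/SciPy (the tensor shape, the core dimensions, and the helper name `hosvd_factor` are made up for illustration):

    import numpy as np
    from scipy import linalg

    def hosvd_factor(X, mode, r):
        # Regard the leg of interest as the columns and flatten all
        # other legs into the rows, as in the figure above.
        axes = [i for i in range(X.ndim) if i != mode] + [mode]
        M = X.transpose(axes).reshape(-1, X.shape[mode])
        # The top r rows of the right factor Vt give the r x R matrix A_i.
        U, s, Vt = linalg.svd(M, full_matrices=False)
        return Vt[:r, :]

    # Hypothetical example: a (10, 12, 8) tensor with core size (4, 5, 3).
    X = np.random.rand(10, 12, 8)
    A1, A2, A3 = [hosvd_factor(X, i, r) for i, r in enumerate((4, 5, 3))]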

HOOI

Now, it is known that HOSVD does not give the best approximation. HOOI (higher-order orthogonal iteration) improves on it by iteration.

First, prepare random matrices as the initial values of $A_i$. Then, leaving out the leg of interest (say, leg 1), crush all the other legs of the original tensor with the projection matrices $A_i^\dagger$ ($i \neq 1$). Then, as in HOSVD, regard the result as a matrix of the leg of interest versus the remaining legs, apply SVD, and take the top $r_1$ rows of the right matrix as the new $A_1$.

Kobito.UJm2g3.png

Note that the computational cost of this SVD drops significantly, because every leg except the leg of interest has already had its dimension reduced by the projection matrices.

Executing this update for all legs constitutes one loop iteration, and repeating the loop improves the accuracy.
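One sweep of this loop for a general third-order tensor might look like the sketch below, reusing the hypothetical `hosvd_factor` helper from above; the image-specific version actually used in this article appears later.

    def hooi_sweep(X, A):
        # One HOOI iteration: update the factor matrix of each leg in turn.
        for mode in range(X.ndim):
            Y = X
            # Crush every leg except the one of interest with the
            # current factor matrices.
            for m in range(X.ndim):
                if m != mode:
                    Y = np.moveaxis(np.tensordot(Y, A[m], (m, 1)), -1, m)
            # SVD the now much smaller matrix, as in HOSVD.
            A[mode] = hosvd_factor(Y, mode, A[mode].shape[0])
        return A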

Low-rank approximation of the image

A two-dimensional image can be regarded as a third-order tensor of dimensions (height, width, 3), because a pixel value is determined by specifying the y-coordinate, the x-coordinate, and the color (RGB). Of these three legs, we obtain the matrices that crush the height and the width by HOSVD and HOOI, and low-rank approximate the image. For HOSVD, see also my past article.
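As an illustration, one way to obtain this tensor is with Pillow (an assumption; the article does not show its loading code):

    import numpy as np
    from PIL import Image

    # Load the image as a (height, width, 3) array of floats.
    X = np.asarray(Image.open("uma.jpg"), dtype=float)
    h, w, _ = X.shape
    # Crush the width and height to 20% each, as in the experiment below.
    r1, r2 = w // 5, h // 5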

For an image of width $w$ and height $h$, consider crushing the width dimension to $r_1$ and the height dimension to $r_2$. What we want to find are the projection matrices that crush the width and the height; call them $a_1^\dagger$ and $a_2^\dagger$. As for the color, there are only 3 dimensions, so we do not crush it.

First, we need initial values of $a_1$ and $a_2$. The initial values can be anything [^1], but the rows must be orthogonal. I couldn't think of a good way to produce such matrices, so I prepared random matrices of size $w \times w$ and $h \times h$, applied SVD to them, and took the top $r_1$ and $r_2$ rows as the initial values.

[^1]: If the purpose is to improve the HOSVD approximation, it is effective to use the HOSVD result as the initial value. However, if you want to keep the computational cost below that of HOSVD, you can prepare the initial values randomly. In either case, iterate several times to obtain sufficient accuracy.

    import numpy as np
    from scipy import linalg

    # Random initial matrices; the SVD makes their rows orthonormal.
    X1 = np.random.rand(w, w)
    X2 = np.random.rand(h, h)
    U, s, A1 = linalg.svd(X1)
    U, s, A2 = linalg.svd(X2)
    a1 = A1[:r1, :]
    a2 = A2[:r2, :]
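The rows of $a_1$ and $a_2$ obtained this way are orthonormal, because the rows of the right factor of an SVD are; a quick check:

    # The rows are orthonormal: a1 a1^T is the r1 x r1 identity (up to rounding).
    assert np.allclose(a1.dot(a1.T), np.eye(r1))
    assert np.allclose(a2.dot(a2.T), np.eye(r2))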

For convenience later, let's prepare a function that returns the tensor obtained by compressing and then restoring $X$ using $a_1$ and $a_2$.

    def restored_tensor(X, a1, a2):
        # Projectors that crush and immediately restore each leg.
        pa1 = a1.T.dot(a1)
        pa2 = a2.T.dot(a2)
        # Contract the width leg (axis 1), then the height leg (axis 0).
        X2 = np.tensordot(X, pa1, (1, 0))
        X3 = np.tensordot(X2, pa2, (0, 0))
        # Restore the original (height, width, color) axis order.
        return X3.transpose(2, 1, 0)

This function multiplies $X$ by $a_i^\dagger$ to create the core tensor, and then multiplies by $a_i$ to return the restored tensor.
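For example, `restored_tensor(X, a1, a2)` with the random initial $a_1, a_2$ gives the unoptimized approximation whose residual appears as the first line of the results below.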

After that, we repeat the following two steps:

  1. Crush the height leg (leg 2) with $a_2^\dagger$, SVD the result together with the color leg, and update the width-crushing matrix $a_1^\dagger$.
  2. Crush the width leg (leg 1) with $a_1^\dagger$, SVD the result together with the color leg, and update the height-crushing matrix $a_2^\dagger$.

The following code implements this.

    for i in range(10):
        # Crush the height leg with a2, then SVD the (3*r2, w) matrix
        # to update the width-crushing matrix a1.
        X1 = np.tensordot(X, a2.T, (0, 0)).transpose(2, 1, 0).reshape(r2*3, w)
        U, s, A1 = linalg.svd(X1)
        a1 = A1[:r1, :]
        # Crush the width leg with a1, then SVD the (3*r1, h) matrix
        # to update the height-crushing matrix a2.
        X2 = np.tensordot(X, a1.T, (1, 0)).transpose(2, 1, 0).reshape(r1*3, h)
        U, s, A2 = linalg.svd(X2)
        a2 = A2[:r2, :]

Result

For the original tensor $X$ and its approximate tensor $X'$, define the residual $r$ by

r = \frac{\|X - X'\|}{\|X\|}

where $\|X\|$ is the Frobenius norm of $X$.
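As a small helper (a sketch; `restored_tensor` is the function defined above), the residual can be computed as follows. Calling it once per iteration of the HOOI loop above yields the sequence shown below.

    def residual(X, a1, a2):
        # Relative Frobenius-norm error between X and its reconstruction.
        X_approx = restored_tensor(X, a1, a2)
        return np.linalg.norm(X - X_approx) / np.linalg.norm(X)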

Consider a transformation that takes an image as input and compresses each of the width and height dimensions to 20%. As the input image, I use a photo of the Itanium2 sacred horse that I happened to have at hand.

uma.jpg

The following image is an approximation of this with HOSVD.

uma_hosvd.jpg

The residual was 0.966415890942.

Running HOOI for 10 steps, the residual decreases as follows.

1.28857987024
0.97532217632
0.95714093422
0.952636586271
0.950770629606
0.949809071863
0.949257702502
0.948920613334
0.948705104294
0.948562468306

The first line is the value without any optimization. After two loop iterations, the residual already becomes smaller than that of HOSVD. The image after 10 iterations is as follows.

uma_hooi.jpg

The residual for this image is 0.948562468306, a little less than a 2% improvement over HOSVD, but the difference is hard to distinguish visually (at least to me).

Summary

I regarded an image as a third-order tensor and tried low-rank approximation with two methods, HOSVD and HOOI. For an image of size $(w, h)$, crushing the width to $r_1$ and the height to $r_2$, HOSVD needs to SVD two matrices of sizes $(3h, w)$ and $(3w, h)$ once each, whereas HOOI SVDs the much smaller matrices of sizes $(3r_2, w)$ and $(3r_1, h)$, but has to loop several times. However, in the cases tested here, the residual became smaller than that of HOSVD after only a few iterations, so when the dimensions of the original tensor are large, the total amount of computation should be smaller for HOOI.
