[PYTHON] Sparse modeling: Chapter 3 Pursuit algorithms
I implemented the pursuit algorithms from Chapter 3 of the sparse modeling textbook in Python.
The code and experimental results are summarized in a Jupyter notebook.
I implemented the following greedy algorithms and convex relaxation method. IRLS is a little suspicious ...
Greedy algorithms
- Orthogonal matching pursuit (OMP)
- Matching pursuit (MP)
- Weak matching pursuit (WMP)
- Threshold algorithm
Convex relaxation method
- Iteratively reweighted least squares (IRLS)
Overview of the greedy algorithms
Initialization
With $k = 0$:
- Initial solution $\mathbf{x}^{0} = \mathbf{0}$
- Initial residual $\mathbf{r}^{0} = \mathbf{b}$
- Initial support of the solution $S^{0} = \emptyset$
Main loop
Repeat the following steps, setting $k \leftarrow k + 1$:
1. Sweep and support update: find the column $\mathbf{a}_{j}$ of $\mathbf{A}$ that most reduces the residual $\mathbf{r}^{k-1}$ and add its index to $S^{k}$.
2. Provisional solution update: let $\mathbf{x}^{k}$ be the minimizer of $\|\mathbf{Ax}-\mathbf{b}\|^{2}_{2}$ subject to $\mathrm{Support}\{\mathbf{x}\}=S^{k}$.
3. Residual update: compute $\mathbf{r}^{k} = \mathbf{b}-\mathbf{Ax}^{k}$.
4. Stopping condition: if $\|\mathbf{r}^{k}\|_{2}<\epsilon_{0}$, stop; otherwise repeat.
In the sweep and support update, the inner product between every column $\mathbf{a}_{j}$ of $\mathbf{A}$ and the residual $\mathbf{r}$ is computed, and the column with the largest absolute inner product is added to the support. In other words, the columns most nearly parallel to $\mathbf{r}$ are added in order.
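As a concrete reference, here is a minimal NumPy sketch of this loop in the OMP style. The function name `omp` and the parameters `eps` and `max_iter` are illustrative choices, not necessarily those used in the notebook, and the columns of $\mathbf{A}$ are assumed to be normalized.

```python
import numpy as np

def omp(A, b, eps=1e-6, max_iter=None):
    """Orthogonal matching pursuit: a minimal sketch of the greedy loop above."""
    m, n = A.shape
    max_iter = max_iter or m
    x = np.zeros(n)    # initial solution x^0 = 0
    r = b.copy()       # initial residual r^0 = b
    support = []       # initial support S^0 = empty set
    for _ in range(max_iter):
        # Sweep: pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ r)))
        if j not in support:
            support.append(j)
        # Provisional solution: least squares restricted to the support.
        x_s, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        x = np.zeros(n)
        x[support] = x_s
        # Residual update and stopping condition.
        r = b - A @ x
        if np.linalg.norm(r) < eps:
            break
    return x, support
```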
Differences between the greedy algorithms
- OMP
- Exactly the procedure described above
- MP
- The provisional solution is updated using only the newly added column $\mathbf{a}_j$, instead of a full least-squares solve over the current support
- WMP
- A scalar hyperparameter $t \, (0 < t < 1)$ is added
- In the sweep and support update, the scan over columns stops as soon as a column $\mathbf{a}_j$ is found whose absolute inner product with $\mathbf{r}^{k-1}$ is at least $t\|\mathbf{r}^{k-1}\|_{2}$. The rest is the same as MP
- Threshold algorithm
- A hyperparameter $k$, the number of basis functions included in the support, is added
- The $k$ columns $\mathbf{a}_j$ with the largest absolute inner products with $\mathbf{b}$ form the support. A least-squares problem restricted to that support is then solved to obtain $\mathbf{x}$ (see the sketch after this list).
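For comparison, a minimal sketch of the threshold algorithm under the same assumptions (the name `threshold_algorithm` is illustrative; the notebook's version may differ):

```python
import numpy as np

def threshold_algorithm(A, b, k):
    """Keep the k columns most correlated with b, then solve a least-squares
    problem restricted to that support (illustrative sketch)."""
    scores = np.abs(A.T @ b)                 # |<a_j, b>| for every column
    support = np.argsort(scores)[::-1][:k]   # indices of the k largest scores
    x = np.zeros(A.shape[1])
    x_s, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
    x[support] = x_s
    return x, support
```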
IRLS overview
- Solve the $\left(P_{1}\right)$ problem instead of the $\left(P_{0}\right)$ problem.
- Approximate the $L_{1}$ norm with a weighted $L_{2}$ norm to solve the $\left(P_{1}\right)$ problem.
- Many iterations are needed to obtain a sparse solution.
Initialization
With $k = 0$:
- Initial approximate solution $\mathbf{x}^{0} = \mathbf{1}$ (all-ones vector)
- Initial weight matrix $\mathbf{X}^{0} = \mathbf{I}$
Main loop
Repeat the following steps, setting $k \leftarrow k + 1$:
1. Least squares: compute $\mathbf{x}_{k} = \mathbf{X}_{k-1}^{2}\mathbf{A}^{T}\left(\mathbf{A}\mathbf{X}_{k-1}^{2}\mathbf{A}^{T}\right)^{+}\mathbf{b}$, either directly or by solving the underlying linear system with an iterative method, to obtain the approximate solution $\mathbf{x}_{k}$.
2. Weight update: update the diagonal weight matrix $\mathbf{X}_{k}$ with $X_{k}(j,j)=|x_{k}(j)|^{0.5}$.
3. Stopping condition: if $\|\mathbf{x}_{k}-\mathbf{x}_{k-1}\|_{2}$ is less than a pre-specified threshold, stop; otherwise repeat.
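A minimal NumPy sketch of this IRLS loop, keeping the diagonal weight matrix as a vector. The names `irls`, `tol`, and `max_iter` are illustrative, and the pseudoinverse is used for simplicity instead of an iterative solver:

```python
import numpy as np

def irls(A, b, tol=1e-6, max_iter=1000):
    """IRLS sketch: approximate the L1 norm by a weighted L2 norm (illustrative)."""
    m, n = A.shape
    x = np.ones(n)         # initial approximate solution x_0 = 1 (all ones)
    X_diag = np.ones(n)    # initial weight matrix X_0 = I, stored as its diagonal
    for _ in range(max_iter):
        x_prev = x
        # Least squares step: x_k = X^2 A^T (A X^2 A^T)^+ b
        X2 = X_diag ** 2
        AX2 = A * X2       # same as A @ diag(X2)
        x = X2 * (A.T @ (np.linalg.pinv(AX2 @ A.T) @ b))
        # Weight update: X_k(j, j) = |x_k(j)|^0.5
        X_diag = np.sqrt(np.abs(x))
        # Stopping condition on the change of the iterate.
        if np.linalg.norm(x - x_prev) < tol:
            break
    return x
```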
Experiment
Simulation
- $\mathbf{x}$: 50-dimensional vector
- $\mathbf{b}$: 30-dimensional vector
- $\mathbf{A}$: $30 \times 50$ matrix with entries drawn uniformly from $[-1, 1)$ and each column normalized
- $\mathbf{x}$ has $k$ non-zero elements, whose values are drawn i.i.d. from a uniform distribution over $[-2, -1] \cup [1, 2]$.
- Run 200 simulations for each of $k = 1, \ldots, 10$ (a generation sketch follows this list).
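A sketch of how one such problem instance could be generated (the helper name `make_instance` is an assumption; the notebook may generate its data differently):

```python
import numpy as np

def make_instance(m=30, n=50, k=3, rng=None):
    """Generate A (column-normalized, entries uniform on [-1, 1)), a k-sparse x
    with values in [-2, -1] U [1, 2], and b = A x (illustrative sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    A = rng.uniform(-1.0, 1.0, size=(m, n))
    A /= np.linalg.norm(A, axis=0)                       # normalize each column
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)       # random support of size k
    signs = rng.choice([-1.0, 1.0], size=k)
    x[support] = signs * rng.uniform(1.0, 2.0, size=k)   # magnitudes in [1, 2]
    b = A @ x
    return A, x, b
```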
Metrics
- The $L_{2}$ norm of the difference between the estimated $\hat{\mathbf{x}}$ and the true $\mathbf{x}$
- The distance between the supports of the estimated $\hat{\mathbf{x}}$ and the true $\mathbf{x}$: $dist(\hat{S},S)=\frac{\max\{|\hat{S}|, |S|\} - |\hat{S}\cap S|}{\max\{|\hat{S}|,|S|\}}$ (computed as in the sketch after this list)
- Each metric is averaged over the 200 simulations.
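The support distance can be computed as in the following sketch (the function name `support_distance` and the zero tolerance `atol` are illustrative):

```python
import numpy as np

def support_distance(x_hat, x, atol=1e-8):
    """dist(S_hat, S) = (max(|S_hat|, |S|) - |S_hat ∩ S|) / max(|S_hat|, |S|)."""
    S_hat = set(np.flatnonzero(np.abs(x_hat) > atol))
    S = set(np.flatnonzero(np.abs(x) > atol))
    denom = max(len(S_hat), len(S))
    if denom == 0:
        return 0.0
    return (denom - len(S_hat & S)) / denom
```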
Results
Average $L_{2}$ error
- As $k$ increases, the error increases.
- IRLS has a small error, but that is to be expected, since the metric is the $L_{2}$ error.
- OMP also performs well, and it is fast.
Average distance between supports
- For small $k$, the support distance of IRLS is large. The textbook does not show this behavior, so the implementation may be incomplete or buggy.
- OMP also performs well.
Discussion
- OMP performs well.
- IRLS does not reproduce the textbook results, so the cause needs to be investigated.