Try image processing with Python when asked for entertainment at a wedding ceremony

motivation

Image processing is a technology that can be used in all sorts of places, even for wedding entertainment. Now that I'm in my late twenties, the people around me are getting married one after another, so it is on my mind (sweat).

couple_afin.jpg

[Wedding / Wedding Table Set](https://www.pakutaso.com/20130406108post-2650.html "Wedding / wedding table set")

And where there is a wedding, there is entertainment to prepare.

You want to raise the quality even when you have no time... Image processing is exactly the thing to meet that need.

The scope of this article is "mapping between images" (corresponding to Chapter 3 of Practical Computer Vision). If you are experienced in image processing, please point out any mistakes.

Processing required for mapping between images

When preparing photos for the entertainment, you need to composite them.

If you simply paste the photos together, the result looks unnatural.

The processing needed to avoid that is as follows.

・ Image deformation
・ Embedding one image neatly inside another
・ Alignment
・ Alignment for panoramas

Image processing technology required for each process

The elements of an image transformation are roughly divided as follows.

Screen Shot 2016-02-04 at 18.14.31.png

EECS 442 – Computer vision Conversation hour

・ Scaling
・ Scaling (asymmetric)
・ Rotation
・ The transformation in the rightmost panel of the figure, which is hard to put into words

An image is transformed by combining these operations.

The formula below is the important one here.


\begin{pmatrix}
x'  \\
y'  \\
w'
\end{pmatrix}

=

\begin{pmatrix}
h_{11} & h_{21} & h_{31}\\
h_{12} & h_{22} & h_{32}\\
h_{13} & h_{23} & h_{33}
\end{pmatrix}


\begin{pmatrix}
x  \\
y  \\
w
\end{pmatrix}
\begin{matrix}
\boldsymbol{x'} = \boldsymbol{H}\cdot{\boldsymbol{x}}
\end{matrix}

The equation above says that a point, given as a vector, is mapped into another space by multiplying it by a transformation matrix. That is hard to picture from words alone, so a figure follows.

Screen Shot 2016-02-04 at 18.45.46.png

Lecture 16: Planar Homographies

The figure shows how an object appears on the image plane of a pinhole camera, and that this view is converted into the image on another plane by multiplying by the matrix H.

The matrix H covers all the basic image transformations described at the beginning. Once H is found, you can express how a planar object looks when viewed obliquely through a pinhole camera, or conversely recover the frontal, flat view from an oblique one. This matrix H is called the homography matrix.
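As a minimal sketch of the mapping x' = Hx, the NumPy snippet below appends w = 1 to each point, multiplies by H, and divides by w'. The function name `apply_homography` and the values in `H` are made up for illustration. Note that NumPy indexes `H[row, column]`, while the formulas in this article write the subscripts as column then row, so the article's h_{21} is `H[0, 1]` here.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through a 3x3 matrix H."""
    points = np.asarray(points, dtype=float)
    # Append w = 1 to every point to get homogeneous coordinates (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    # Row i of the product is H @ (x_i, y_i, 1).
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    # Divide by w' to return to ordinary 2D coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical H: scale by 2, then shift by (10, 5).
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 1.0]])
result = apply_homography(H, [[0, 0], [1, 1], [3, 4]])
print(result)
```

Warping a whole image amounts to applying the same mapping to every pixel coordinate and resampling.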

To summarize so far:

Image transformation is required to create a composite photo. The transformation can be performed with a homography matrix. Using the homography matrix, you can turn a frontal view of a plane into an oblique view, or an oblique view back into a frontal one.

Homography matrix

Here we describe how to find the homography matrix.

We use the Direct Linear Transformation (DLT) algorithm.

First, let's look at the formula again.


\begin{pmatrix}
x'  \\
y'  \\
w'
\end{pmatrix}

=

\begin{pmatrix}
h_{11} & h_{21} & h_{31}\\
h_{12} & h_{22} & h_{32}\\
h_{13} & h_{23} & h_{33}
\end{pmatrix}


\begin{pmatrix}
x  \\
y  \\
w
\end{pmatrix}

Since this is an image, the points are two-dimensional, but they are written in what are called homogeneous coordinates, which carry a value w in addition to x and y.

The value of w is usually normalized to 1. The benefit of carrying it is that translation can also be represented as a matrix multiplication.

So the image transformation can express not only the four operations above but also translation, where the position shifts. This is an important point.

See below for an easy-to-understand illustration.

http://d.hatena.ne.jp/Zellij/20120523/p1
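To make the role of w concrete, here is a small sketch with made-up numbers (`tx`, `ty`, and the matrix `A` are illustrative assumptions). A 2x2 matrix always sends the origin to itself, so it cannot shift an image, while a 3x3 matrix acting on (x, y, 1) can.

```python
import numpy as np

tx, ty = 30.0, -20.0  # hypothetical shift in pixels

# Translation written as a 3x3 matrix acting on homogeneous coordinates.
T = np.array([[1.0, 0.0, tx],
              [0.0, 1.0, ty],
              [0.0, 0.0, 1.0]])

origin = np.array([0.0, 0.0, 1.0])  # (x, y, w) with w = 1
moved = T @ origin                  # the last column of T is added to the point

# A plain 2x2 matrix maps the origin to the origin, whatever its entries are.
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])
stuck = A @ np.array([0.0, 0.0])

print(moved, stuck)
```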

Next, let's look at the normalized form. What we want to find from this formula is the homography matrix that turns a frontal image into an oblique one, or an oblique image into a frontal one.


\begin{pmatrix}
x'  \\
y'  \\
1
\end{pmatrix}

=

\begin{pmatrix}
h_{11} & h_{21} & h_{31}\\
h_{12} & h_{22} & h_{32}\\
h_{13} & h_{23} & h_{33}
\end{pmatrix}


\begin{pmatrix}
x  \\
y  \\
1
\end{pmatrix}

Written out for each coordinate, the relationship between a point in the original image and its mapped point is:

  x' = \frac{h_{11}x + h_{21}y + h_{31}}{h_{13}x + h_{23}y + h_{33}}\\

  y' = \frac{h_{12}x + h_{22}y + h_{32}}{h_{13}x + h_{23}y + h_{33}}

To find the matrix H we would need to determine nine values, but here we add a constraint: fix h33 to 1. Why is it acceptable to impose this constraint? Multiply every element of H by a constant k:

  x' = \frac{kh_{11}x + kh_{21}y + kh_{31}}{kh_{13}x + kh_{23}y + kh_{33}}\\

  y' = \frac{kh_{12}x + kh_{22}y + kh_{32}}{kh_{13}x + kh_{23}y + kh_{33}}\\
\\

When h33 is set to 1

  x' = \frac{h_{11}x + h_{21}y + h_{31}}{h_{13}x + h_{23}y + 1}\\

  y' = \frac{h_{12}x + h_{22}y + h_{32}}{h_{13}x + h_{23}y + 1}

Multiplying every element by k changes nothing, because k appears in both the numerator and the denominator and cancels. We are therefore free to pick the k that makes h33 equal to 1 (this assumes h33 is not 0). Fixing h33 to 1 turns the problem of finding nine values into the problem of finding eight.
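The cancellation of k can be checked numerically. The sketch below uses an arbitrary, made-up matrix `H` whose bottom-right element is 4, divides it by that element, and confirms that both matrices map a point to the same place. The helper name `project` is hypothetical.

```python
import numpy as np

def project(H, x, y):
    """Apply H to (x, y, 1) and divide by the resulting w'."""
    xp, yp, wp = H @ np.array([x, y, 1.0])
    return xp / wp, yp / wp

# Arbitrary example matrix; its bottom-right element (the article's h33) is 4.
H = np.array([[1.2, 0.1, 5.0],
              [-0.2, 0.9, 3.0],
              [0.001, 0.002, 4.0]])

# Choosing k = 1 / h33 rescales the matrix so that h33 becomes 1.
H_normalized = H / H[2, 2]

p_raw = project(H, 50.0, 80.0)
p_norm = project(H_normalized, 50.0, 80.0)
print(p_raw, p_norm)
```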

From here on we apply the Direct Linear Transformation method. Multiplying both sides of each equation by its denominator gives:


  (h_{13}x + h_{23}y + 1)x' = h_{11}x + h_{21}y + h_{31}\\

  ({h_{13}x + h_{23}y + 1})y' = h_{12}x + h_{22}y + h_{32}

Solving for x' and y':


  x' = h_{11}x + h_{21}y + h_{31} - h_{13}xx' - h_{23}yx'  \\

  y' = h_{12}x + h_{22}y + h_{32} - h_{13}xy' - h_{23}yy' 

Each point correspondence therefore yields two equations. In other words, four correspondences give the eight simultaneous equations needed to determine the eight unknowns. Written out for four points, the system is:


\begin{pmatrix}
x'_{1}  \\
y'_{1}  \\
x'_{2}  \\
y'_{2}  \\
x'_{3}  \\
y'_{3}  \\
x'_{4}  \\
y'_{4}  
\end{pmatrix}

=

\begin{pmatrix}
x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}x'_{1} & -y_{1}x'_{1}\\
0 & 0 & 0 & x_{1} & y_{1} & 1 & -x_{1}y'_{1} & -y_{1}y'_{1}\\
x_{2} & y_{2} & 1 & 0 & 0 & 0 & -x_{2}x'_{2} & -y_{2}x'_{2}\\
0 & 0 & 0 & x_{2} & y_{2} & 1 & -x_{2}y'_{2} & -y_{2}y'_{2}\\
x_{3} & y_{3} & 1 & 0 & 0 & 0 & -x_{3}x'_{3} & -y_{3}x'_{3}\\
0 & 0 & 0 & x_{3} & y_{3} & 1 & -x_{3}y'_{3} & -y_{3}y'_{3}\\
x_{4} & y_{4} & 1 & 0 & 0 & 0 & -x_{4}x'_{4} & -y_{4}x'_{4}\\
0 & 0 & 0 & x_{4} & y_{4} & 1 & -x_{4}y'_{4} & -y_{4}y'_{4}\\
\end{pmatrix}


\begin{pmatrix}
h_{11}  \\
h_{21}  \\
h_{31}  \\
h_{12}  \\
h_{22}  \\
h_{32}  \\
h_{13}  \\
h_{23}
\end{pmatrix}

The simultaneous equations above can be solved directly. In practice, the result is generally more accurate when more than four correspondences are used.
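As a sketch of the four-point case, the code below builds the 8x8 system row by row and solves it with `np.linalg.solve`. The function name `homography_from_4_points` and the sample coordinates are assumptions for illustration. The unknowns are ordered to match the matrix columns, which in the article's subscripts is (h11, h21, h31, h12, h22, h32, h13, h23), and h33 is fixed to 1.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Solve the 8x8 linear system for H with the bottom-right element fixed to 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two equations per correspondence, one for x' and one for y'.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    # Append the fixed element and reshape row by row into a 3x3 matrix.
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical example: map the unit square onto a trapezoid.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0), (1.5, 1), (0.5, 1)]
H4 = homography_from_4_points(src, dst)
print(H4)
```

If three of the four points lie on one line the system becomes singular and `np.linalg.solve` raises an error, so the points should be chosen in general position.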

With more than four points the system is overdetermined, so it is solved by least squares using SVD. Here the equations are rearranged into the homogeneous form A h = 0 with all nine elements of H as unknowns. In the decomposition A = U S V^T, the least-squares solution is the last row of V^T, that is, the last column of V. (The point coordinates are first normalized to mean 0 and variance 1.)

If you would like to see why this is the case, please see below.

Singular value decomposition
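The SVD-based solution can be sketched as follows. This is an illustration under stated assumptions, not the code from the referenced book: the function names, the test points, and the exact normalization (zero mean, scaled by the larger of the two per-axis standard deviations) are choices made here. Each correspondence contributes two rows of a homogeneous system A h = 0 over all nine elements, and the solution is read from the last row of the V^T that NumPy returns.

```python
import numpy as np

def normalize_points(points):
    """Shift to zero mean and scale to roughly unit spread.

    Returns the (N, 3) homogeneous points and the 3x3 transform that was applied.
    """
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    scale = 1.0 / (points.std(axis=0).max() + 1e-9)
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    homogeneous = np.column_stack([points, np.ones(len(points))])
    return homogeneous @ T.T, T

def homography_dlt(src, dst):
    """Estimate H from four or more correspondences with DLT and SVD."""
    src_n, T_src = normalize_points(src)
    dst_n, T_dst = normalize_points(dst)
    rows = []
    for (x, y, _), (xp, yp, _) in zip(src_n, dst_n):
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp, -xp])
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp, -yp])
    # np.linalg.svd returns V^T; its last row belongs to the smallest
    # singular value and is the least-squares solution of A h = 0.
    _, _, Vt = np.linalg.svd(np.array(rows))
    H_normalized = Vt[-1].reshape(3, 3)
    # Undo the normalization, then rescale so the bottom-right element is 1.
    H = np.linalg.inv(T_dst) @ H_normalized @ T_src
    return H / H[2, 2]

# Made-up ground truth and six noise-free correspondences.
H_true = np.array([[2.0, 1.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 1.0, 1.0]])
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.25], [0.2, 0.8]], dtype=float)
projected = np.column_stack([src, np.ones(len(src))]) @ H_true.T
dst = projected[:, :2] / projected[:, 2:3]

H_est = homography_dlt(src, dst)
print(np.round(H_est, 6))
```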

To summarize here

・ The homography matrix maps a frontal view of a plane to an oblique view, and an oblique view back to a frontal one.
・ The matrix can be computed from 4 point correspondences.
・ Use SVD when computing from 4 points or more.

Affine transformation

From here we introduce the affine transformation. It simplifies the homography matrix, which limits the problems it can solve, but in exchange it needs only three point correspondences.


\begin{pmatrix}
x'  \\
y'  \\
1
\end{pmatrix}

=

\begin{pmatrix}
a_{1} & a_{2} & t_{1}\\
a_{3} & a_{4} & t_{2}\\
0 & 0 & 1
\end{pmatrix}


\begin{pmatrix}
x  \\
y  \\
1
\end{pmatrix}

The formula above fixes the bottom row to (0, 0, 1), so w always stays 1. In other words, the range of possible transformations is restricted.

You can map a rectangle to a parallelogram, enlarge, reduce, and translate, but you cannot map a rectangle to a trapezoid.

  x' = \frac{h_{11}x + h_{21}y + h_{31}}{h_{13}x + h_{23}y + 1}\\

  y' = \frac{h_{12}x + h_{22}y + h_{32}}{h_{13}x + h_{23}y + 1}

In the homography formula above, x' and y' share a denominator that depends on x and y, so the overall scale could change with the position of the point. In the affine case h13 and h23 are 0, the denominator becomes 1, and the formula reduces to the one below, so it can only express a linear transformation plus a translation. In exchange, only 3 points are needed. The advantage is that piecewise affine warping becomes possible.

  x' = h_{11}x + h_{21}y + h_{31}\\

  y' = h_{12}x + h_{22}y + h_{32}
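Because the affine formula has six unknowns and each point gives two equations, three correspondences determine it. A minimal sketch, with the function name `affine_from_3_points` and the triangle coordinates chosen only for illustration:

```python
import numpy as np

def affine_from_3_points(src, dst):
    """Return the 3x3 affine matrix that maps three src points onto three dst points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    M = np.column_stack([src, np.ones(3)])  # rows are (x, y, 1)
    # Solve M @ X = dst. X is 3x2, and its transpose gives the top two rows.
    X = np.linalg.solve(M, dst)
    return np.vstack([X.T, [0.0, 0.0, 1.0]])

# Hypothetical triangle and its target positions.
src = [(0, 0), (1, 0), (0, 1)]
dst = [(10, 20), (12, 20), (11, 23)]
A_est = affine_from_3_points(src, dst)

# The fourth corner of the unit square lands on the fourth corner of a parallelogram.
corner = A_est @ np.array([1.0, 1.0, 1.0])
print(A_est)
print(corner[:2])
```

The fourth corner ends up at dst[1] + dst[2] - dst[0], which is why a rectangle can become a parallelogram under an affine map but never a trapezoid.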

Embed an image in another image

Use piecewise affine warping to embed an image in another image.

When you want to match two images as shown below, you would normally take a set of corresponding points and estimate a homography matrix from them, but because of estimation error and the like it is hard to make the points of the images coincide exactly.

With affine transformations over a Delaunay triangulation, however, the vertices of the images can be matched.

Screen Shot 2016-02-04 at 21.39.02.png

Next, I will explain why affine transformations over a Delaunay triangulation make it possible to match the vertices.

The key point is that an affine transformation is determined by just 3 points.

Delaunay triangulation is a way of connecting a given set of points into triangles so that the minimum angle of the triangles is as large as possible. See below for how the triangles are chosen.

Delaunay Triangle Division

An example is shown below.

Screen Shot 2016-02-04 at 21.44.21.png

Programming Computer Vision with Python

In other words, each triangle gets its own affine transformation determined by its three vertices, so every vertex lands exactly where it should. The affine transformation is computed with the DLT method in the same way as before.
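The idea of one affine map per triangle can be sketched as below. The triangle list is written by hand here as a stand-in for the output of a Delaunay routine, and the control points and the helper name `affine_from_triangle` are assumptions. The target shape is a trapezoid, which no single affine map can produce, yet every control point is hit exactly.

```python
import numpy as np

def affine_from_triangle(src_tri, dst_tri):
    """3x3 affine matrix mapping the three src vertices onto the three dst vertices."""
    M = np.column_stack([src_tri, np.ones(3)])
    return np.vstack([np.linalg.solve(M, dst_tri).T, [0.0, 0.0, 1.0]])

# Four control points: a unit square that should become a trapezoid.
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dst = np.array([[0, 0], [2, 0], [1.5, 1], [0.5, 1]], dtype=float)

# Hand-written triangulation (vertex indices); a Delaunay routine would supply this list.
triangles = [(0, 1, 2), (0, 2, 3)]

warps = [affine_from_triangle(src[list(t)], dst[list(t)]) for t in triangles]

# Check that every vertex of every triangle lands on its target.
max_vertex_error = 0.0
for tri, A in zip(triangles, warps):
    for i in tri:
        mapped = A @ np.array([src[i, 0], src[i, 1], 1.0])
        max_vertex_error = max(max_vertex_error, float(np.abs(mapped[:2] - dst[i]).max()))

# Points on the shared edge are mapped identically by both triangles, so no gap appears.
midpoint = np.array([0.5, 0.5, 1.0])
print(max_vertex_error, warps[0] @ midpoint, warps[1] @ midpoint)
```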

Summary

・ The affine transformation can express fewer transformations than the homography.
・ In exchange, it can be determined from only 3 points.
・ Because 3 points are enough, Delaunay triangulation can be used, so a warp whose vertices match is possible.

Image alignment

For image alignment, a similarity transformation is used, because the images being aligned are similar to one another.


\begin{pmatrix}
x'  \\
y'  \\
1
\end{pmatrix}

=

\begin{pmatrix}
s\cos(\theta) & -s\sin(\theta) & t_{x}\\
s\sin(\theta) & s\cos(\theta) & t_{y}\\
0 & 0 & 1
\end{pmatrix}


\begin{pmatrix}
x  \\
y  \\
1
\end{pmatrix}

s is the scale factor and θ is the rotation angle. Since it scales and rotates all coordinates uniformly, it is well suited to alignment. Compare the aligned average of the images below with the simple average. You can see that aligning first reproduces the image more sharply.

Screen Shot 2016-02-04 at 22.15.25.png

Programming Computer Vision with Python

Screen Shot 2016-02-04 at 22.13.58.png

Programming Computer Vision with Python
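For reference, the similarity matrix from the formula above can be built and applied as in this sketch. The function name `similarity_matrix` and the parameter values are made up. The last two lines show that s and θ can be read back from the first column of the matrix.

```python
import math
import numpy as np

def similarity_matrix(s, theta, tx, ty):
    """Build the 3x3 similarity transform: scale s, rotation theta (radians), shift (tx, ty)."""
    c, n = s * math.cos(theta), s * math.sin(theta)
    return np.array([[c, -n, tx],
                     [n, c, ty],
                     [0.0, 0.0, 1.0]])

# Hypothetical parameters: double the size, rotate 90 degrees, shift 1 pixel in x.
S = similarity_matrix(2.0, math.pi / 2, 1.0, 0.0)
p = S @ np.array([1.0, 0.0, 1.0])

# Recover the scale and the angle from the first column of the matrix.
scale = math.hypot(S[0, 0], S[1, 0])
angle = math.atan2(S[1, 0], S[0, 0])
print(p, scale, angle)
```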

RANSAC

So far only the DLT method has been introduced, but because DLT is not robust to noise, another method, RANSAC, has been proposed. It is robust against noise at the cost of being more time-consuming.

Please refer to the following material for the specifics.

Preemptive RANSAC by David Nister.
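The basic RANSAC loop (not the preemptive variant from the slides) can be sketched as follows: repeatedly fit a homography to four randomly chosen correspondences, count how many of all correspondences agree with it, and keep the model with the most inliers. The function names, the threshold, the iteration count, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_homography(src, dst):
    """Four-point homography with the bottom-right element fixed to 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def transfer(H, pts):
    """Map (N, 2) points through H."""
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, iterations=200, threshold=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best_H, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        sample = rng.choice(len(src), size=4, replace=False)
        try:
            H = fit_homography(src[sample], dst[sample])
        except np.linalg.LinAlgError:
            continue  # degenerate sample, e.g. collinear points
        with np.errstate(divide="ignore", invalid="ignore"):
            error = np.linalg.norm(transfer(H, src) - dst, axis=1)
        inliers = error < threshold
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    return best_H, best_inliers

# Synthetic data: 30 correspondences, the first 8 corrupted into gross outliers.
rng = np.random.default_rng(1)
H_true = np.array([[1.0, 0.1, 5.0],
                   [-0.1, 1.0, 10.0],
                   [0.0005, 0.0002, 1.0]])
src = rng.uniform(0, 100, size=(30, 2))
dst = transfer(H_true, src)
dst[:8] += rng.uniform(20, 50, size=(8, 2))

best_H, inlier_mask = ransac_homography(src, dst)
print(int(inlier_mask.sum()), "inliers out of", len(src))
```

In practice the final matrix is usually re-estimated from all inliers, for example with the SVD-based DLT, rather than kept from a single four-point sample.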

Image stitching

Suppose you have several images taken from the same place and want to join them together as shown in the second figure.

Screen Shot 2016-02-04 at 22.36.12.png

Programming Computer Vision with Python

Screen Shot 2016-02-04 at 22.34.33.png

Programming Computer Vision with Python

1: Extract feature points from the images with SIFT (for details, see: Gold needle for when it becomes a stone by looking at the formula of image processing)
2: Compute the homography matrix between images using RANSAC
3: Select the center image
4: Pad the center image with zeros on the left and right so the warped images can be joined to it
5: Decide whether to attach each image on the left or the right from the translation component
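Steps 4 and 5 can be sketched roughly as below. This is a simplified illustration, not the procedure from the book: it assumes the homography maps the neighboring image into the center image's coordinates, reads the horizontal translation from `H[0, 2]`, and treats a negative value as "the neighbor lies to the left". The function name `pad_for_neighbor` and all values are hypothetical.

```python
import numpy as np

def pad_for_neighbor(center, H_other_to_center, padding):
    """Pad the center image with zeros on the side where the neighbor will be placed."""
    # Zero block with the same height, channels, and dtype as the center image.
    zeros = np.zeros((center.shape[0], padding) + center.shape[2:], dtype=center.dtype)
    tx = H_other_to_center[0, 2]  # horizontal translation component
    if tx < 0:
        # The neighbor lands at negative x, so make room on the left.
        return np.hstack([zeros, center]), "left"
    return np.hstack([center, zeros]), "right"

# Tiny stand-in for a real photo: 4 rows, 6 columns, 3 color channels.
center = np.ones((4, 6, 3), dtype=np.uint8)
H_left = np.array([[1.0, 0.0, -50.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
canvas, side = pad_for_neighbor(center, H_left, padding=10)
print(canvas.shape, side)
```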

Because the seams between images are noticeable, commercial software also normalizes and smooths the brightness.

ipython notebook

Only the affine transformation is covered, but it is implemented in code.

https://github.com/SnowMasaya/Image_Processing_for_Wedding

reference

Practical computer vision Draft of this article

Pennsylvania State University Computer Science: Easy to understand because the lecture also covers image transformation from 3D and is not limited to 2D as this article is.

University of Michigan

University of Michigan Summary

The material starts from the basics of image transformation, so it is good for getting the fundamentals down.

Preemptive RANSAC by David Nister.

An excellent slide to quickly understand the RANSAC algorithm

Re-learning after becoming an adult: Affine transformation

Easy-to-understand explanation of affine transformation
