[Python] Transforming an image with a projective transformation -Hacking the monitor screen-


My previous articles:

- Rotate sprites with OpenCV
- Rotate sprites with OpenCV #2 ~Mastering cv2.warpAffine()~

In those articles I used the affine transformation only for rotation, so this time I will show another way to use it and then play with the projective transformation.

Transforming a parallelogram with an affine transformation

Affine matrix

```math
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
```
This maps the origin (0,0) to (c,f), and the unit vectors (1,0) and (0,1) to (a,d) and (b,e) respectively, transforming the xy plane into the x'y' plane. In other words, any parallelogram spanned by two vectors is mapped to another parallelogram, and more fundamentally, any triangle is mapped to another triangle.

Create an affine matrix cv2.getAffineTransform(src, dst)

- src: the vertices of the triangle. Must be a float32 numpy array.
- dst: the three points after the transformation. Like src, it must be a float32 numpy array.

Consider mapping the three points that define an image of height 300 and width 200, pts1 = [(0,0), (200,0), (0,300)], to another three points, pts2 = [(50,50), (250,100), (200,300)]. Three (x, y) → (x', y') correspondences determine all of a, b, c, d, e, f, so the affine matrix is fixed. I didn't calculate it by hand, but according to cv2.getAffineTransform() the affine matrix is

[[ 1.          0.5        50.        ]
 [ 0.25        0.83333333 50.        ]]

The source is omitted. The deformation of the image by this matrix is as follows. affine_1.png You can see that (0,0) goes to (50,50), (200,0) to (250,100), and (0,300) to (200,300). The fourth corner (200,300) goes to (400,350), following the parallelogram rule. Coordinates are measured from (0,0), so please don't object that the corner should be (199,299) rather than (200,300). By the way, the original image is "Macho challenging the final battle to save the world 2 (portrait photo)" (Muscle Plus).
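As a check, the six parameters can also be recovered without OpenCV: the three point pairs give two 3x3 linear systems, one per output coordinate. This is a minimal numpy-only sketch using the point values from the example above:

```python
import numpy as np

# Each of the three point pairs gives two equations:
#   x' = a*x + b*y + c   and   y' = d*x + e*y + f
pts1 = [(0, 0), (200, 0), (0, 300)]
pts2 = [(50, 50), (250, 100), (200, 300)]

A = np.array([[x, y, 1] for x, y in pts1], dtype=float)
dst = np.array(pts2, dtype=float)

row1 = np.linalg.solve(A, dst[:, 0])  # a, b, c
row2 = np.linalg.solve(A, dst[:, 1])  # d, e, f
M = np.vstack([row1, row2])
print(M)
# [[ 1.          0.5        50.        ]
#  [ 0.25        0.83333333 50.        ]]
```

The result agrees with the cv2.getAffineTransform() output shown above, and applying M to the fourth corner (200,300) indeed gives (400,350).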


Projective transformation

A transformation that maps an arbitrary quadrilateral to another quadrilateral, with more degrees of freedom than the affine transformation, is called a projective transformation. The projection matrix is

```math
\begin{pmatrix} wx' \\ wy' \\ w \end{pmatrix} =
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
```

The destination point is obtained by dividing by the third homogeneous component: (x', y') = ((ax+by+c)/w, (dx+ey+f)/w) with w = gx + hy + 1. The two extra parameters g and h are what give the projective transformation its extra freedom over the affine one.

Create a projection matrix cv2.getPerspectiveTransform(src, dst)

- src: the vertices of the quadrilateral. Must be a float32 numpy array.
- dst: the four points after the transformation. Like src, it must be a float32 numpy array.

Usage is the same as the affine version, cv2.getAffineTransform().

Projective transformation of an image cv2.warpPerspective(src, M, dsize)

This, too, works the same way as the affine version, cv2.warpAffine().


Consider mapping the four corners of an image of height 300 and width 200, pts1 = [(0,0), (0,300), (200,300), (200,0)], to another four points, pts2 = [(50,100), (100,400), (300,300), (200,50)]. I have no desire to solve a system of eight simultaneous equations by hand, but according to cv2.getPerspectiveTransform() the projection matrix is

[[ 8.33333333e-01  6.94444444e-02  5.00000000e+01]
 [-2.29166667e-01  6.11111111e-01  1.00000000e+02]
 [ 4.16666667e-04 -9.72222222e-04  1.00000000e+00]]

Looking at the change in the image, it comes out like this; you can see how each vertex is mapped. perspective1.png
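To see the homogeneous division at work, you can apply the matrix by hand. This is a small numpy sketch, with the matrix values copied from the cv2.getPerspectiveTransform() output above:

```python
import numpy as np

# The projection matrix computed above.
M = np.array([[ 8.33333333e-01,  6.94444444e-02,  5.00000000e+01],
              [-2.29166667e-01,  6.11111111e-01,  1.00000000e+02],
              [ 4.16666667e-04, -9.72222222e-04,  1.00000000e+00]])

# Apply it to a corner in homogeneous coordinates and divide by w.
xw, yw, w = M @ [200, 300, 1]
print(xw / w, yw / w)  # approximately 300.0 300.0
```

The corner (200,300) lands on (300,300), matching the third point of pts2, and the other corners check out the same way.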


The source is as follows. It uses somewhat overkill techniques, such as a lambda to find the maximum of each column of a 2D list, and sharing axes between matplotlib.pyplot subplots to line up the graphs.


```python
import cv2
import matplotlib.pyplot as plt
import numpy as np

img1 = cv2.imread("mustle.jpg")
h1, w1 = img1.shape[:2]

# Destination quadrilateral; the output canvas is sized to its extremes.
pts2 = [(50,100), (100,400), (300,300), (200,50)]
w2 = max(pts2, key=lambda p: p[0])[0]
h2 = max(pts2, key=lambda p: p[1])[1]

pts1 = np.float32([(0,0), (0,h1), (w1,h1), (w1,0)])
pts2 = np.float32(pts2)

M = cv2.getPerspectiveTransform(pts1, pts2)
img2 = cv2.warpPerspective(img1, M, (w2,h2), borderValue=(255,255,255))
print(M)

# Show the original and warped images side by side with shared axes.
fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, sharey=True)
ax1.imshow(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB))
ax2.imshow(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB))
plt.show()
```

Application example

Not only can a rectangle be turned into an irregular quadrilateral, but an element (a painting, etc.) that became an irregular quadrilateral because it was photographed at an angle can be extracted as if it had been shot head-on. Of course, you need to know the size (aspect ratio) of the rectangle you want to extract.

Let's play with the photo material "Shuto vs. Tokido at the final stage - CAPCOM Pro Tour 2019 Asia Premier" (Pakutaso).

Original image sf5.jpg
Extracted image

Take this monitor area and replace it with another image. The replacement is Macho (Muscle Plus), cropped to the same 16:9 aspect ratio as the game screen.

Image prepared in advance macho.jpg

This is transformed into the irregular quadrilateral specified earlier.

Intermediate image

Compositing it onto the original image produces a scene from a hyper-realistic fighting game.

Image of hacked screen

The intermediate image came out well, but the composite shows red noise here and there. The result was the same when compositing onto a white background or with a different transparent color, so I don't think the compositing itself is wrong, but I'm not sure of the cause.

Animated GIF of a series of operations


In the previous article "Getting mouse events with OpenCV - Making a GUI concentration-line tool", the main routine and the callback function exchanged data through global variables; this time I made it a little more elegant by using the param argument of cv2.setMouseCallback(). As you can see in the animated GIF above, not only the cursor position but also the sides of the quadrilateral are drawn (a closed quadrilateral once the last point is placed), and a right-click "takes back one move". I'm happy with how nice the code turned out.


```python
import numpy as np
import cv2
import random

def draw_quadrangle(event, x, y, flags, param):
    img = param["img"]
    pts = param["pts"]
    pic = param["pic"]
    color = (random.randint(0,255), random.randint(0,255), random.randint(0,255))
    img_tmp = img.copy()

    # While fewer than four points are placed, draw crosshairs at the cursor.
    if event == cv2.EVENT_MOUSEMOVE and len(pts) <= 3:
        h, w = img.shape[:2]
        cv2.line(img_tmp, (x,0), (x,h-1), color)
        cv2.line(img_tmp, (0,y), (w-1,y), color)
        cv2.imshow("image", img_tmp)
    if event == cv2.EVENT_LBUTTONDOWN:
        pts.append((x, y))
        if len(pts) == 4:
            h, w = img.shape[:2]
            ph, pw = pic.shape[:2]
            pts1 = np.float32(pts)
            pts2 = np.float32([[0,0],[0,ph],[pw,ph],[pw,0]])

            # Extract the selected quadrilateral as a rectangle.
            M1 = cv2.getPerspectiveTransform(pts1, pts2)
            tmp = cv2.warpPerspective(img, M1, (pw,ph))
            #cv2.imwrite("cut_image.jpg", tmp)

            # Warp the replacement picture into the quadrilateral and composite.
            M2 = cv2.getPerspectiveTransform(pts2, pts1)
            transparence = (128,128,128)
            front = cv2.warpPerspective(pic, M2, (w,h), borderValue=transparence)
            img = np.where(front==transparence, img, front)
            #cv2.imwrite("front.jpg", front)
            cv2.imshow("image", img)
            #cv2.imwrite("image.jpg", img)

    # Right-click: take back the last point.
    if event == cv2.EVENT_RBUTTONDOWN and len(pts) > 0:
        del pts[-1]
    if 0 < len(pts) <= 3:
        for pos in pts:
            cv2.circle(img_tmp, pos, 5, (0,0,255), -1)

        cv2.line(img_tmp, pts[-1], (x,y), color, 1)
        if len(pts) == 3:
            cv2.line(img_tmp, pts[0], (x,y), color, 1)

        isClosed = len(pts) == 4
        cv2.polylines(img_tmp, [np.array(pts)], isClosed, (0,0,255), 1)
        cv2.imshow("image", img_tmp)

def main():
    img_origin = cv2.imread("sf5.jpg")
    pic = cv2.imread("macho.jpg")
    pts = []
    cv2.imshow("image", img_origin)
    cv2.setMouseCallback("image", draw_quadrangle,
                         param={"img": img_origin, "pts": pts, "pic": pic})
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()
```


Fujiko F Fujio Museum

It is well known that the window frames of the Fujiko F. Fujio Museum are designed after the manuscript of Doraemon episode 1, "From the Land of the Future", so I checked how faithfully it is reproduced.

Exterior of the museum (from Google Street View)
Synthesis result

Well, it's not bad, but it's a shame that the first page is not reproduced very faithfully.


An international conference thrown into an uproar by an evil organization's declaration of war

aybabtu.jpg All your base are belong to us! The original image is here.

Play Darius on the screen of an outdoor movie

darius.jpg The original image is here. I wanted to do "Play Road Runner on the Sony Jumbotron at the Tsukuba Science Expo", but stopped because there was a similar story on Twitter.

Play Darius on the Yamanote Line triple signage

Original image ("Traffic advertising navigation"): train_ad.jpg
Synthesis result: train_darius.png

It looks hard to play. As a more practical example, I thought of restoring a very long station poster, which could only be photographed at an angle, back to a rectangle, but I couldn't find a suitable image.

In closing

If you combine this with rectangle detection, you won't even need to select the quadrilateral by hand. You would still need to know the original size (aspect ratio), though.

Reference article

- OpenCV-Python Exercises
By passing a dictionary as the callback's param, changes made to the dictionary inside the callback are reflected on the caller's side as well; it is used like a global variable.
I found this material, intended for students of the Nishiguchi Laboratory at Osaka Institute of Technology, but I wonder whether outsiders are supposed to see it (trying to reach the exercises from the top page asks for a username and password).
