[Python] Automatically and seamlessly combine cropped images

What I want to do

Combine two cropped images **automatically** so that the seam looks natural. [Figure: image combination, overview]


Starting from the bottom edge of the upper image and the top edge of the lower image, the compared area is narrowed step by step. The area with the highest matching score is treated as the region the two images share, and the images are joined there. [Figure: image combination, algorithm overview]


- Images are read in Pillow format
- Images are joined vertically
- Horizontal positions are aligned in advance
- Both images have the same width

Flow (what I studied)

1. Image cropping
2. Compare images
3. Concatenate images
4. To reduce the amount of calculation

1. Image cropping

This time, I used `Image.crop()` to crop the Pillow image.

Image.crop((left x coordinate, top y coordinate, right x coordinate, bottom y coordinate))

Note that the box must be passed as a single tuple, so the **parentheses are doubled**. To keep the amount of calculation down, the comparison starts from the height of the shorter of the two images, and the compared area shrinks by 1 px per step. (In the example code, if the upper image is taller, the height difference is stored in the variable dif_h and used to set the top coordinate of the crop.)
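The box arithmetic can be checked without loading any image. The sketch below uses a hypothetical helper, `overlap_boxes`, which is not part of the original code; it returns the two crop boxes for step `i` and shows that both strips always have the same height.

```python
def overlap_boxes(width, upper_h, under_h, i):
    """Return the (left, top, right, bottom) crop boxes compared at step i.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    compare_height = min(upper_h, under_h)
    # If the upper image is taller, skip the rows that can never overlap.
    dif_h = max(upper_h - under_h, 0)
    upper_box = (0, i + dif_h, width, upper_h)     # bottom strip of the upper image
    under_box = (0, 0, width, compare_height - i)  # top strip of the lower image
    return upper_box, under_box


def box_height(box):
    """Height in pixels of a (left, top, right, bottom) box."""
    return box[3] - box[1]


upper_box, under_box = overlap_boxes(width=640, upper_h=500, under_h=300, i=10)
print(upper_box)  # (0, 210, 640, 500)
print(under_box)  # (0, 0, 640, 290)
print(box_height(upper_box) == box_height(under_box))  # True
```

Each tuple is then passed as one argument, for example `upper_img.crop(upper_box)`.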

2. Compare images

The cropped images are converted to OpenCV format and compared with cv2.matchTemplate().

cv2.matchTemplate(image to search, template image, matching mode)

There are several matching modes, and I think most of them would work for this application (with some exceptions), so I used cv2.TM_CCOEFF_NORMED, which seems to be a standard choice. The output of matchTemplate is a single-channel array of matching scores that can be viewed as a grayscale image. Because the two crops have the same size here, that array holds a single score, and its maximum value, obtained with cv2.minMaxLoc(), is used as the matching score for the pair of crops.

cv2.minMaxLoc(result array)[1]

cv2.minMaxLoc() returns (minimum value, maximum value, minimum location, maximum location), so the maximum value is taken out with [1]. (Template matching is explained in detail on this page.)

The above is repeated in a for loop while shrinking the compared area, to find the position where the matching score is highest.
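The loop can be tried on plain arrays before involving real images. The following is a sketch under assumptions: `similarity` and `find_overlap` are hypothetical names, and `similarity` is a Pearson correlation used as a stand-in for `cv2.matchTemplate` so that the example needs only NumPy. I believe TM_CCOEFF_NORMED gives the same value for equal-size inputs, but treat that as an assumption rather than a guarantee.

```python
import numpy as np


def similarity(a, b):
    """Pearson correlation of two equal-shape arrays (stand-in score)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A flat (constant) strip has no variance; score it as 0.
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def find_overlap(upper, under):
    """Return (best_score, overlap_rows) for two grayscale arrays of equal width."""
    compare_height = min(upper.shape[0], under.shape[0])
    best_score, best_overlap = 0.0, 0
    for i in range(1, compare_height):
        overlap = compare_height - i
        # Bottom rows of the upper image against top rows of the lower image.
        score = similarity(upper[-overlap:], under[:overlap])
        if score > best_score:
            best_score, best_overlap = score, overlap
    return best_score, best_overlap


# Synthetic data: cut one random image into two pieces that share 20 rows.
rng = np.random.default_rng(0)
full = rng.integers(0, 256, size=(100, 32), dtype=np.uint8)
upper, under = full[:60], full[40:]
score, overlap = find_overlap(upper, under)
print(overlap)  # 20
```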

3. Concatenate images

I quoted from this page. https://note.nkmk.me/python-pillow-concat-images/

def get_concat_v(im1, im2):
    dst = Image.new('RGB', (im1.width, im1.height + im2.height))
    dst.paste(im1, (0, 0))
    dst.paste(im2, (0, im1.height))
    return dst

With Pillow images, create a blank canvas with `Image.new()` and paste each image to be joined onto it with `Image.paste()`.
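To see the function at work, here is a small sketch with solid-color test images. The sizes and colors are arbitrary, and `concat_without_overlap` is a hypothetical helper added for illustration; it trims the shared rows from the upper image before joining, as the completed code does inline.

```python
from PIL import Image


def get_concat_v(im1, im2):
    dst = Image.new('RGB', (im1.width, im1.height + im2.height))
    dst.paste(im1, (0, 0))
    dst.paste(im2, (0, im1.height))
    return dst


def concat_without_overlap(upper, under, over_pixel):
    """Hypothetical helper: drop the overlapping rows from the upper image, then join."""
    trimmed = upper.crop((0, 0, upper.width, upper.height - over_pixel))
    return get_concat_v(trimmed, under)


# Two solid-color images of equal width.
red = Image.new('RGB', (8, 3), (255, 0, 0))
blue = Image.new('RGB', (8, 5), (0, 0, 255))

joined = get_concat_v(red, blue)
print(joined.size)              # (8, 8)
print(joined.getpixel((0, 2)))  # (255, 0, 0), last row of the red image
print(joined.getpixel((0, 3)))  # (0, 0, 255), first row of the blue image

stitched = concat_without_overlap(red, blue, over_pixel=1)
print(stitched.size)            # (8, 7)
```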

4. To reduce the amount of calculation

As you can see, **the larger the images, the longer this takes**. This time the images are converted to grayscale, and the comparison starts from the height of the shorter of the two images. Although not adopted this time, I think the amount of calculation could be reduced further by downscaling the images before comparing, limiting the area that is compared, or stopping the comparison once the score exceeds a certain value.
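Two of those ideas, limiting the search range and stopping early, can be sketched as follows. This is not the code used in this article: `find_overlap_fast`, `max_overlap` and `good_enough` are hypothetical names, and a NumPy Pearson correlation stands in for `cv2.matchTemplate` so the example is self-contained.

```python
import numpy as np


def similarity(a, b):
    """Pearson correlation of two equal-shape arrays (stand-in score)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def find_overlap_fast(upper, under, max_overlap=None, good_enough=0.98):
    """Search from the largest overlap down and stop at the first good score.

    max_overlap and good_enough are assumed tuning parameters.
    Returns (best_score, overlap_rows, number_of_comparisons).
    """
    limit = min(upper.shape[0], under.shape[0]) - 1
    if max_overlap is not None:
        limit = min(limit, max_overlap)  # restrict the area to be compared
    best_score, best_overlap, tried = 0.0, 0, 0
    for overlap in range(limit, 0, -1):
        tried += 1
        score = similarity(upper[-overlap:], under[:overlap])
        if score > best_score:
            best_score, best_overlap = score, overlap
        if score >= good_enough:
            break  # good enough: skip the remaining, smaller overlaps
    return best_score, best_overlap, tried


# Synthetic data: one random image cut into two pieces that share 20 rows.
rng = np.random.default_rng(0)
full = rng.integers(0, 256, size=(100, 32), dtype=np.uint8)
upper, under = full[:60], full[40:]

print(find_overlap_fast(upper, under)[1:])                  # (20, 40)
print(find_overlap_fast(upper, under, max_overlap=30)[1:])  # (20, 11)
```

Stopping early trades accuracy for speed: on images with repeated content, the first score above the threshold is not necessarily the best one, so the threshold needs to be chosen with care.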

Completed code

import numpy as np
from PIL import Image
import cv2

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
def auto_img_concat_v(upper_img, under_img):

    max_res_num = 0  # Maximum matching score found so far
    over_pixel = 0  # Number of overlapping pixels at the maximum matching score

    img_w = upper_img.width
    upper_h = upper_img.height
    under_h = under_img.height

    compare_height = min(upper_h, under_h)  # Number of pixels to compare (height)

    # If the upper image is taller, keep the height difference (otherwise 0)
    dif_h = (upper_h - under_h) if upper_h > under_h else 0

    for i in range(1, compare_height):

        search_img = upper_img.crop((0, i + dif_h, img_w, upper_h))  # Crop the bottom part of the upper image
        target_img = under_img.crop((0, 0, img_w, compare_height - i))  # Crop the top part of the lower image

        search_np = np.array(search_img, dtype=np.uint8)  # Convert image to array (upper image)
        cv_search_img = cv2.cvtColor(search_np, cv2.COLOR_RGB2GRAY)  # Grayscale (upper image)

        target_np = np.array(target_img, dtype=np.uint8)  # Convert image to array (lower image)
        cv_target_img = cv2.cvtColor(target_np, cv2.COLOR_RGB2GRAY)  # Grayscale (lower image)

        res = cv2.matchTemplate(
            cv_search_img, cv_target_img, cv2.TM_CCOEFF_NORMED)  # Matching evaluation (output is an array of similarity scores)
        res_num = cv2.minMaxLoc(res)[1]  # Get the matching score as a number
        print(res_num, "\n", i)

        if max_res_num < res_num:  # Keep the maximum matching score
            max_res_num = res_num
            over_pixel = target_img.height  # Number of overlapping pixels at the maximum matching score

    print("\n", max_res_num, "\n", over_pixel)

    if max_res_num > 0.98:  # Join only if the matching score is above the threshold
        result_img = get_concat_v(upper_img.crop(
            (0, 0, img_w, upper_h - over_pixel)), under_img)  # Image combination
        return result_img
    else:
        print("Image combination failed")
        return None
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
# Function to concatenate images (vertical direction)
def get_concat_v(im1, im2):
    dst = Image.new('RGB', (im1.width, im1.height + im2.height))
    dst.paste(im1, (0, 0))
    dst.paste(im2, (0, im1.height))
    return dst

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
if __name__ == "__main__":
    upper = Image.open(r"C:\Users\aaa\Desktop\Danbo_upper.jpg")
    under = Image.open(r"C:\Users\aaa\Desktop\Danbo_under.jpg")
    concat_img = auto_img_concat_v(upper, under)

    if concat_img:
        concat_img.show()  # Assumed body (missing in the source): display the combined image

[Figure: result] The images were combined cleanly.

Future plans

There are still problems, such as the large amount of calculation and the lack of support for images whose horizontal positions do not match, but for now I was able to do what I set out to do. In the future, I would like to combine this with pyautog to create a tool that automatically stitches scrolled screens together.
