Detecting differences between similar images with OpenCV-Python

What I want to do

For example, I want to check how Office documents look after being converted to PDF by a different typesetting engine. I want an output image that can serve as evidence and that people can understand at a glance. Since it is impractical to inspect every image by eye, I also assign a numerical score to the degree of difference, so that people only need to check the images with large differences.

What to prepare

- An environment where Python 3 runs

Although it is an unofficial build, an OpenCV package for Python is available on PyPI, so I install it with pip: pip install opencv-python. Its numpy dependency is installed along with it.

https://pypi.org/project/opencv-python/

Python code

diff_img.py


import pathlib
import cv2
import numpy as np

source_dir = pathlib.Path('source_img')
source_files = source_dir.glob('*.*')
target_dir = pathlib.Path('target_img')
result_dir = pathlib.Path('result_img')
log_file = result_dir / pathlib.Path('result.log')
kernel = np.ones((3, 3), np.uint8)

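result_dir.mkdir(exist_ok=True)  # make sure the output directory exists
fs = open(log_file, mode='w')
for source_file in source_files:
    source_img = cv2.imread(str(source_file))
    target_file = target_dir / source_file.name
    target_img = cv2.imread(str(target_file))
    # Skip pairs where either image is missing or cannot be read.
    if source_img is None or target_img is None:
        fs.write(str(target_file) + '...skipped.\n')
        continue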
    # Pad both images onto black canvases of the same size so they can be compared.
    max_height = max(source_img.shape[0], target_img.shape[0])
    max_width = max(source_img.shape[1], target_img.shape[1])

    temp_img = source_img
    source_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
    source_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

    temp_img = target_img
    target_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
    target_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

    # Blend the two images 50/50 to use as the background of the result image.
    result_img = cv2.addWeighted(source_img, 0.5, target_img, 0.5, 0)

    # Binarize the grayscale difference (Otsu) and remove small noise by opening.
    source_img = cv2.cvtColor(source_img, cv2.COLOR_BGR2GRAY)
    target_img = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
    img = cv2.absdiff(source_img, target_img)
    rtn, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
    img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

    # Outline the differing regions in red (BGR order) on the blended image.
    contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE,
                                           cv2.CHAIN_APPROX_SIMPLE)
    result_img = cv2.drawContours(result_img, contours, -1, (0, 0, 255))
    # Score = total contour area / area of the whole image.
    score = 0
    for contour in contours:
        score += cv2.contourArea(contour)
    score /= max_height * max_width
    fs.write(target_file.name + ', ' + str(score) + '\n')
    diff_file = result_dir / source_file.name
    cv2.imwrite(str(diff_file), result_img)
fs.close()

The comparison method is based on the article "It was too difficult to find a mistake in Saizeriya, so I solved it with the help of an adult" (http://kawalabo.blogspot.com/2014/11/blog-post.html).

The score is simply the total area of the regions where a difference was detected, divided by the area of the whole image. It is written to the log together with the file name.

result.log


test-1.png, 0.01231201710816777
test-2.png, 0.0084626793598234
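
To pick out only the images worth checking by eye, the log can be sorted by score. This is a minimal sketch using only the standard library; the function name rank_by_score, the 0.01 threshold, and the third sample line are assumptions for illustration and should be adapted to your own data.

```python
import csv
import io


def rank_by_score(log_lines, threshold=0.0):
    """Return (file name, score) pairs with score >= threshold, largest first."""
    rows = []
    for record in csv.reader(log_lines, skipinitialspace=True):
        # Lines without exactly two fields (e.g. a skip notice) are ignored.
        if len(record) != 2:
            continue
        rows.append((record[0], float(record[1])))
    rows = [row for row in rows if row[1] >= threshold]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Sample log content; the last line is a hypothetical skip notice.
sample_log = io.StringIO(
    "test-1.png, 0.01231201710816777\n"
    "test-2.png, 0.0084626793598234\n"
    "target_img/test-3.png...skipped.\n"
)
for name, score in rank_by_score(sample_log, threshold=0.01):
    print(name, score)
```

With the real log, pass an open file object instead of the io.StringIO sample, for example open('result_img/result.log').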

Added on January 7, 2020

The script can now compare images even when their sizes differ.
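
The size handling amounts to padding both images onto a canvas as large as the bigger of the two. The sketch below shows the idea with numpy only; the helper name pad_to and the two generated arrays are assumptions standing in for images loaded with cv2.imread.

```python
import numpy as np


def pad_to(img, height, width):
    """Place img at the top-left corner of a black (zero) canvas."""
    canvas = np.zeros((height, width) + img.shape[2:], dtype=img.dtype)
    canvas[:img.shape[0], :img.shape[1]] = img
    return canvas


# Hypothetical stand-ins for two loaded BGR images of different sizes.
source_img = np.full((40, 60, 3), 255, np.uint8)
target_img = np.full((50, 55, 3), 255, np.uint8)

height = max(source_img.shape[0], target_img.shape[0])
width = max(source_img.shape[1], target_img.shape[1])
source_img = pad_to(source_img, height, width)
target_img = pad_to(target_img, height, width)
print(source_img.shape, target_img.shape)  # (50, 60, 3) (50, 60, 3)
```

Note that the padding is black, so where one image is padded and the other has light content, the padded strip is itself detected as a difference and adds to the score.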

Execution example

test-1.png: The tables are similar, but the comparison target is slightly narrower and has larger margins.
