Find image similarity with Python + OpenCV

Hello everyone, I'm @best_not_best. In my current job I implement a job recommendation feature for a job site and analyze the site using Google Analytics and Adobe Analytics. (The rest of my time goes to posting animal images to the in-house Slack.)

In the last Advent Calendar, I judged image similarity using deep learning. The accuracy was good, but collecting training data remained a bottleneck. This time, I would like to simply compare two images directly to find their similarity.

Things to do

Compare the similarity of the following dog images. The images were picked up from Google search. To improve accuracy, I chose images in which the dog faces the front.

Target image

File name  Image  Description
05.png     05     Shiba Inu (cute)

Images to compare

File name  Image  Description
01.png     01     Dachshund (cute)
02.png     02     Corgi (cute)
03.png     03     Golden retriever (cute)
04.png     04     Shiba Inu (cute)
06.png     06     Labrador retriever (cute)

If the similarity between 05.png and 04.png, which are the same breed, is high, it is a success.

Environment

Directory structure

hist_matching.py
feature_detection.py
images
 ├─ 01.png
 ├─ 02.png
 ├─ 03.png
 ├─ 04.png
 ├─ 05.png
 └─ 06.png

Verification 1: Histogram comparison

Roughly speaking, it is a method of comparing by color. See below for details.

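Grayscale conversion is not performed because the comparison is based on color. Note that the script below builds its histogram from channel 0 only, which is the blue channel because OpenCV loads images in BGR order. In addition, every image is resized to 200px x 200px before comparison.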

hist_matching.py


#!/usr/bin/env python
# -*- coding: UTF-8 -*-

"""hist matching."""

import cv2
import os

TARGET_FILE = '05.png'
IMG_DIR = os.path.abspath(os.path.dirname(__file__)) + '/images/'
IMG_SIZE = (200, 200)

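# Load the target image, resize it, and build a 256-bin histogram of channel 0
# (the blue channel, because OpenCV loads images in BGR order).
target_img_path = os.path.join(IMG_DIR, TARGET_FILE)
target_img = cv2.imread(target_img_path)
target_img = cv2.resize(target_img, IMG_SIZE)
target_hist = cv2.calcHist([target_img], [0], None, [256], [0, 256])

print('TARGET_FILE: %s' % (TARGET_FILE))

# Sort the listing so the output order is stable across platforms.
files = sorted(os.listdir(IMG_DIR))
for file in files:
    # Skip macOS metadata and the target image itself.
    if file == '.DS_Store' or file == TARGET_FILE:
        continue

    comparing_img_path = os.path.join(IMG_DIR, file)
    comparing_img = cv2.imread(comparing_img_path)
    comparing_img = cv2.resize(comparing_img, IMG_SIZE)
    comparing_hist = cv2.calcHist([comparing_img], [0], None, [256], [0, 256])

    # cv2.HISTCMP_CORREL (value 0) is the correlation method; 1.0 means identical.
    ret = cv2.compareHist(target_hist, comparing_hist, cv2.HISTCMP_CORREL)
    print(file, ret)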

Perhaps because of memory consumption, errors such as Segmentation fault: 11 or python(18114,0x7fff7a45b000) malloc: *** error for object 0x102000e00: incorrect checksum for freed object - object was probably modified after being freed. are occasionally displayed. I couldn't find a solution (sorry), but when dealing with a large number of images, it seems better to reduce the number of comparison images per run and run the processing several times.
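One way to read this advice is to split the list of comparison images into small batches and handle one batch per run. The helper below is only a sketch of that splitting step. The name batches, the file list, and the batch size of 2 are hypothetical, and whether batching actually avoids the crash has not been verified.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical file list and batch size, for illustration only.
files = ['01.png', '02.png', '03.png', '04.png', '06.png']
for batch in batches(files, 2):
    print(batch)
```

Each printed batch could then be passed to a separate run of the comparison script.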

Execution result


TARGET_FILE: 05.png
01.png 0.3064316801821619
02.png -0.09702013809004943
03.png 0.5273343981076624
04.png 0.5453261576844468
06.png 0.1256772923432995

For the exact same image, the similarity is 1. You can see that 04.png has the highest similarity to 05.png (0.545), although 03.png is close behind (0.527). It was surprising that the similarity with 01.png was high.
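Method 0 of cv2.compareHist is the correlation method (cv2.HISTCMP_CORREL), which is the Pearson correlation between the two histograms. That is why the score is 1 for identical images and can go negative, as with 02.png. The plain-Python sketch below reproduces the formula on tiny made-up histograms. The function name hist_correlation and the 4-bin data are assumptions for illustration, not part of OpenCV.

```python
import math


def hist_correlation(h1, h2):
    """Pearson correlation between two equal-length histograms."""
    if len(h1) != len(h2) or not h1:
        raise ValueError("histograms must be non-empty and the same length")
    mean1 = sum(h1) / len(h1)
    mean2 = sum(h2) / len(h2)
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    var1 = sum((a - mean1) ** 2 for a in h1)
    var2 = sum((b - mean2) ** 2 for b in h2)
    denom = math.sqrt(var1 * var2)
    # Assumption: treat two completely flat histograms as perfectly correlated.
    return cov / denom if denom > 0 else 1.0


# Hypothetical 4-bin histograms, only to show the range of the score.
a = [10, 20, 30, 40]
print(hist_correlation(a, a))        # 1.0  (identical histograms)
print(hist_correlation(a, a[::-1]))  # -1.0 (exactly opposite trend)
```

Because the score only measures how the bin counts rise and fall together, two quite different pictures with similar color distributions can still score high.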

Verification 2: Feature point matching

Extract feature points from the two images and compare the distances between them. I referred to the following.

Grayscale conversion is performed to improve the extraction accuracy. As in the previous section, the image size is uniformly converted to 200px x 200px for comparison.

feature_detection.py


#!/usr/bin/env python
# -*- coding: UTF-8 -*-

"""feature detection."""

import cv2
import os

TARGET_FILE = '05.png'
IMG_DIR = os.path.abspath(os.path.dirname(__file__)) + '/images/'
IMG_SIZE = (200, 200)

target_img_path = os.path.join(IMG_DIR, TARGET_FILE)
target_img = cv2.imread(target_img_path, cv2.IMREAD_GRAYSCALE)
target_img = cv2.resize(target_img, IMG_SIZE)

# Brute-force matcher with Hamming distance, which suits the binary
# descriptors produced by both AKAZE and ORB.
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
# detector = cv2.ORB_create()
detector = cv2.AKAZE_create()
(target_kp, target_des) = detector.detectAndCompute(target_img, None)

print('TARGET_FILE: %s' % (TARGET_FILE))

# Sort the listing so the output order is stable across platforms.
files = sorted(os.listdir(IMG_DIR))
for file in files:
    # Skip macOS metadata and the target image itself.
    if file == '.DS_Store' or file == TARGET_FILE:
        continue

    comparing_img_path = os.path.join(IMG_DIR, file)
    try:
        comparing_img = cv2.imread(comparing_img_path, cv2.IMREAD_GRAYSCALE)
        comparing_img = cv2.resize(comparing_img, IMG_SIZE)
        (comparing_kp, comparing_des) = detector.detectAndCompute(comparing_img, None)
        matches = bf.match(target_des, comparing_des)
        # Average descriptor distance over all matches: smaller means more similar.
        dist = [m.distance for m in matches]
        ret = sum(dist) / len(dist)
    except (cv2.error, ZeroDivisionError):
        # Unreadable image or no matches: report a very large distance.
        ret = 100000

    print(file, ret)

In rare cases cv2.error is raised, so the processing is wrapped in try/except. I tried both AKAZE and ORB as the extraction method.

Execution result (AKAZE)


TARGET_FILE: 05.png
01.png 143.925
02.png 134.05
03.png 140.775
04.png 127.8
06.png 148.725

Execution result (ORB)


TARGET_FILE: 05.png
01.png 67.59139784946237
02.png 58.60931899641577
03.png 59.354838709677416
04.png 53.59498207885304
06.png 63.55913978494624

Since a distance is calculated, the value for the exact same image is 0, and the smaller the value, the higher the similarity. With both AKAZE and ORB, 04.png has the smallest distance to 05.png, so it is again judged the most similar, as in the previous section.
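The score above is the mean Hamming distance of the best match for each descriptor of the target image. To make that concrete without OpenCV, here is a minimal plain-Python sketch of the same idea. The function names and the 2-byte descriptors are hypothetical (real ORB descriptors are 32 bytes), and the sketch mirrors a brute-force matcher without cross-checking rather than reproducing cv2.BFMatcher exactly.

```python
def hamming(d1, d2):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(d1, d2))


def mean_best_distance(query_des, train_des):
    """Average, over query descriptors, of the distance to the nearest train descriptor."""
    if not query_des or not train_des:
        raise ValueError("both descriptor lists must be non-empty")
    best = [min(hamming(q, t) for t in train_des) for q in query_des]
    return sum(best) / len(best)


# Hypothetical 2-byte binary descriptors, for illustration only.
target = [bytes([0b11110000, 0b00001111]), bytes([0b10101010, 0b01010101])]
same = list(target)
other = [bytes([0b00001111, 0b11110000]), bytes([0b10101011, 0b01010101])]

print(mean_best_distance(target, same))   # 0.0 (identical descriptors)
print(mean_best_distance(target, other))  # 5.0
```

Because every target descriptor always gets some nearest neighbor, the mean never reaches 0 for different images. That explains why all the scores above sit in a fairly narrow band and only their relative order is informative.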

Summary
