Similar image detection is one of the most commonly used features in image recognition. Recommendation and search systems often handle tens or hundreds of thousands of images. Depending on the image size and the comparison method, finding a similar image among that many candidates can take a huge amount of processing time. This article therefore looks at a way to detect similar images while reducing both the amount of data and the number of comparisons, using PCA and k-means.
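For the face features, each face is represented by a 128-dimensional vector (stored in the variable face_landmarks in the sample source). Strictly speaking, this vector is the face encoding returned by face_recognition.face_encodings rather than a set of landmark coordinates. It can be obtained with the library at the following URL. https://github.com/ageitgey/face_recognition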
The number of dimensions after PCA was set to 20 after looking at the contribution rate (explained variance ratio). Once PCA has reduced the dimensions, k-means classifies the vectors into K = 10 clusters (the sample source below uses the first 100 images and K = 8). To search, first find the cluster whose center of gravity (centroid) is closest to the query image, then compute distances only to the images classified into that cluster. Storing the PCA-reduced vectors as the image features also reduces the storage capacity required.
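The two-step search described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function names `build_index` and `find_similar` are hypothetical, and random vectors stand in for real 128-dimensional face encodings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def build_index(features, n_components=20, n_clusters=10, seed=0):
    """Fit PCA and k-means; return both models, the reduced vectors and the labels."""
    pca = PCA(n_components=n_components).fit(features)
    # Use transform() here so stored vectors and later queries go through the same code path.
    reduced = pca.transform(features)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(reduced)
    return pca, kmeans, reduced, labels


def find_similar(query, pca, kmeans, reduced, labels, top_n=5):
    """Return indices of the nearest stored vectors inside the query's nearest cluster."""
    q = pca.transform(np.asarray(query).reshape(1, -1))[0]
    # Step 1: compare the query with the K centroids only.
    nearest_cluster = np.argmin(np.linalg.norm(kmeans.cluster_centers_ - q, axis=1))
    # Step 2: compare the query with the members of that cluster only.
    members = np.flatnonzero(labels == nearest_cluster)
    dists = np.linalg.norm(reduced[members] - q, axis=1)
    return members[np.argsort(dists)[:top_n]]


# Synthetic stand-in for face encodings (assumption: 1000 images, 128 dimensions).
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 128))
pca, kmeans, reduced, labels = build_index(features)
similar = find_similar(features[0], pca, kmeans, reduced, labels)
print(similar)
```

Because the query here is one of the indexed vectors, it should come back as its own nearest neighbor; a real application would probably drop that first hit.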
With 1000 images and K = 10, a search needs on average about 100 comparisons inside the selected cluster plus 10 comparisons with the cluster centers of gravity, that is, about 110 comparisons instead of 1000, assuming the clusters are of roughly equal size. In addition, PCA reduces each vector from 128 dimensions to 20, which further cuts the cost of every comparison.
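The arithmetic behind this estimate can be written out as a small sketch. The helper names are hypothetical, the clusters are assumed to be evenly sized, and the one-off cost of projecting the query with PCA is ignored.

```python
def expected_comparisons(n_images, n_clusters):
    """Average number of distance computations for the two-step search,
    assuming the images are spread evenly over the clusters."""
    return n_images / n_clusters + n_clusters


def multiply_adds(n_comparisons, n_dims):
    """Rough cost of the distance computations: one multiply-add per dimension."""
    return n_comparisons * n_dims


# Brute force: 1000 comparisons in 128 dimensions.
brute_force = multiply_adds(1000, 128)
# Two-step search: about 110 comparisons in 20 dimensions.
clustered = multiply_adds(expected_comparisons(1000, 10), 20)

print(expected_comparisons(1000, 10))  # 110.0
print(brute_force / clustered)
```

Under these assumptions the distance computations shrink by a factor of roughly 58. Real clusters are rarely balanced, so the actual saving depends on which cluster the query falls into.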
The sample source below uses the free face images from the following dataset. http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
# coding:utf-8
import dlib
from imutils import face_utils
import cv2
import glob
import face_recognition
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from matplotlib import pyplot as plt
import numpy as np
# --------------------------------
# 1.Preparation for face landmark detection
# --------------------------------
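# Load dlib's face detector
# (Note: face_detector and face_predictor are not used below;
#  face_recognition.face_encodings() performs its own face detection.)
face_detector = dlib.get_frontal_face_detector()
# Load dlib's face landmark predictor
predictor_path = 'shape_predictor_68_face_landmarks.dat'
face_predictor = dlib.shape_predictor(predictor_path)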
images = glob.glob('./faces/*.jpg')
images = sorted(images)[:100]
face_landmarks = []
face_filepaths = []
for filepath in images:
    # Load the image and compute its face encodings
    img = face_recognition.load_image_file(filepath)
    face_encodings = face_recognition.face_encodings(img)
    # Keep only images in which a face was found; use the first face
    if len(face_encodings) > 0:
        face_filepaths.append(filepath)
        face_landmarks.append(face_encodings[0])
# Reduce the 128-dimensional encodings to 20 principal components
pca = PCA(n_components=20)
# Fit the PCA and convert the dataset to principal components in one step
transformed = pca.fit_transform(face_landmarks)
#Plot the principal components
# plt.subplot(1, 2, 2)
plt.scatter(transformed[:, 0], transformed[:, 1])
plt.title('principal component')
plt.xlabel('pc1')
plt.ylabel('pc2')
# Output the contribution rate (explained variance ratio) of each principal component, then the total
print(pca.explained_variance_ratio_)
print(sum(pca.explained_variance_ratio_))
# print(transformed[0])
# print(len(transformed[0]))
# Start k-means
# Number of clusters
K = 8
cls = KMeans(n_clusters=K)
pred = cls.fit_predict(transformed)
# Plot the points of each cluster in a separate color
for i in range(K):
    labels = transformed[pred == i]
    plt.scatter(labels[:, 0], labels[:, 1])
# Draw the cluster centroids (centers of gravity) as hollow circles
centers = cls.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], s=100,
            facecolors='none', edgecolors='black')
# Find which center of gravity is closest to the first image (transformed[0])
min_center_distance = -1
min_center_k = 0
# Find which center of gravity is farthest from the first image
max_center_distance = -1
max_center_k = 0
for center_index in range(K):
    distance = np.linalg.norm(transformed[0] - centers[center_index])
    if distance < min_center_distance or min_center_distance == -1:
        min_center_distance = distance
        min_center_k = center_index
    if distance > max_center_distance or max_center_distance == -1:
        max_center_distance = distance
        max_center_k = center_index
# Show the image names in the closest and farthest clusters
print('=========== NEAREST ==============')
for i in range(len(pred)):
    if min_center_k == pred[i]:
        print(face_filepaths[i])
print('=========== FARTHEST ==============')
for i in range(len(pred)):
    if max_center_k == pred[i]:
        print(face_filepaths[i])
print('=========================')
#Display the graph
plt.show()
# Note: everything below is an optional extra
# Calculate the direct distance from the first image to every image
distance = {}
for index in range(len(transformed)):
    distance[face_filepaths[index]] = np.linalg.norm(transformed[0] - transformed[index])
# Sort by distance and display
print(sorted(distance.items(), key=lambda x: x[1]))
In the plot, the image features are color-coded by cluster and the center of gravity of each cluster is drawn as a hollow circle. It is a little difficult to read because the 20-dimensional data is plotted in only two dimensions (the first two principal components), but you can see that points whose principal components are close to each other are grouped into the same cluster.
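Since the plot shows only the first two of the 20 components, it can help to check how much of the variance those two carry. The following is a rough sketch on synthetic vectors, not the article's data; `components_needed` is a hypothetical helper and the 0.9 target is an arbitrary assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the 128-dimensional face encodings (not real data).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 128))

# Cumulative contribution rate of the 20 components kept in the article.
pca = PCA(n_components=20).fit(features)
cumulative = np.cumsum(pca.explained_variance_ratio_)
print("variance kept by pc1 + pc2:", cumulative[1])
print("variance kept by all 20 components:", cumulative[-1])


def components_needed(features, target=0.9):
    """Smallest number of components whose cumulative contribution rate reaches target."""
    ratios = PCA().fit(features).explained_variance_ratio_
    # searchsorted returns the first index where the cumulative sum reaches the target.
    return int(np.searchsorted(np.cumsum(ratios), target) + 1)


n_needed = components_needed(features, target=0.9)
print("components needed for 90% of the variance:", n_needed)
```

On real face encodings the variance is usually far more concentrated than in this random example, so the numbers printed here say nothing about the article's dataset.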
Query image (the image the analysis is based on)
1.jpg |
---|
Images contained in the same cluster
10.jpg | 11.jpg | 19.jpg | 24.jpg |
---|---|---|---|
Images contained in the cluster with the farthest center of gravity
12.jpg | 37.jpg | 51.jpg | 60.jpg |
---|---|---|---|
Many of the images in the same cluster were of long-haired women, and many of the images in the farthest cluster were of short-haired men, so I think the clustering came out close to human intuition. If you need more exact results, you should compute the norm directly against all images without principal component analysis, but when the aim is some serendipity and faster computation, the method used here seems worth considering.
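One way to get a feel for this trade-off is to compare the exact top results (norm against every image in the original 128 dimensions) with the results of the reduced, cluster-restricted search. The sketch below uses synthetic blobs from `make_blobs` as a stand-in for face encodings; the sizes and the top-10 cutoff are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic stand-in for face encodings: 500 vectors, 128 dimensions (not real data).
features, _ = make_blobs(n_samples=500, n_features=128, centers=8, random_state=0)
query_index, top_n = 0, 10

# Exact search: distance from the query to every image in the original space.
exact_dists = np.linalg.norm(features - features[query_index], axis=1)
exact_top = set(np.argsort(exact_dists)[:top_n])

# Approximate search: PCA to 20 dimensions, then search only the query's cluster.
pca = PCA(n_components=20).fit(features)
reduced = pca.transform(features)
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(reduced)
members = np.flatnonzero(labels == labels[query_index])
approx_dists = np.linalg.norm(reduced[members] - reduced[query_index], axis=1)
approx_top = set(members[np.argsort(approx_dists)[:top_n]])

# Fraction of the exact top-n that the approximate search also returned.
overlap = len(exact_top & approx_top) / top_n
print(f"searched {len(members)} of {len(features)} images, "
      f"overlap with exact top-{top_n}: {overlap:.0%}")
```

A low overlap would suggest raising the number of PCA dimensions or also searching neighboring clusters; a high one suggests the speed-up costs little accuracy for that data.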