[PYTHON] Use dHash to locate a position on the course from a scene in a racing game

1. Introduction

dHash is one of the algorithms for searching for similar images. For the details of the algorithm, the article "Calculate the similarity of images using Perceptual Hash" is easy to follow. My understanding is that it is a similar image search algorithm with the following characteristics.

So I tried using dHash to see whether I could identify the position on the course from a single scene (onboard footage) of a play of the racing game Assetto Corsa.

Specifically, the flow is as follows.

[Figure: overview of the flow (概要.png)]

① First, all frames are extracted from a play video (onboard video) of one lap around the course and saved as PNG images.

② The dHash hash value is calculated for every saved frame image. Because the position of the vehicle at a given time can be identified from the telemetry data recorded while the play video was captured, each hash value is saved together with its position information in a search CSV file.

③ Separately, one scene whose position on the course is to be identified is selected from another play video.

④ The dHash hash value is calculated for the image of the selected scene.

⑤ The search CSV file is searched for the image whose hash value is closest to the hash value calculated in ④. Since position information is linked to each image in ②, the position of the hit image is regarded as the position on the course of the selected scene.

Even when two scenes in a racing game are taken at nearly the same place on the course, the images are slightly misaligned because the racing line differs from play to play. I think the key point this time is whether the similar image search can absorb such differences.

2. Implementation code

This time, I implemented the above process in Python.

2-1. Extracting frame images from play videos

I used OpenCV to extract the images of all frames from the play video file (an mp4 file). I referred to the following page.

Each frame image is saved with the file name "(frame number).png", where the frame number is zero-padded to six digits.

01_extract_frames.py


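import os
import sys

import cv2


def extract_frame(video_file, save_dir):
    """Save every frame of video_file as (frame number).png in save_dir."""
    capture = cv2.VideoCapture(video_file)

    frame_no = 0

    while True:
        # retval becomes False once no more frames can be read
        retval, frame = capture.read()

        if not retval:
            break

        # Zero-padded 6-digit file name, e.g. 000123.png
        # (os.path.join avoids hard-coding the Windows path separator)
        cv2.imwrite(os.path.join(save_dir, '{:06d}.png'.format(frame_no)), frame)
        frame_no += 1

    # Release the video file handle
    capture.release()


if __name__ == '__main__':
    video_file = sys.argv[1]
    save_dir = sys.argv[2]

    extract_frame(video_file, save_dir)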

This script is also used in ③.

2-2. Calculating dHash and linking it with location information

Of the extracted frame images, those corresponding to a certain period (one specific lap) are hashed with the dhash function of the ImageHash package. Each hash value is then linked with the telemetry data acquired by the following in-game App script (only the relevant part of which is extracted in advance) and output to the search CSV file.
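As a rough illustration of what a difference hash computes, the sketch below works on a small grayscale grid that is assumed to have already been shrunk to (hash size + 1) columns by (hash size) rows. It is not the ImageHash implementation: the function name `dhash_from_gray` and the bit ordering are assumptions made for illustration, so its output is not guaranteed to match `imagehash.dhash`.

```python
def dhash_from_gray(gray_rows):
    """Return a difference hash as a hex string.

    gray_rows: list of rows of grayscale values, each row one element
    longer than the number of rows (e.g. 8 rows of 9 values).
    Illustrative only: the bit order is an assumption and may differ
    from what the ImageHash package produces.
    """
    bits = []
    for row in gray_rows:
        # One bit per horizontally adjacent pair:
        # 1 if the brightness increases from left to right, else 0.
        for left, right in zip(row, row[1:]):
            bits.append(1 if right > left else 0)

    # Pack the bits into a single integer, first bit most significant.
    value = 0
    for bit in bits:
        value = (value << 1) | bit

    # 4 bits per hex digit, zero-padded so leading zeros are kept.
    hex_digits = (len(bits) + 3) // 4
    return format(value, "0{}x".format(hex_digits))


# A 2 x 3 toy grid gives 2 x 2 = 4 bits, i.e. one hex digit.
toy = [
    [10, 20, 15],  # 20 > 10 -> 1, 15 > 20 -> 0
    [30, 30, 40],  # 30 > 30 -> 0, 40 > 30 -> 1
]
print(dhash_from_gray(toy))  # bits 1001 -> "9"
```

With 8 rows of 9 values this yields 64 bits, which is why the hash values in this article are 16 hexadecimal digits long. Because each bit only records whether brightness rises between neighbouring cells, small changes in the image tend to flip only a few bits.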

02_calc_dhash.py


import csv
import os
import sys

import imagehash
from PIL import Image

frame_width = 1280
frame_height = 720
trim_lr = 140  # pixels trimmed from the left and right edges
trim_tb = 100  # pixels trimmed from the top and bottom edges

dhash_size = 8  # 8 x 8 = 64-bit hash


def calc_dhash(frame_dir, frame_no_from, frame_no_to, telemetry_file, output_file):

    # Read the telemetry data file (tab-separated)
    with open(telemetry_file, newline='') as f:
        position_data = [row for row in csv.reader(f, delimiter='\t')]

    # Guard against division by zero when only one frame is specified
    frame_span = max(frame_no_to - frame_no_from, 1)

    with open(output_file, mode='w', newline='') as f:
        writer = csv.writer(f)

        for i in range(frame_no_from, frame_no_to + 1):
            # Read the extracted frame image and crop it
            # (to remove the time etc. displayed at the edges of the image)
            frame = Image.open(os.path.join(frame_dir, '{:06d}.png'.format(i)))
            trimmed_frame = frame.crop((
                trim_lr,
                trim_tb,
                frame_width - trim_lr,
                frame_height - trim_tb))

            # Calculate the dHash value
            dhash_value = str(imagehash.dhash(trimmed_frame, hash_size=dhash_size))

            # Link with the telemetry data.
            # Both the frames and the telemetry are output at regular intervals,
            # so they are simply linked in proportion to the number of rows.
            position_no = round((len(position_data) - 1) * (i - frame_no_from) / frame_span)

            # Telemetry columns 9 and 10 are written out as the position
            writer.writerow([
                i,
                dhash_value,
                position_data[position_no][9],
                position_data[position_no][10]
            ])


if __name__ == '__main__':
    frame_dir = sys.argv[1]
    frame_no_from = int(sys.argv[2])
    frame_no_to = int(sys.argv[3])
    telemetry_file = sys.argv[4]
    output_file = sys.argv[5]

    calc_dhash(frame_dir, frame_no_from, frame_no_to, telemetry_file, output_file)

As a result of this script, the following information is output to the CSV file: (frame number), (hash value), and (2D coordinate position in meters).

731,070b126ee741c080,-520.11,139.89
732,070b126ee7c1c080,-520.47,139.90
733,070b126ee7c1c480,-520.84,139.92

This script is also used in ④.
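The proportional linking between frames and telemetry rows used in 02_calc_dhash.py can be isolated as a small function. The following is a minimal sketch of that mapping; the function name `telemetry_row_index` and the numbers in the example are hypothetical, and it assumes, as the script does, that both the frames and the telemetry rows are evenly spaced over the same period.

```python
def telemetry_row_index(frame_no, frame_no_from, frame_no_to, telemetry_rows):
    """Map a frame number to a telemetry row index by linear proportion.

    The first frame maps to row 0 and the last frame to the last row.
    """
    if frame_no_to == frame_no_from:
        # Only one frame: there is nothing to interpolate over.
        return 0
    fraction = (frame_no - frame_no_from) / (frame_no_to - frame_no_from)
    return round((telemetry_rows - 1) * fraction)


# Hypothetical example: frames 100..200 linked to 51 telemetry rows.
print(telemetry_row_index(100, 100, 200, 51))  # 0
print(telemetry_row_index(150, 100, 200, 51))  # 25
print(telemetry_row_index(200, 100, 200, 51))  # 50
```

If the video and the telemetry were not recorded over exactly the same period, this simple proportion would drift, so the frame range and the extracted telemetry part need to cover the same lap.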

2-3. Search for the closest hash value

From the search CSV file output in 2-2, the image whose hash value is closest to the specified hash value is searched for.

The Hamming distance is used as the measure of closeness between hash values. I use popcount from the gmpy2 package to calculate the Hamming distance (because it seems to be very fast).
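For readers without gmpy2, the same Hamming distance can be sketched with the standard library only. This is an alternative for illustration, not the method used in this article, and the function name `hamming_distance` is hypothetical. The two hash values are those of the consecutive frames 731 and 732 in the sample CSV output of 2-2.

```python
def hamming_distance(hash_a, hash_b):
    """Count the differing bits between two hex-string hashes."""
    # XOR leaves a 1 bit wherever the two hashes differ;
    # counting the '1' characters of the binary form gives the distance.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


print(hamming_distance("070b126ee741c080", "070b126ee7c1c080"))  # 1
```

Consecutive frames differ by only one bit here, which matches the expectation that nearly identical images produce nearly identical hashes.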

03_match_frame.py


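import csv
import sys

import gmpy2


def match_frame(base_file, search_hash):

    # Read the search CSV file: (frame number, hash value, x, y) per row
    with open(base_file, newline='') as f:
        base_data = [row for row in csv.reader(f)]

    # Convert the search hash once instead of on every iteration
    search_value = int(search_hash, 16)

    # A 64-bit hash can differ in at most 64 bits
    min_distance = 64
    results = []

    for base_line in base_data:
        # Hamming distance = number of set bits in the XOR of the two hashes
        distance = gmpy2.popcount(int(base_line[1], 16) ^ search_value)

        if distance < min_distance:
            min_distance = distance
            results = [base_line]
        elif distance == min_distance:
            # Keep every row that ties for the minimum distance
            results.append(base_line)

    print("Distance = {}".format(min_distance))
    for min_line in results:
        print(min_line)


if __name__ == '__main__':
    base_file = sys.argv[1]
    search_hash = sys.argv[2]

    match_frame(base_file, search_hash)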

As shown below, the location information associated with the image with the closest hash value is output. (If multiple images are at the same distance, all of them are displayed.)

> python.exe 03_match_frame.py dhash_TOYOTA_86.csv cdc9cebc688f3f47
Distance = 8
['13330', 'c9cb4cb8688f3f7f', '-1415.73', '-58.39']
['13331', 'c9eb4cbc688f3f7f', '-1415.39', '-58.44']

3. Search results

This time, I extracted all frames from a play video of the Nürburgring Nordschleife (total length 20.81 km) driven in the TOYOTA 86 GT and calculated their hash values. I then searched them for the images closest to several scenes from another play video driven in the BMW Z4.

First of all, let's check whether the images of three famous corners can be found properly.

                      Image used for search    Hit image
Image                 BMW_005666.png           86_008031.png
Hash value            ced06061edcf9f2d         0c90e064ed8f1f3d
Location information  (-2388.29, 69.74)        (-2416.50, 66.67)

Hash value distance = 10, location information deviation = 28.4m

For dHash, it is said that images can be regarded as the same if the hash value distance is 10 or less, so this result only just qualifies. In the images themselves, the shape of the corner is similar but the positions of the surrounding trees differ slightly, so it is hard to judge whether they are similar or not.

                      Image used for search    Hit image
Image                 BMW_014187.png           86_017358.png
Hash value            7c5450640c73198c         7c7c50642d361b0a
Location information  (317.58, -121.52)        (316.18, -121.45)

Hash value distance = 11, location information deviation = 1.4m

The images here are quite close. However, the hash value distance is 11, which is larger than in the previous case.

                      Image used for search    Hit image
Image                 BMW_018404.png           86_022388.png
Hash value            665d1d056078cde6         665c1d856050da8d
Location information  (2071.48, 77.01)         (2071.23, 77.12)

Hash value distance = 13, location information deviation = 0.27m

The vehicle positions are very close and the images look quite similar, but on closer inspection the orientation is slightly different. The hash value distance is 13, which is fairly large.

As shown above, for the three famous corners it appears that the closest position can be identified.

In addition, I tried it on 10 randomly selected scenes, with the following results.

Only one of the cases in which only a wrong position was hit is shown below.
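The location deviations quoted for each pair can be reproduced from the two coordinate pairs. The sketch below assumes that the deviation is the straight-line (Euclidean) distance between the two 2D positions, which is consistent with the figures quoted above; the function name `position_deviation` is hypothetical.

```python
import math


def position_deviation(pos_a, pos_b):
    """Straight-line distance in meters between two (x, y) positions."""
    return math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])


# First corner: search image position vs. hit image position
print(round(position_deviation((-2388.29, 69.74), (-2416.50, 66.67)), 1))  # 28.4
```

Note that this measures the distance in the coordinate plane, not the distance along the course.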

                      Image used for search    Hit image
Image                 BMW_016175.png           86_020877.png
Hash value            b7b630b24c1e1f1e         b7b43839481e3f1f
Location information  (1439.61, -18.69)        (2059.41, 37.44)

Hash value distance = 9, location information deviation = 622.34m

The way the trees line the left and right sides and what lies ahead on the course (a left curve versus a straight) look quite different, yet the hash value distance is a relatively close 9.

For reference, the image that I wanted the search to hit is shown below.

                      Image used for search    Image that should have been hit
Image                 BMW_016175.png           86_019818_Correctanswer.png

Hash value distance = 13

At first glance the images look similar, but on closer inspection they are misaligned, and I think that is why the distance is relatively large.

4. Conclusion

In this article, I used dHash to search for similar images of racing game scenes.

The accuracy is relatively good for distinctive scenes such as famous corners, but for randomly selected scenes the hit rate was 70% (please forgive the small sample of only 10 scenes).

This is not a rigorous interpretation, but my personal impressions are as follows.

To improve the accuracy, I think the following measures are conceivable.
