A hacky way to use "animeface-2009" from Python & implementing a function to output PASCAL VOC format XML files

animeface2009_and_python.png

When you want to detect a character's face in an anime image from Python, the tool I most often see used is lbpcascade_animeface, implemented by nagadomi. lbpcascade_animeface is a module that estimates the bounding box of a face.

In addition, nagadomi's module animeface-2009 can detect not only the face but also the bounding boxes of facial parts (eyes, nose, mouth, chin, etc.) [^1]. However, it only provides a Ruby-level API. The README.md of lbpcascade_animeface states that "the detection accuracy is higher in `animeface-2009`", which makes me all the more eager to use it from Python.

[^1]: That said, the bounding boxes for the nose and chin are 1x1 pixels, so they might be better described as landmarks.

Function to call "animeface-2009" from Python

The Git repository for what I implemented this time is here.

Rewriting the C code that was implemented for Ruby so that it works with Python is no easy task for a weak engineer like me. So, hacky as it is, I implemented a [Python function](https://github.com/meow-noisy/animeface_result2xml/blob/master/animeface_poor_caller.py) that uses the subprocess module to launch Ruby via the shell and run the `animeface-2009` Ruby script. Of course, **this code alone cannot detect faces**; some preparation is required. You need to lay out the directory structure like the Git repository above and also build animeface-2009 [^2].

[^2]: The original `animeface-2009` was difficult to build in some environments, so I am using a [fork repository](https://github.com/meow-noisy/animeface-2009/tree/a39361157ba8bf16ccd838942fa2f333658e9627) with my changes. The steps needed at build time have been added to README.md in that repository, so please check it as well.

# A poor Python caller of animeface-2009:
# it invokes the Ruby module via the shell using subprocess.

import json
import subprocess
from pathlib import Path

this_file_dir = Path(__file__).resolve().parent


def detect_animeface(im_path):

    im_path = Path(im_path).resolve()
    assert im_path.exists()

    ruby_script_path = this_file_dir / 'animeface-2009/animeface-ruby/sample.rb'
    ret = subprocess.check_output(["ruby", str(ruby_script_path), str(im_path)]).decode('utf-8')

    # Massage Ruby's hash-inspect output into JSON:
    # "=>" becomes ":" and "#<Object ...>" becomes a quoted string.
    ret = ret.replace("=>", ":")
    ret = ret.replace(">", "\"")
    ret = ret.replace("#", "\"")

    return json.loads(ret)
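To see what those three `replace` calls are doing, here is a toy example on a made-up string in the same shape as the Ruby script's output (the hex object ID is invented for illustration): Ruby hash rockets become JSON colons, and `#<Magick::Pixel:...>` object notation becomes a plain quoted string.

```python
import json

# Toy example of the kind of string the Ruby script prints
# (hash rockets plus "#<Magick::Pixel:...>" object notation).
ruby_out = '[{"likelihood"=>1.0, "hair_color"=>#<Magick::Pixel:0x00007ff9>}]'

# The same three substitutions used in detect_animeface above
s = ruby_out.replace("=>", ":").replace(">", '"').replace("#", '"')
parsed = json.loads(s)
# parsed == [{'likelihood': 1.0, 'hair_color': '<Magick::Pixel:0x00007ff9'}]
```

This only works because the Ruby output happens not to contain `#` or `>` anywhere else, which is part of why the approach is fragile.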

Data is not passed directly from Ruby to Python. The Ruby script prints the detected bounding box information to standard output, and the Python side then reshapes that string into JSON and converts it into dictionaries [^4]. As a function interface, the input is an image path and the return value is the detection result: a list of dictionaries like the one below, with one dictionary per detected face.

[^4]: Some unnecessary elements (strings representing Ruby objects) are included, but post-processing them was a hassle, so I output them as-is.

[{'chin': {'x': 228, 'y': 266},
  'eyes': {'left': {'colors': ['<Magick::Pixel:0x00007ff93e969148',
                               '<Magick::Pixel:0x00007ff93e968e28',
                               '<Magick::Pixel:0x00007ff93e968d88',
                               '<Magick::Pixel:0x00007ff93e968bf8'],
                    'height': 31,
                    'width': 39,
                    'x': 222,
                    'y': 181},
           'right': {'colors': ['<Magick::Pixel:0x00007ff93e968040',
                                '<Magick::Pixel:0x00007ff93e968018',
                                '<Magick::Pixel:0x00007ff93e968180',
                                '<Magick::Pixel:0x00007ff93e9681a8'],
                     'height': 28,
                     'width': 31,
                     'x': 165,
                     'y': 202}},
  'face': {'height': 127, 'width': 127, 'x': 158, 'y': 158},
  'hair_color': '<Magick::Pixel:0x00007ff93e969cb0',
  'likelihood': 1.0,
  'mouth': {'height': 12, 'width': 25, 'x': 210, 'y': 243},
  'nose': {'x': 207, 'y': 233},
  'skin_color': '<Magick::Pixel:0x00007ff93e96a020'},
 {'chin': {'x': 379, 'y': 243},
  'eyes': {'left': {'colors': ['<Magick::Pixel:0x00007ff93e96b6a0',
                               '<Magick::Pixel:0x00007ff93e96b8d0',
                               '<Magick::Pixel:0x00007ff93e96b9e8',
                               '<Magick::Pixel:0x00007ff93e96bab0'],
                    'height': 29,
                    'width': 32,
                    'x': 418,
                    'y': 177},
           'right': {'colors': ['<Magick::Pixel:0x00007ff93e963568',
                                '<Magick::Pixel:0x00007ff93e963478',
                                '<Magick::Pixel:0x00007ff93e963298',
                                '<Magick::Pixel:0x00007ff93e9631a8'],
                     'height': 31,
                     'width': 39,
                     'x': 354,
                     'y': 157}},
  'face': {'height': 139, 'width': 139, 'x': 329, 'y': 121},
  'hair_color': '<Magick::Pixel:0x00007ff93e96ab38',
  'likelihood': 1.0,
  'mouth': {'height': 12, 'width': 20, 'x': 383, 'y': 218},
  'nose': {'x': 401, 'y': 205},
  'skin_color': '<Magick::Pixel:0x00007ff93e96a7c8'}]
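Note that animeface-2009 reports boxes as x/y/width/height, while PASCAL VOC (used later in this article) wants corner coordinates. A small sketch of that conversion, using a hypothetical helper name of my own (`part_to_bndbox`), with point parts like the nose and chin treated as 1x1 boxes:

```python
def part_to_bndbox(part):
    """Convert an animeface-2009 part dict to VOC-style corners.

    Box parts have 'x', 'y', 'width', 'height'; point parts
    (nose, chin) have only 'x' and 'y' and become 1x1 boxes.
    """
    x, y = part['x'], part['y']
    w = part.get('width', 1)
    h = part.get('height', 1)
    return (x, y, x + w, y + h)

face = {'height': 127, 'width': 127, 'x': 158, 'y': 158}
print(part_to_bndbox(face))  # (158, 158, 285, 285)
```

The face values here are taken from the first detection above, and the corners match the `<bndbox>` values in the XML example later in the article.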

By the way, when the Python file is run as a script, it also outputs an image visualizing the detection results.

result_image.png

Module that outputs the detection result to an XML file in PASCAL VOC format

The function above lets you detect anime faces (and their parts), but it cannot detect every face. With an eye on detection performance and scalability, I am also considering migrating to a DNN-based detector. So, using the detection results of animeface-2009, I implemented a function that creates an object detection dataset for training a DNN. The dataset format is the standard PASCAL VOC.

$ python animeface_result2xml.py [directory of the image to be detected] [directory to output the XML file] [text file path of the created XML file list]

Collect the images you want to run detection on in a directory beforehand (subdirectories are allowed), then specify the path to that directory, the directory to write the XML files to, and a text file listing the created files.

An example of the generated XML file is shown below.

<annotation>
    <folder>folder_name</folder>
    <filename>img_file_path</filename>
    <path>/path/to/dummy</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>600</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>face</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>158</xmin>
            <ymin>158</ymin>
            <xmax>285</xmax>
            <ymax>285</ymax>
        </bndbox>
    </object><object>
        <name>right_eye</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>165</xmin>
            <ymin>202</ymin>
            <xmax>196</xmax>
            <ymax>230</ymax>
        </bndbox>
...
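XML in this shape can be built with the standard library's xml.etree.ElementTree. The following is a minimal sketch of emitting one VOC `<object>` element; the helper name `make_object_elem` is my own, and the repository's actual animeface_result2xml.py may differ in detail.

```python
import xml.etree.ElementTree as ET

def make_object_elem(name, xmin, ymin, xmax, ymax):
    # Build one PASCAL VOC <object> element (hypothetical helper).
    obj = ET.Element('object')
    ET.SubElement(obj, 'name').text = name
    ET.SubElement(obj, 'pose').text = 'Unspecified'
    ET.SubElement(obj, 'truncated').text = '1'
    ET.SubElement(obj, 'difficult').text = '0'
    bndbox = ET.SubElement(obj, 'bndbox')
    for tag, val in zip(('xmin', 'ymin', 'xmax', 'ymax'),
                        (xmin, ymin, xmax, ymax)):
        ET.SubElement(bndbox, tag).text = str(val)
    return obj

elem = make_object_elem('face', 158, 158, 285, 285)
xml_str = ET.tostring(elem, encoding='unicode')
```

Appending such elements under an `<annotation>` root and writing it out with `ET.ElementTree(root).write(path)` gives a file like the example above.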

Checking labels with tools

When editing the XML files, I recommend the annotation tool LabelImg. You can inspect the estimated bounding boxes graphically and correct mistakes on the spot.

labelimg.png

After that, the created XML files are parsed by the deep learning framework's data loader, and an object detection DNN model is trained.
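Reading the boxes back out on the loader side is likewise straightforward with ElementTree. A minimal sketch, here fed from a string rather than a file for illustration:

```python
import xml.etree.ElementTree as ET

# A VOC fragment in the same shape as the generated files
voc_xml = """<annotation>
  <object><name>face</name>
    <bndbox><xmin>158</xmin><ymin>158</ymin><xmax>285</xmax><ymax>285</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(voc_xml)
# Collect (label, [xmin, ymin, xmax, ymax]) pairs for every <object>
boxes = [(obj.findtext('name'),
          [int(obj.find('bndbox').findtext(t))
           for t in ('xmin', 'ymin', 'xmax', 'ymax')])
         for obj in root.iter('object')]
# boxes == [('face', [158, 158, 285, 285])]
```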

In conclusion

I implemented a dirty function for driving `animeface-2009`, a Ruby module that detects anime face parts, from Python. I also implemented a function that converts the detection results into PASCAL VOC format XML files. I would like to use these to broaden the range of machine learning applications for anime.

This article took the position that annotation labels are required to detect anime objects with a DNN, but some people may find that a bit tedious. Recently, kanosawa published an article applying the Domain Adaptation technique to training object detection models. It lets you train an object detection model (Faster R-CNN) without annotating bounding boxes on anime faces, so please check it out if you are interested: "I tried to realize high-precision anime face detection without annotation (Domain Adaptation)".
