How to process camera images in Teams and Zoom: anime-style conversion edition

This article is also posted here: https://cloud.flect.co.jp/entry/2020/04/02/114756#f-459f23dc

In the previous two posts, I introduced how to process camera images in Teams and Zoom, with demos that detect smiles and emotions (facial expressions) and overlay smiley marks on the video.

https://qiita.com/wok/items/0a7c82c6f97f756bde65
https://qiita.com/wok/items/06bdc6b0a0a3f93eab91

This time I extended the setup a little further and experimented with converting the camera image into an anime-style image before displaying it. Up front: real-time anime-style conversion on a CPU is hard to use in practice because the lag is simply too large. ~~(I haven't checked whether a GPU does better.)~~ → I tried it on a GPU and got about 15 fps. It was astonishingly fast.

frame_test_4screen.gif

Let me walk through it right away.

Anime style image conversion

It was covered by the news media about half a year ago, so many of you may already know it: a method for converting photos into anime-style images is published on the following page.

https://github.com/taki0112/UGATIT

UGATIT is built on the familiar GAN framework of a Generator and a Discriminator, but unlike plain image-to-image style transfer it adds an original normalization function called AdaLIN, which reportedly lets the model handle changes in shape as well as texture.

In our work, we propose an Adaptive Layer-Instance Normalization (AdaLIN) function to adaptively select a proper ratio between IN and LN. Through the AdaLIN, our attention-guided model can flexibly control the amount of change in shape and texture.
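As I understand it from the paper, AdaLIN normalizes a feature map with both Instance Normalization (IN) and Layer Normalization (LN) statistics and blends the two with a learnable ratio $\rho$ constrained to $[0, 1]$ (treat this as a summary; the exact definition is in the paper):

```math
\mathrm{AdaLIN}(a, \gamma, \beta) = \gamma \cdot \bigl(\rho \cdot \hat{a}_I + (1 - \rho) \cdot \hat{a}_L\bigr) + \beta,
\qquad
\hat{a}_I = \frac{a - \mu_I}{\sqrt{\sigma_I^2 + \epsilon}},\quad
\hat{a}_L = \frac{a - \mu_L}{\sqrt{\sigma_L^2 + \epsilon}}
```

Here $\mu_I, \sigma_I$ are per-channel (instance) statistics, $\mu_L, \sigma_L$ are per-layer statistics, and $\gamma$, $\beta$ are learned parameters (see the paper [^1] for the full definition).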

For details, see the paper [^1] and the commentary articles [^2]. Converting an image with the pretrained model published on the page above gives results like this:

image.png

It does not seem to convert well when the subject is far from the camera. It also seems to struggle with middle-aged men: the training dataset, which is published on the same page, appears to be biased toward young women, and that is probably the cause. (I'd like to think I'm not a middle-aged man yet, but is it already hopeless?)

Implementation overview

As mentioned above, the subject (the person's face) needs to be close to the camera (i.e., the face should fill most of the frame). So this time the procedure is: locate the face with the face detection function introduced in the previous posts, crop that region, and convert the crop with UGATIT, as sketched below.

image.png
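A minimal sketch of this per-frame flow, assuming OpenCV's Haar cascade for the face detection and a hypothetical convert_with_ugatit() wrapper around the trained model (the actual webcamhooker.py differs in the details):

```python
import cv2

# Hypothetical wrapper around the UGATIT model (not the real API of webcamhooker.py):
# takes a 256x256 BGR face crop and returns the anime-style conversion.
def convert_with_ugatit(face_bgr):
    raise NotImplementedError

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def process_frame(frame):
    """Detect the largest face, crop it, convert it with UGATIT, and paste it back."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return frame  # no face detected: pass the frame through unchanged
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest detection
    crop = cv2.resize(frame[y:y + h, x:x + w], (256, 256))  # UGATIT works on 256x256 inputs
    anime = convert_with_ugatit(crop)
    frame[y:y + h, x:x + w] = cv2.resize(anime, (w, h))
    return frame
```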

See the repository mentioned below for implementation details. [^3]

[^3]: As of April 2, 2020, the source code is messy because it grew by repeated additions. I will refactor it at some point.

Environment

Please set up v4l2loopback and the face detection models by referring to the previous articles.

Also, as before, clone the script from the repository below and install the required modules.

$ git clone https://github.com/dannadori/WebCamHooker.git
$ cd WebCamHooker/
$ pip3 install -r requirements.txt

Placing the UGATIT pretrained model

UGATIT officially provides both TensorFlow and PyTorch source code, but a pretrained model is published only for the TensorFlow version. Download it and place it as described below. Note that extracting the archive with the standard zip tools on Windows or Linux apparently fails; an issue on the repository reports that 7-Zip works on Windows, and extraction seems to work fine on Mac. I don't know of a workaround for Linux... [^4]

[^4]: All as of April 2, 2020.

For reference, here are the hash values (md5sum) of a model that works correctly. (This is probably the most common stumbling block.)

$ find . -type f |xargs -I{} md5sum {}
43a47eb34ad056427457b1f8452e3f79  ./UGATIT.model-1000000.data-00000-of-00001
388e18fe2d6cedab8b1dbaefdddab4da  ./UGATIT.model-1000000.meta
a08353525ecf78c4b6b33b0b2ab2b75c  ./UGATIT.model-1000000.index
f8c38782b22e3c4c61d4937316cd3493  ./checkpoint

Place these files in UGATIT/checkpoint inside the folder you cloned from git above. It should look like this:

$ ls UGATIT/checkpoint/ -1
UGATIT.model-1000000.data-00000-of-00001
UGATIT.model-1000000.index
UGATIT.model-1000000.meta
checkpoint
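If the checkpoint extracted correctly, it can be restored with a standard TensorFlow 1.x Saver. A minimal sketch, assuming the UGATIT graph from the official repository has already been built in the current session (this is not the exact loading code used by webcamhooker.py):

```python
import tensorflow as tf

# Minimal sketch, TensorFlow 1.x style; assumes the UGATIT network graph
# (and therefore its variables) has already been constructed in this session.
with tf.Session() as sess:
    saver = tf.train.Saver()
    ckpt = tf.train.latest_checkpoint("UGATIT/checkpoint")  # reads the 'checkpoint' file
    saver.restore(sess, ckpt)
    # sess can now run the generator on a 256x256 input image
```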

Let's have a video conference!

Run it as follows. One option has been added compared to last time:

- --input_video_num: the number of the actual webcam device. For /dev/video0, pass the trailing 0.
- --output_video_dev: the device file of the virtual webcam device.
- --anime_mode: set this to True.

To stop it, press Ctrl+C.

$ python3 webcamhooker.py --input_video_num 0 --output_video_dev /dev/video2 --anime_mode True

When you run the command above, ffmpeg starts and video is delivered to the virtual camera device.
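Under the hood, each processed frame just needs to be handed to the v4l2loopback device. A minimal sketch of doing that by piping raw frames into ffmpeg (the flags, frame size, and frame rate here are assumptions for illustration; webcamhooker.py may do this differently):

```python
import subprocess

WIDTH, HEIGHT, FPS = 640, 480, 15  # assumed capture settings

# Feed raw BGR frames on stdin and let ffmpeg write them to the virtual camera.
ffmpeg = subprocess.Popen(
    ["ffmpeg",
     "-f", "rawvideo", "-pix_fmt", "bgr24",
     "-s", f"{WIDTH}x{HEIGHT}", "-r", str(FPS), "-i", "-",
     "-f", "v4l2", "-pix_fmt", "yuv420p", "/dev/video2"],
    stdin=subprocess.PIPE)

# For each processed frame (a HEIGHT x WIDTH x 3 uint8 numpy array):
# ffmpeg.stdin.write(frame.tobytes())
```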

As before, when you join a video conference, a device named something like dummy ~~ appears in the list of video devices, so select it. Below is an example with Teams. The unconverted source image is shown picture-in-picture in the upper right corner of the screen. The anime-style conversion comes through better than I expected. However, it is very heavy: on a somewhat old PC (Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 32 GB RAM) it runs at about 1 frame per second, which may be too slow for normal use. For now, think of it as a one-shot gag. I would like to try it on a GPU eventually.

out3.gif

Finally

With work from home dragging on, casual communication can be difficult, but I think bringing this kind of playfulness into video conferences is a good way to liven up conversation. There is plenty more that could be done, so please give it a try.
