[Python] I tried using PIFu to generate a 3D model of a person from a single image

What is PIFu?

Image source (left): Sumire Uesaka Official Blog, Nekomori Rally

PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

Roughly speaking, it is **a machine learning model that generates a 3D model of a clothed person from a single image**.

We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way.

In other words, PIFu is an implicit representation that locally aligns the pixels of a 2D image with the global context of the corresponding 3D object. Using it, the authors propose an end-to-end deep learning method for digitizing highly detailed clothed humans, inferring both the 3D surface and the texture from a single image or, optionally, from multiple images. Very complex shapes such as hairstyles and clothing, along with their variations and deformations, can be digitized in a unified way.

Installation and tutorial

Installation is simple.

$ git clone https://github.com/shunsukesaito/PIFu.git
$ cd PIFu
$ pip install -r requirements.txt
$ sh ./scripts/download_trained_model.sh

PIFu comes with a sample dataset, so you can easily check that it works.

$ sh ./scripts/test.sh

This outputs the file results/pifu_demo/result_ryota.obj.

MeshLab is recommended for viewing the 3D model. The model output by PIFu has no texture; it is colored with vertex colors. Few viewers can display vertex-colored models, and MeshLab is one that can and is easy to use.
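If a viewer shows the model in plain gray, it can help to check whether the file carries vertex colors at all. The following is a minimal sketch that assumes the common but non-standard OBJ convention of appending r g b after x y z on each `v` line; I have not confirmed here that PIFu writes exactly this layout, and `count_colored_vertices` is a name made up for illustration.

```python
def count_colored_vertices(obj_text):
    """Return (total_vertices, vertices_with_color) for the text of an OBJ file.

    Assumption: a colored vertex is written as "v x y z r g b"
    (a widely used extension, not part of the original OBJ format).
    """
    total = 0
    colored = 0
    for line in obj_text.splitlines():
        parts = line.split()
        if not parts or parts[0] != "v":
            continue  # skip faces, normals, comments, blank lines
        total += 1
        if len(parts) >= 7:  # "v" plus x, y, z, r, g, b
            colored += 1
    return total, colored


if __name__ == "__main__":
    sample = """\
v 0.0 0.0 0.0 1.0 0.0 0.0
v 1.0 0.0 0.0 0.0 1.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""
    print(count_colored_vertices(sample))  # (3, 2)
```

To inspect a real output, read the file first, for example `count_colored_vertices(open("results/pifu_demo/result_ryota.obj").read())`.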

Generate 3D model with specified image

You need to prepare two things to generate a 3D model with PIFu.

  1. A square image
  2. A mask image

This time I use a photo from the free stock photo site Pakutaso (www.pakutaso.com): "Boy in a yukata and glasses putting his hand in his sleeve (full body)".

Since the original image is taller than it is wide, pad it with bands to make it square. Let's call this kimono.png.
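The padding step can be sketched as follows. This is only an illustration, not how the image in this article was made: it assumes Pillow is installed, and the white band color and the function names are my own choices.

```python
def square_padding(width, height):
    """Return (left, top, side): where to paste a width x height image
    on a side x side canvas so that it ends up centered."""
    side = max(width, height)
    left = (side - width) // 2
    top = (side - height) // 2
    return left, top, side


def pad_to_square(src_path, dst_path, fill=(255, 255, 255)):
    """Center the image at src_path on a square canvas and save it to dst_path.

    The band color `fill` is an assumption; pick whatever suits your image.
    """
    from PIL import Image  # Pillow; imported here so square_padding works without it

    img = Image.open(src_path).convert("RGB")
    left, top, side = square_padding(*img.size)
    canvas = Image.new("RGB", (side, side), fill)
    canvas.paste(img, (left, top))
    canvas.save(dst_path)


if __name__ == "__main__":
    # A 400 x 600 portrait image gets 100-pixel bands on the left and right.
    print(square_padding(400, 600))  # (100, 0, 600)
```

With a hypothetical source file named original.jpg, `pad_to_square("original.jpg", "kimono.png")` would write the square image.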

Next, generate a mask image. Let's call this kimono_mask.png. **The name matters here: the mask file must have _mask appended to the image's name.**

Then create a kimono/ folder and copy the two files into it.

mkdir kimono/
cp kimono.png kimono/
cp kimono_mask.png kimono/
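Because the mask is matched to the image purely by file name, a quick check of the folder before running can save a failed run. This sketch assumes the naming shown above (name.png paired with name_mask.png); the set of accepted image extensions and the helper names are assumptions of mine.

```python
from pathlib import Path


def mask_path_for(image_path):
    """Return the mask path expected next to an image: foo.png -> foo_mask.png."""
    image_path = Path(image_path)
    return image_path.with_name(image_path.stem + "_mask.png")


def images_missing_masks(folder):
    """List the input images in folder that have no matching _mask file."""
    folder = Path(folder)
    missing = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
            continue  # not an image we care about
        if path.stem.endswith("_mask"):
            continue  # this is a mask, not an input image
        if not mask_path_for(path).exists():
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    print(mask_path_for("kimono/kimono.png"))
    if Path("kimono").is_dir():
        print(images_missing_masks("kimono"))
```

An empty list from `images_missing_masks("kimono")` means every image has its mask.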

Save the following as scripts/eval.sh.

scripts/eval.sh


#!/usr/bin/env bash
set -ex

# Evaluation settings
GPU_ID=0
DISPLAY_ID=$((GPU_ID*10+10))
NAME='pifu_demo'

# Network configuration

BATCH_SIZE=1
MLP_DIM='257 1024 512 256 128 1'
MLP_DIM_COLOR='513 1024 512 256 128 3'

# First argument: folder containing the input images and their masks
TEST_FOLDER_PATH=$1
shift

# Second argument: reconstruction resolution
# NOTE: change this value to reconstruct the mesh at a different resolution.
VOL_RES=$1
shift

CHECKPOINTS_NETG_PATH='./checkpoints/net_G'
CHECKPOINTS_NETC_PATH='./checkpoints/net_C'

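# Run the evaluation. ${MLP_DIM} and ${MLP_DIM_COLOR} are left unquoted on
# purpose so that each number is passed as a separate argument.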
CUDA_VISIBLE_DEVICES=${GPU_ID} python ./apps/eval.py \
    --name ${NAME} \
    --batch_size ${BATCH_SIZE} \
    --mlp_dim ${MLP_DIM} \
    --mlp_dim_color ${MLP_DIM_COLOR} \
    --num_stack 4 \
    --num_hourglass 2 \
    --resolution ${VOL_RES} \
    --hg_down 'ave_pool' \
    --norm 'group' \
    --norm_color 'group' \
    --test_folder_path ${TEST_FOLDER_PATH} \
    --load_netG_checkpoint_path ${CHECKPOINTS_NETG_PATH} \
    --load_netC_checkpoint_path ${CHECKPOINTS_NETC_PATH}

Finally, run:

$ sh scripts/eval.sh kimono/ 256

This generates results/pifu_demo/result_kimono.obj.

Escape PIFu

There is a method I call Escape PIFu. It is **a PIFu that I modified to produce high-quality textures**. (I only gave it a name to distinguish it from the original.) As the name suggests, it is a loophole, and some aspects of it are a little questionable. There are reasons for that, which I explain below.

[Figure: left, original image; middle, default PIFu; right, Escape PIFu]

It lives in the 2_phase_generate branch of my PIFu repository.

https://github.com/kotauchisunsun/PIFu/tree/2_phase_generate

In this branch, you generate a model with scripts/eval_two_phase.sh. The usage is:

./scripts/eval_two_phase.sh IMAGE_DIR/ VOXEL_RESOLUTION VOXEL_LOAD_SIZE TEX_LOAD_SIZE

IMAGE_DIR/ is the directory containing the images. For VOXEL_RESOLUTION, I recommend around 512 or 1024. At 1024 it uses about 20 GB of memory, so adjust it to your machine. I recommend keeping VOXEL_LOAD_SIZE fixed at 512. Set TEX_LOAD_SIZE to 1024 or 2048 depending on the texture resolution you want. This gives a model with a high-quality texture.
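To get a rough feel for why the voxel resolution has such a large effect on memory, here is a back-of-the-envelope calculation. It assumes a dense cubic grid holding one 4-byte float per sample, which is my simplification; the roughly 20 GB figure above is what I observed, and the real process keeps more than one such array.

```python
def grid_samples(resolution):
    """Number of sample points in a dense resolution^3 grid."""
    return resolution ** 3


def grid_gib(resolution, bytes_per_sample=4):
    """Size in GiB of one dense grid, assuming bytes_per_sample per point."""
    return grid_samples(resolution) * bytes_per_sample / 2 ** 30


if __name__ == "__main__":
    for res in (256, 512, 1024):
        print(res, grid_samples(res), grid_gib(res))
    # 256 16777216 0.0625
    # 512 134217728 0.5
    # 1024 1073741824 4.0
```

Each doubling of the resolution multiplies the number of samples by 8, so going from 512 to 1024 is a much bigger step than it looks.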

So what exactly is the loophole? It **relies on behavior outside the intended use**. See the pull request for details, but VOXEL_LOAD_SIZE and TEX_LOAD_SIZE are not supposed to be set to anything other than 512. The awkward part is that when I set TEX_LOAD_SIZE to 1024 and ran it, **a beautiful model came out**. I had expected that an invalid TEX_LOAD_SIZE would either crash or produce a shattered texture, so I made the change without much care, and yet the result looked good. So I opened a pull request, but it turned out that this was not supposed to work in the first place. In fact, the texture on the back comes out rather shredded.

[Figure: back-side textures from default PIFu and Escape PIFu, side by side]

The author's suggestion was that, if you want a high-quality texture, you could simply project the input image onto the model, and that may well be the better approach. PIFu does have a texture projection feature, but from my reading of the code it cannot output at high resolution, so I think it would need to be modified.
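For reference, the idea of "simply projecting" can be sketched like this. It is not PIFu's projection code: it assumes an orthographic camera looking down the z axis, vertex coordinates normalized to [-1, 1] with y pointing up, and an image given as nested lists of (r, g, b) tuples. It also ignores occlusion, so vertices on the back would receive the color seen from the front.

```python
def vertex_to_pixel(x, y, width, height):
    """Map normalized coordinates (x, y in [-1, 1], y up) to a pixel (col, row)."""
    col = int((x + 1.0) / 2.0 * (width - 1) + 0.5)
    row = int((1.0 - y) / 2.0 * (height - 1) + 0.5)
    # Clamp so that slightly out-of-range vertices still hit the image.
    col = min(max(col, 0), width - 1)
    row = min(max(row, 0), height - 1)
    return col, row


def project_colors(vertices, image):
    """Look up a color for each (x, y, z) vertex by dropping z.

    image is a list of rows; each row is a list of (r, g, b) tuples.
    """
    height = len(image)
    width = len(image[0])
    colors = []
    for x, y, _z in vertices:
        col, row = vertex_to_pixel(x, y, width, height)
        colors.append(image[row][col])
    return colors


if __name__ == "__main__":
    tiny_image = [
        [(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)],
    ]
    # Top-left and bottom-right corners of the normalized square.
    print(project_colors([(-1.0, 1.0, 0.0), (1.0, -1.0, 0.5)], tiny_image))
```

Because the color comes straight from the input photo, the resolution of the result is limited only by the photo, which is the appeal of this approach.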

Impressions

I'm happy that I got to introduce Sumire. I had known about PIFu since last year and had been wondering when the code would be released, so I was surprised that it came out sooner than expected. It was also relatively easy to get running, so I'm glad I could try it quickly. That said, I wonder whether the results could be made a little better. Sonic Boom, Sonic Boom, Uesaka kawaii.
