[PYTHON] I tried to extract line art from an image with Deep Learning

Before I knew it, Golden Week (GW) was over in the blink of an eye. This year I spent GW walking around Tokyo (and a bit of Saitama) with the whole family. We went up the Skytree, and I had not expected it to be so crowded. If I go again, going at night seems more interesting. There seem to be screenings there as well.

6/11 update

I changed the network structure and several aspects of the dataset to improve the line art that can be extracted, so I have updated this article, including its content, to match.

I want line art

As I wrote in the previous article, I tried coloring line art with Deep Learning, and along the way I came to feel that the dataset had some problems. The line art itself was generated with reference to http://qiita.com/khsk/items/6cf4bae0166e4b12b942, but that approach seems to have a few issues.

Even so, I kept using it because the line art can be produced easily (all you need is OpenCV) and quickly (again thanks to OpenCV).
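For readers unfamiliar with this kind of OpenCV-style extraction, here is a minimal sketch of one common recipe: dilate the grayscale image, take the absolute difference from the original, and invert. It is written in plain NumPy so that it runs without OpenCV; with OpenCV the same steps would typically be `cv2.dilate`, `cv2.absdiff`, and `cv2.bitwise_not`. That the linked article uses exactly these steps is an assumption, and the helper names `dilate3x3` and `extract_lines` are made up for illustration.

```python
import numpy as np


def dilate3x3(gray):
    """Grayscale dilation: maximum over each pixel's 3x3 neighbourhood."""
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    # Collect the 9 shifted views of the image and take the per-pixel maximum.
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.max(shifted, axis=0)


def extract_lines(gray):
    """Return dark lines on a white background (hypothetical helper)."""
    gray = gray.astype(np.int16)            # avoid uint8 wrap-around on subtraction
    diff = np.abs(dilate3x3(gray) - gray)   # large where a dark stroke borders a bright area
    return (255 - diff).astype(np.uint8)    # invert so that strokes become dark


# Toy input: a white 5x5 image with one dark vertical stroke.
img = np.full((5, 5), 255, dtype=np.uint8)
img[:, 2] = 0
lines = extract_lines(img)
```

Because the difference is taken per pixel, soft shading and color gradients also leave traces in the output, which fits the grayscale-like look of the OpenCV result described later.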

Research on turning rough sketches into line art

Existing research includes the following. http://hi.cs.waseda.ac.jp:8081/

This is the paper. http://hi.cs.waseda.ac.jp/~esimo/publications/SimoSerraSIGGRAPH2016.pdf

It is a technique that uses an AutoEncoder to convert rough sketches into line art. This time, I built my network with reference to it.

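import tensorflow as tf

# Encoder, Decoder and op.BatchNormalization appear to be the author's own
# helpers; their definitions are not included in this article.


class AutoEncoder(object):
    """Define autoencoder"""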

    def __init__(self):
        self.conv1 = Encoder(3, 48, 5, 5, strides=[1, 2, 2, 1], name='encoder1')
        self.conv1_f1 = Encoder(48, 128, 3, 3, name='encoder1_flat1')
        self.conv1_f2 = Encoder(128, 128, 3, 3, name='encoder1_flat2')
        self.conv2 = Encoder(128, 256, 5, 5, strides=[1, 2, 2, 1], name='encoder2')
        self.conv2_f1 = Encoder(256, 256, 3, 3, name='encoder2_flat1')
        self.conv2_f2 = Encoder(256, 256, 3, 3, name='encoder2_flat2')
        self.conv3 = Encoder(256, 256, 5, 5, strides=[1, 2, 2, 1], name='encoder3')
        self.conv3_f1 = Encoder(256, 512, 3, 3, name='encoder3_flat1')
        self.conv3_f2 = Encoder(512, 1024, 3, 3, name='encoder3_flat2')
        self.conv3_f3 = Encoder(1024, 512, 3, 3, name='encoder3_flat3')
        self.conv3_f4 = Encoder(512, 256, 3, 3, name='encoder3_flat4')

        self.bnc1 = op.BatchNormalization(name='bnc1')
        self.bnc1_f1 = op.BatchNormalization(name='bnc1_flat1')
        self.bnc1_f2 = op.BatchNormalization(name='bnc1_flat2')
        self.bnc2 = op.BatchNormalization(name='bnc2')
        self.bnc2_f1 = op.BatchNormalization(name='bnc2_flat1')
        self.bnc2_f2 = op.BatchNormalization(name='bnc2_flat2')
        self.bnc3 = op.BatchNormalization(name='bnc3')
        self.bnc3_f1 = op.BatchNormalization(name='bnc3_flat1')
        self.bnc3_f2 = op.BatchNormalization(name='bnc3_flat2')
        self.bnc3_f3 = op.BatchNormalization(name='bnc3_flat3')
        self.bnc3_f4 = op.BatchNormalization(name='bnc3_flat4')

        self.deconv1 = Decoder(256, 256, 4, 4, strides=[1, 2, 2, 1], name='decoder1')
        self.deconv1_f1 = Encoder(256, 128, 3, 3, name='decoder1_flat1')
        self.deconv1_f2 = Encoder(128, 128, 3, 3, name='decoder1_flat2')
        self.deconv2 = Decoder(128, 128, 4, 4, strides=[1, 2, 2, 1], name='decoder2')
        self.deconv2_f1 = Encoder(128, 128, 3, 3, name='decoder2_flat1')
        self.deconv2_f2 = Encoder(128, 48, 3, 3, name='decoder2_flat2')
        self.deconv3 = Decoder(48, 48, 4, 4, strides=[1, 2, 2, 1], name='decoder3')
        self.deconv3_f1 = Decoder(48, 24, 3, 3, name='decoder3_flat1')
        self.deconv3_f2 = Decoder(24, 1, 3, 3, name='decoder3_flat2')

        self.bnd1 = op.BatchNormalization(name='bnd1')
        self.bnd1_f1 = op.BatchNormalization(name='bnd1_flat1')
        self.bnd1_f2 = op.BatchNormalization(name='bnd1_flat2')
        self.bnd2 = op.BatchNormalization(name='bnd2')
        self.bnd2_f1 = op.BatchNormalization(name='bnd2_flat1')
        self.bnd2_f2 = op.BatchNormalization(name='bnd2_flat2')
        self.bnd3 = op.BatchNormalization(name='bnd3')
        self.bnd3_f1 = op.BatchNormalization(name='bnd3_flat1')


def autoencoder(images, height, width):
    """make autoencoder network"""

    AE = AutoEncoder()

    def div(v, d):
        return max(1, v // d)

    relu = tf.nn.relu
    net = relu(AE.bnc1(AE.conv1(images, [height, width])))
    net = relu(AE.bnc1_f1(AE.conv1_f1(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnc1_f2(AE.conv1_f2(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnc2(AE.conv2(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnc2_f1(AE.conv2_f1(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnc2_f2(AE.conv2_f2(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnc3(AE.conv3(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnc3_f1(AE.conv3_f1(net, [div(height, 8), div(width, 8)])))
    net = relu(AE.bnc3_f2(AE.conv3_f2(net, [div(height, 8), div(width, 8)])))
    net = relu(AE.bnc3_f3(AE.conv3_f3(net, [div(height, 8), div(width, 8)])))
    net = relu(AE.bnc3_f4(AE.conv3_f4(net, [div(height, 8), div(width, 8)])))
    net = relu(AE.bnd1(AE.deconv1(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnd1_f1(AE.deconv1_f1(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnd1_f2(AE.deconv1_f2(net, [div(height, 4), div(width, 4)])))
    net = relu(AE.bnd2(AE.deconv2(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnd2_f1(AE.deconv2_f1(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnd2_f2(AE.deconv2_f2(net, [div(height, 2), div(width, 2)])))
    net = relu(AE.bnd3(AE.deconv3(net, [height, width])))
    net = relu(AE.bnd3_f1(AE.deconv3_f1(net, [height, width])))

    net = tf.nn.sigmoid(AE.deconv3_f2(net, [height, width]))

    return net
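The spatial sizes handed to each layer above follow a simple pattern: the three stride-2 convolutions bring the resolution down to 1/2, 1/4 and 1/8, and the three stride-2 deconvolutions bring it back up. The sketch below only reproduces that arithmetic. `trace_sizes` and `pad_to_multiple` are hypothetical helpers, and padding the input to a multiple of 8 is an assumption about one way to keep the sizes consistent, not something the article does.

```python
def div(v, d):
    """Same helper as in autoencoder(): floor division, never below 1."""
    return max(1, v // d)


def trace_sizes(height, width):
    """Sizes after each stride-2 encoder layer and each stride-2 decoder layer."""
    encoder = [(div(height, d), div(width, d)) for d in (2, 4, 8)]
    decoder = [(div(height, d), div(width, d)) for d in (4, 2, 1)]
    return encoder, decoder


def pad_to_multiple(size, multiple=8):
    """Smallest value >= size that is divisible by `multiple`."""
    return ((size + multiple - 1) // multiple) * multiple


enc, dec = trace_sizes(256, 256)
odd_enc, _ = trace_sizes(100, 100)
```

For a 256 x 256 input the bottleneck is 32 x 32. For a 100 x 100 input the floor division gives 50, 25 and 12, and doubling 12 yields 24 rather than 25, so sizes that are not multiples of 8 may not line up on the way back; `pad_to_multiple(100)` gives 104.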

This is what the AutoEncoder looks like. I built it figuring it might work out somehow. In the paper, the key technique appears to be something called a loss map, but because I don't know how to work with histograms in TensorFlow, I have not implemented that part.
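In the paper, the loss map acts as a per-pixel weight on the squared error between the output and the target line art. The following is a minimal NumPy sketch of such a weighted loss, assuming the weight map is computed elsewhere; how the paper derives it from local histograms is not reproduced here, and the article does not state which loss was actually used for training. The function name `weighted_mse` and the toy weights are made up for illustration.

```python
import numpy as np


def weighted_mse(pred, target, weight=None):
    """Mean of weight * (pred - target)^2 over all pixels.

    `weight` is a hypothetical precomputed loss map with the same shape as
    the images; None means uniform weights, i.e. ordinary MSE.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if weight is None:
        weight = np.ones_like(target)
    return float(np.mean(weight * (pred - target) ** 2))


pred = np.array([[0.0, 1.0], [1.0, 1.0]])
target = np.array([[1.0, 1.0], [1.0, 0.0]])

plain = weighted_mse(pred, target)             # both wrong pixels count equally
emphasis = np.array([[3.0, 1.0], [1.0, 0.0]])  # toy weights, not taken from the paper
weighted = weighted_mse(pred, target, emphasis)
```

With uniform weights this is ordinary mean squared error, so a loss map could be added later without changing the rest of the training code.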

I tried it

I trained the network above for about 250,000 iterations with the following parameters.

I gave up on very large images because they would not fit even in 2 GiB of memory. A better approach would probably be to shrink the image first, extract the line art, and then pass the result through an encoder that increases the resolution.
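The "shrink first" idea can be sketched as a small size calculation done before the image is resized and passed to the network. This is only an illustration: `fit_within` and the default `max_side=1024` are hypothetical, and the right limit depends on the GPU memory available.

```python
def fit_within(width, height, max_side=1024):
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is kept (up to integer rounding); images that already
    fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Integer arithmetic keeps the result deterministic.
    new_w = max(1, width * max_side // longest)
    new_h = max(1, height * max_side // longest)
    return new_w, new_h


large = fit_within(4000, 3000)
small = fit_within(800, 600)
```

The resized image would then go through the network, and restoring the original resolution afterwards would be a separate step.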

Image of reasonable size

Mikon! I painted it. Amezuku @ Looking for a job

All the original images are borrowed from Pixiv's "I tried coloring it" category. This image has no original line art, so there is nothing to compare the result against in the first place. Even on a GPU, an image of this size took about 10 seconds, so I honestly don't want to think about running it on a CPU.

I think it comes out quite cleanly. The images in the dataset average about 1000 × 1000, which is fairly large, so even large images can be handled. Unfortunately, there is a distinctive jaggedness in the lower part. It appears in some outputs and not in others, so I can't say anything definite about it. output1.png

Thumbnail size

The size is 256 × 256. For comparison, I also include the version extracted with OpenCV. Please ignore the objection that 256 × 256 is hardly a thumbnail.

The original image. At first glance it looks as if the lines should come out easily, but... small_origin.jpeg

The OpenCV version. Because every subtle color gradation comes through, it feels closer to a grayscale image than to line art. small_opencv.jpeg

The version from this network. Some parts are dubious (the hands in particular are hopeless), but it ignores the shading quite well and renders the hair simply, so, if I may praise my own work, I think it looks pretty good. Very fine details such as frills do get crushed, as you would expect, but that seems solvable bit by bit with more training. small_cnn.jpeg

About the dataset

The dataset used for training was basically collected from Pixiv's "I tried coloring it" category. The most important point in building the dataset was that **the line art and the colored image have the same aspect ratio**. If the two differ, the network cannot learn at all, so I collected the images while checking each pair.

I also sometimes found that, even when the aspect ratio matched, **the line art and the colored image were slightly misaligned**. Training does not progress with such pairs either, so I had to weed them out carefully.
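The aspect-ratio part of this check is easy to automate. The following is a small sketch, assuming each pair is given as two `(width, height)` tuples; `same_aspect`, `filter_pairs` and the 1% tolerance are hypothetical, and the article describes checking the pairs by hand. It does not detect the slight misalignment mentioned above, which needs a comparison of the actual pixels.

```python
def same_aspect(size_a, size_b, tolerance=0.01):
    """True if two (width, height) sizes have the same aspect ratio
    within a relative tolerance."""
    (wa, ha), (wb, hb) = size_a, size_b
    ratio_a = wa / ha
    ratio_b = wb / hb
    return abs(ratio_a - ratio_b) <= tolerance * max(ratio_a, ratio_b)


def filter_pairs(pairs):
    """Keep only (line_art_size, colored_size) pairs whose aspect ratios match."""
    return [(a, b) for a, b in pairs if same_aspect(a, b)]


pairs = [
    ((1000, 1000), (500, 500)),   # same ratio, different scale: keep
    ((1000, 1000), (800, 600)),   # square vs 4:3: drop
    ((1200, 900), (800, 600)),    # both 4:3: keep
]
kept = filter_pairs(pairs)
```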

Weaknesses

It depends on how the image was colored, but the lines are simpler than with OpenCV, and I feel that not much detail is lost. However, because of the network structure and the nature of the task itself, there are also drawbacks, some of which simply cannot be helped.

Basically, rather than using the output as it is, I expect it to serve as a base for further processing.

Summary

Generative networks such as AutoEncoder are interesting. Next, I would like to try feeding in parameters to change how the line art is produced.

Researching and implementing this is hard, but even amateurs can do deep learning, so I recommend investing a little and trying it yourself (on your own machine or in the cloud).
