What is neural style transfer?

Neural style transfer is a machine learning technique that applies the style (texture) of one image to a target image in order to generate a new image. It is the technique behind apps that re-render photos of cities and people in the style of Van Gogh.

Here I want to use it to take a realistic photo of a wild crocodile, one that looks ready to eat people, and convert it into the style of the "crocodile that dies in 100 days", turning it into a gently smiling crocodile. (Still, is the guy in this picture going to be okay? If you don't watch out, you'll die!)

In short, the content of the original image (its macrostructure, like the skeleton of the image) is preserved, and the cartoon-like style (texture) of the 100-day crocodile is applied on top.

In deep learning, we always work toward a goal by defining a loss function that states what we want to achieve and then minimizing it. Very roughly, the loss function I want to minimize in this example looks like this:

Loss function = (content of the real crocodile image - content of the generated image) + (style of the 100-day crocodile image - style of the generated image)

The source code comes from the following book by the creator of Keras. It is almost identical to the code in the book, so if you are interested in the details, please buy it: [Deep Learning with Python and Keras](https://www.amazon.co.jp/Python%E3%81%A8Keras%E3%81%AB%E3%82%88%E3%82%8B%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0-Francois-Chollet/dp/4839964262)

Environment

Use Google Colab. It needs no setup and gives you free access to a GPU, so image processing runs without trouble. Save the images you want to use to Google Drive and load them from Google Colab.

Neural style transfer with Keras in Google Colab

First, save the target image and the style image you want to process to Google Drive. After saving them, open a notebook in Google Colab. Run the code below to access Google Drive, and allow access on the Google Drive page. You will then receive an authorization code; enter it into the form that appears after the code has run.

from google.colab import drive
drive.mount('/content/drive')

Next, define the image paths. The images being processed are resized so that they all have the same size.

import keras
keras.__version__
from keras.preprocessing.image import load_img, img_to_array

# Path to the target image. Set this to the location where you saved it.
target_image_path = '/content/drive/My Drive/Colab Notebooks/wani/wani2.png'
# Path to the style image. Set this to the location where you saved it.
style_reference_image_path = '/content/drive/My Drive/Colab Notebooks/wani/100wani.png'

# Size of the generated image
width, height = load_img(target_image_path).size
img_height = 400
img_width = int(width * img_height / height)
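
The size calculation above fixes the height at 400 pixels and scales the width so the aspect ratio is kept. As a small worked sketch of that arithmetic (the helper name `scaled_size` and the 1200x800 example image are assumptions for illustration, not part of the original code):

```python
def scaled_size(width, height, target_height=400):
    """Return (new_width, new_height) with the original aspect ratio kept."""
    # Same formula as img_width above: scale the width by target_height / height
    new_width = int(width * target_height / height)
    return new_width, target_height

# Hypothetical 1200x800 source image
print(scaled_size(1200, 800))  # (600, 400)
```

A smaller target height makes every later step faster, because the loss and its gradient are computed over every pixel of the generated image.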

Next, create helper functions that load, preprocess, and postprocess the images passed to and from VGG19.

import numpy as np
from keras.applications import vgg19

def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_height, img_width))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg19.preprocess_input(img)
    return img

def deprocess_image(x):
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x
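
To see why `deprocess_image` adds those three constants and reverses the channel order, here is a NumPy-only sketch of what `vgg19.preprocess_input` does by default as far as I understand it: convert RGB to BGR and subtract the per-channel ImageNet means. The function names `preprocess_sketch` and `deprocess_sketch` are hypothetical, and unlike the original `deprocess_image` this sketch rounds before casting so that the round trip is exact.

```python
import numpy as np

# Per-channel means in BGR order; the same constants appear in deprocess_image
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68])

def preprocess_sketch(rgb):
    """Approximate the VGG19 preprocessing: RGB -> BGR, then subtract the means."""
    bgr = rgb[..., ::-1].astype('float64')
    return bgr - IMAGENET_MEAN_BGR

def deprocess_sketch(x):
    """Invert preprocess_sketch: add the means, BGR -> RGB, clip to 0..255."""
    rgb = (x + IMAGENET_MEAN_BGR)[..., ::-1]
    # Rounding is an addition of this sketch; it absorbs tiny float errors
    return np.clip(np.rint(rgb), 0, 255).astype('uint8')

# Random toy image instead of a real photo
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
restored = deprocess_sketch(preprocess_sketch(image))
print(np.array_equal(image, restored))
```

The clipping matters in the real pipeline: the optimizer is free to push pixel values outside the 0..255 range, so they have to be brought back before the image is saved.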

Next, define the VGG19 network.

from keras import backend as K

target_image = K.constant(preprocess_image(target_image_path))
style_reference_image = K.constant(preprocess_image(style_reference_image_path))

# Placeholder for the generated image
combination_image = K.placeholder((1, img_height, img_width, 3))

# Combine the three images into a single batch
input_tensor = K.concatenate([target_image,
                              style_reference_image,
                              combination_image], axis=0)

# Build VGG19 with the batch of three images as input
# The model is loaded with pretrained ImageNet weights
model = vgg19.VGG19(input_tensor=input_tensor,
                    weights='imagenet',
                    include_top=False)
print('Model loaded.')
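
Because the three images are stacked along the first axis, every layer output of the model carries the same three-way batch: index 0 is the target image, index 1 the style image, and index 2 the generated image. A toy NumPy sketch of that layout (tiny constant arrays stand in for the real images; the sizes are assumptions):

```python
import numpy as np

height, width = 4, 6  # toy size, not the real 400-pixel-high images

target = np.zeros((1, height, width, 3))           # stands in for target_image
style = np.ones((1, height, width, 3))             # stands in for style_reference_image
combination = np.full((1, height, width, 3), 2.0)  # stands in for combination_image

batch = np.concatenate([target, style, combination], axis=0)
print(batch.shape)  # (3, 4, 6, 3)

# Indexing the first axis recovers each image again
print(batch[0].max(), batch[1].max(), batch[2].max())  # 0.0 1.0 2.0
```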

Define the loss functions.

# Content loss function
def content_loss(base, combination):
    return K.sum(K.square(combination - base))

# Style loss function
def gram_matrix(x):
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram

def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_height * img_width
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))

# Total variation loss function
def total_variation_loss(x):
    a = K.square(
        x[:, :img_height - 1, :img_width - 1, :] - x[:, 1:, :img_width - 1, :])
    b = K.square(
        x[:, :img_height - 1, :img_width - 1, :] - x[:, :img_height - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
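
To make the style and total variation losses easier to inspect, the following is a NumPy re-expression of the Keras backend code above, run on tiny arrays. The `_np` function names and the toy inputs are assumptions for illustration; the formulas mirror `gram_matrix`, `style_loss`, and `total_variation_loss`.

```python
import numpy as np

def gram_matrix_np(x):
    """x has shape (height, width, channels), like one layer activation."""
    # Channels first, then flatten each channel map into one row
    features = x.transpose(2, 0, 1).reshape(x.shape[2], -1)
    # Entry (i, j) is the inner product of channel i and channel j
    return features @ features.T

def style_loss_np(style, combination, img_height, img_width):
    channels = 3
    size = img_height * img_width
    diff = gram_matrix_np(style) - gram_matrix_np(combination)
    return np.sum(diff ** 2) / (4.0 * channels ** 2 * size ** 2)

def total_variation_loss_np(x):
    """x has shape (1, height, width, channels)."""
    a = np.square(x[:, :-1, :-1, :] - x[:, 1:, :-1, :])   # vertical neighbors
    b = np.square(x[:, :-1, :-1, :] - x[:, :-1, 1:, :])   # horizontal neighbors
    return np.sum(np.power(a + b, 1.25))

# Toy 2x2 activation with 2 channels: channel 0 = 0,2,4,6 and channel 1 = 1,3,5,7
activation = np.arange(8, dtype='float64').reshape(2, 2, 2)
print(gram_matrix_np(activation))  # [[56. 68.] [68. 84.]]

# Identical style and combination features give zero style loss
print(style_loss_np(activation, activation, 2, 2))  # 0.0

# A perfectly flat image has no variation between neighboring pixels
print(total_variation_loss_np(np.ones((1, 3, 3, 3))))  # 0.0
```

The Gram matrix discards where a feature occurs and keeps only which channels fire together, which is why it captures texture rather than layout.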

Define the final loss function to be minimized (a weighted combination of these three losses).

outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
content_layer = 'block5_conv2'
style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']

total_variation_weight = 1e-4
style_weight = 1.
content_weight = 0.025

loss = K.variable(0.)
layer_features = outputs_dict[content_layer]
target_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss += content_weight * content_loss(target_image_features,
                                      combination_features)
for layer_name in style_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features, combination_features)
    loss += (style_weight / len(style_layers)) * sl
loss += total_variation_weight * total_variation_loss(combination_image)
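
The weighting in the code above can be traced with plain numbers. This sketch reuses the three weights from the code and feeds in made-up loss values (the function name `combined_loss` and all input values are assumptions, chosen only to show the relative scaling):

```python
def combined_loss(content, style_per_layer, total_variation,
                  content_weight=0.025, style_weight=1.0,
                  total_variation_weight=1e-4):
    """Combine the three losses the same way as the loop above."""
    loss = content_weight * content
    # The style weight is split evenly across the style layers
    for layer_loss in style_per_layer:
        loss += (style_weight / len(style_per_layer)) * layer_loss
    loss += total_variation_weight * total_variation
    return loss

# Made-up values: one content loss, five style-layer losses, one variation loss
value = combined_loss(content=100.0, style_per_layer=[10.0] * 5,
                      total_variation=1000.0)
print(round(value, 6))  # 12.6
```

Raising `content_weight` keeps the result closer to the target photo, while lowering it lets the style image dominate.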

Define the gradient descent process.

grads = K.gradients(loss, combination_image)[0]
fetch_loss_and_grads = K.function([combination_image], [loss, grads])

class Evaluator(object):

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        x = x.reshape((1, img_height, img_width, 3))
        outs = fetch_loss_and_grads([x])
        loss_value = outs[0]
        grad_values = outs[1].flatten().astype('float64')
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
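
The `Evaluator` class exists because `fmin_l_bfgs_b` takes the loss and the gradient as two separate callables, while the Keras function computes both in one pass. The sketch below shows the same idea on a toy quadratic so it can run without Keras. It is a variant, not the class above: it caches by comparing `x` instead of relying on call order, and `CachingEvaluator`, `toy_loss_and_grad`, and `goal` are hypothetical names.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

class CachingEvaluator:
    """Compute loss and gradient together, hand them out separately."""

    def __init__(self, loss_and_grad):
        self.loss_and_grad = loss_and_grad
        self.cached_x = None
        self.loss_value = None
        self.grad_values = None

    def _evaluate(self, x):
        # Recompute only when the optimizer asks about a new point
        if self.cached_x is None or not np.array_equal(x, self.cached_x):
            self.loss_value, self.grad_values = self.loss_and_grad(x)
            self.cached_x = np.copy(x)

    def loss(self, x):
        self._evaluate(x)
        return self.loss_value

    def grads(self, x):
        self._evaluate(x)
        return np.copy(self.grad_values)

goal = np.array([1.0, -2.0, 3.0])

def toy_loss_and_grad(x):
    """Squared distance to goal and its gradient, computed in one pass."""
    diff = x - goal
    return float(np.sum(diff ** 2)), 2.0 * diff

toy_evaluator = CachingEvaluator(toy_loss_and_grad)
x_opt, min_val, info = fmin_l_bfgs_b(toy_evaluator.loss, np.zeros(3),
                                     fprime=toy_evaluator.grads, maxfun=20)
print(x_opt, min_val)
```

In the style transfer case the expensive part is the forward and backward pass through VGG19, so computing loss and gradient once per point roughly halves the work compared with two independent calls.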

Finally, it's time to run it!

from scipy.optimize import fmin_l_bfgs_b
#from scipy.misc import imsave
import imageio
import time

result_prefix = 'style_transfer_result'
iterations = 30

# Run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the neural style loss.
# This is our initial state: the target image.
# Note that `scipy.optimize.fmin_l_bfgs_b` can only process flat vectors.
x = preprocess_image(target_image_path)
x = x.flatten()
for i in range(iterations):
    print('Start of iteration', i)
    start_time = time.time()
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x,
                                     fprime=evaluator.grads, maxfun=20)
    print('Current loss value:', min_val)
    # Save current generated image
    img = x.copy().reshape((img_height, img_width, 3))
    img = deprocess_image(img)
    fname = result_prefix + '_at_iteration_%d.png' % i
    #imsave(fname, img)
    imageio.imwrite(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))

Output image

from matplotlib import pyplot as plt

# Content image
plt.imshow(load_img(target_image_path, target_size=(img_height, img_width)))
plt.figure()

# Style image
plt.imshow(load_img(style_reference_image_path, target_size=(img_height, img_width)))
plt.figure()

# Generated image
plt.imshow(img)
plt.show()

Output result

The output result is... different from what I had imagined! There is no pop, gentle crocodile feeling at all! Well, deep learning runs deep, but I think you have at least seen that deep learning is easy to try out with Google Colab and Keras. With this code you can experiment with all kinds of image transformations yourself, so please give it a try.

Keras really is amazing. Again, the code used here comes from [Deep Learning with Python and Keras](https://www.amazon.co.jp/Python%E3%81%A8Keras%E3%81%AB%E3%82%88%E3%82%8B%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0-Francois-Chollet/dp/4839964262).

Convert a real wild crocodile image into a smiling 100-day crocodile style with Keras neural style transfer
