[PYTHON] Schreiben Sie ein Restnetzwerk mit TFLearn

Was ist ein Restnetzwerk?

Modell, das den ILSVRC2015 (Global General Image Recognition Contest) gewonnen hat Im Vergleich zu einem System wie VGG Net ist der Rechenaufwand gering, und es scheint einfacher zu sein, durch einfaches Vertiefen der Ebene Genauigkeit zu erzielen. Siehe unten für Details

Deep Residual Learning (ILSVRC2015 winner) [Survey]Deep Residual Learning for Image Recognition keras-resnet

Installation

Benötigt TensorFlow 0.9 oder höher (TF Learn Installation)

$ pip install tflearn

Klicken Sie hier für die neueste Version

$ pip install git+https://github.com/tflearn/tflearn.git

Residual Network

Restblock und Restengpass sind in der Ebene von TFLearn implementiert, verwenden Sie sie also einfach.

13.08.2016: Der Fehler beim Schreiben des Restflaschenhalses wurde behoben. Ein Fehler tritt auf, wenn ~~ downsample = True. Ich werde den Code reparieren, wenn die Ursache gefunden wird. (Betrachtet man die Implementierung von TFLearns Residual Block, wenn downsample = True, Strides = 2 in der ersten Conv2D und dann AvgPool2D am Ende. Ist das richtig?) ~~ *
15.08.2016: Zuerst habe ich mich nur mit der Implementierung von Keras und TFLearn befasst, aber [Deep Learning Framework zum Vergleich mit der Implementierung von Deep Residual Learning (ResNet)](http://ttlg.hateblo.jp/entry/ 2016/06/06/180 806) wurde die Ursache durch Vergleich mehrerer Implementierungen gefunden. * *
Original Feature Map beim Downsampling, *
1. Die Größe wird um Kernel Size = 1, Strides = 2 und die Anzahl der Kanäle durch Füllen mit Null-Padding reduziert. *
1. Implementierung, die die Form durch Falten von KernelSize = 1, Strides = 2 ändert *
Es scheint zu geben *
Die Keras-Implementierung übernimmt die letztere, die TFLearn-Implementierung übernimmt die erstere, aber zumindest in der von mir verwendeten Version von TensorFlow (0.9.0) tritt ein Fehler aufgrund der Einschränkung der Kernelgröße> = Schritte beim durchschnittlichen Pooling auf. Es war. Wenn AveragePooling auf KernelSize = Strides eingestellt ist, werden beide wie unten gezeigt implementiert und umgeschaltet. Die als Implementierung von TensorFlow eingeführte Version war Max Pooling mit KernelSize = Strides. * *
Ich bin jedoch der Meinung, dass die Art und Weise, wie Informationen gelöscht werden, in beiden Implementierungen von 1 und 2 grob ist. Funktioniert dies, wenn Sie lernen, die Größe in jedem Block nach und nach zu verringern? * *
20.08.2016: Die Implementierung von TensorFlow war MaxPooling, und es scheint, dass es näher an der ursprünglichen Absicht liegt, also habe ich zu MaxPooling gewechselt und verschiedene Experimente durchgeführt. Dann auf TensorFlow 0.10 aktualisiert. * *

`cifar10.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_first_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'
# 'padding' => Zero Padding, 'shortcut' => Projection Shortcut
downsample_mode = 'padding'

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def residual_net(inputs, nb_first_filter, reputation_list, residual_mode='basic', activation='relu'):
    net = tflearn.conv_2d(inputs, nb_first_filter, 7, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.max_pool_2d(net, 3, strides=2)
    for i, nb_shortcut in enumerate(reputation_list):
        if i == 0:
            if residual_mode == 'basic':
                net = tflearn.residual_block(net, nb_shortcut, nb_first_filter, activation=activation)
            elif residual_mode == 'deep':
                net = tflearn.residual_bottleneck(net, nb_shortcut, nb_first_filter, nb_first_filter * 4, activation=activation)
            else:
                raise Exception('Residual mode should be basic/deep')
        else:
            nb_filter = nb_first_filter * 2**i
            if residual_mode == 'basic':
                net = tflearn.residual_block(net, 1, nb_filter, activation=activation, downsample=True)
                net = tflearn.residual_block(net, nb_shortcut - 1, nb_filter, activation=activation)
            else:
                net = tflearn.residual_bottleneck(net, 1, nb_filter, nb_filter * 4, activation=activation, downsample=True)
                net = tflearn.residual_bottleneck(net, nb_shortcut - 1, nb_filter, nb_filter * 4, activation=activation)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

net = tflearn.input_data(shape=[None, 32, 32, 3], data_preprocessing=img_prep, data_augmentation=img_aug)
net = residual_net(net, nb_first_filter, reputation_list, residual_mode=residual_mode)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, validation_set=(X_test, y_test),
          snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True,
          run_id='resnet_cifar10')

# For TensorFlow 0.9
def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2, activation='relu', batch_norm=True,
                   bias=True, weights_init='variance_scaling', bias_init='zeros',
                   regularizer='L2', weight_decay=0.0001, trainable=True,
                   restore=True, reuse=False, scope=None, name='ResidualBlock'):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 3, downsample_strides,
                                     'same', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if downsample_mode == 'original':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = resnet
                    in_channels = out_channels

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if downsample_mode == 'padding':
                # Downsampling
                if downsample_strides > 1:
                    identity = tflearn.max_pool_2d(identity, downsample_strides, downsample_strides)

                # Projection to new dimension
                if in_channels != out_channels:
                    ch = (out_channels - in_channels)//2
                    identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])
                    in_channels = out_channels
            elif downsample_mode == 'shortcut':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = tflearn.conv_2d(identity, out_channels, 1, downsample_strides, 'same')
                    in_channels = out_channels
            elif downsample_mode == 'original':
                pass
            else:
                raise Exception('Downsample mode should be padding/shortcut')

            resnet = resnet + identity

    return resnet

def residual_bottleneck(incoming, nb_blocks, bottleneck_size, out_channels,
                        downsample=False, downsample_strides=2, activation='relu',
                        batch_norm=True, bias=True, weights_init='variance_scaling',
                        bias_init='zeros', regularizer='L2', weight_decay=0.0001,
                        trainable=True, restore=True, reuse=False, scope=None,
                        name="ResidualBottleneck"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, bottleneck_size, 1,
                                     downsample_strides, 'valid', 'linear', bias,
                                     weights_init, bias_init, regularizer,
                                     weight_decay, trainable, restore)

            if downsample_mode == 'original':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = resnet
                    in_channels = out_channels

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, bottleneck_size, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 1, 1, 'valid',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if downsample_mode == 'padding':
                # Downsampling
                if downsample_strides > 1:
                    identity = tflearn.max_pool_2d(identity, downsample_strides, downsample_strides)

                # Projection to new dimension
                if in_channels != out_channels:
                    ch = (out_channels - in_channels)//2
                    identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])
                    in_channels = out_channels
            elif downsample_mode == 'shortcut':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = tflearn.conv_2d(identity, out_channels, 1, downsample_strides, 'same')
                    in_channels = out_channels
            elif downsample_mode == 'original':
                pass
            else:
                raise Exception('Downsample mode should be padding/shortcut')

    return resnet

tflearn.residual_block = residual_block
tflearn.residual_bottleneck = residual_bottleneck

Referenz: Residual_network_cifar10.py

Experiment

Es gab einige unklare Punkte, also versuchte ich es zu überprüfen Das Thema ist CIFAR10

Die zu vergleichende Grundform ist der obige Code residual_mode = 'basic' downsample_mode = 'padding'

Ausführungsergebnis Accuracy = 0.8516

Erste Faltungskerngröße

Ich habe empirisch versucht, ob 7 gut ist oder ob es von der Bildgröße abhängt Die Größe des ersten Kernels wurde in die Größe der letzten Feature-Map geändert

    # net = tflearn.conv_2d(inputs, nb_first_filter, 7, strides=2)
    side = inputs.get_shape().as_list()[1]
    first_kernel = side // 2**(len(reputation_list) + 1)
    net = tflearn.conv_2d(inputs, nb_first_filter, first_kernel, strides=2)

Ausführungsergebnis Accuracy = 0.8506

Was zum Teufel ist das 7?

So reduzieren Sie die Feature-Map des ersten Restblocks

Ich habe versucht, warum ich Max Pooling nur hier benutze und warum es gefaltet werden kann

    # net = tflearn.max_pool_2d(net, 3, strides=2)
    net = tflearn.conv_2d(net, nb_first_filter, 3, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

Ausführungsergebnis Accuracy = 0.8634

Es scheint ein wenig effektiv zu sein, aber der Rechenaufwand steigt auch ein wenig

Downsampling implementieren

downsample_mode = 'shortcut'

Im Folgenden ist es schwierig zu optimieren, ob das Downsample verschlungen ist. Da die ursprüngliche Methode jedoch nicht einfach implementiert werden kann (in TensorFlow 0.9), vergleichen Sie Faltung und Max Pooling.

[Survey]Identity Mappings in Deep Residual Networks

Ausführungsergebnis Accuracy = 0.8385

Unterschied zwischen Restblock und Engpass

residual_mode = 'deep'

Ausführungsergebnis Accuracy = 0.8333

Werden sich die Ergebnisse ändern, wenn die Schichten tiefer werden?

Was passiert, wenn die letzte Schicht vollständig verbunden ist?

Ich habe von GlobalAverage Pooling zu zwei vollständig verbundenen Schichten mit 512 Knoten gewechselt

    # net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, 512)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.fully_connected(net, 512)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

Ausführungsergebnis Accuracy = 0.8412

Die Trainingsgenauigkeit war hier höher, daher scheint das globale durchschnittliche Pooling besser zu sein

TensorFlow 0.10.0 (ursprüngliches Restnetz)

Ausführungsergebnis Accuracy = 0.8520

Stochastic Depth

~~ Ich wollte auch Stochastic Depth implementieren, was einen Dropout-ähnlichen Effekt auf Residual Net hat, aber ich hatte das Gefühl, dass ich verschiedene Dinge mit rohem TensorFlow implementieren musste, also werde ich es diesmal nicht tun ~~

20.08.2016: Stochastische Tiefe implementiert

Siehe unten für die stochastische Tiefe

[Survey]Deep Networks with Stochastic Depth stochastic_depth_keras

`residual_network_with_stochastic_depth.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_first_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'
# 'linear' => Linear Decay, 'uniform' => Uniform, 'none' => None
stochastic_depth_mode = 'linear'
stochastic_skip = 0.5

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def addBlock(incoming, bottleneck_size, out_channels, threshold=0.0,
             residual_mode='basic', downsample=False, downsample_strides=2,
             activation='relu', batch_norm=True, bias=True,
             weights_init='variance_scaling', bias_init='zeros',
             regularizer='L2', weight_decay=0.0001, trainable=True,
             restore=True, reuse=False, scope=None):
    if residual_mode == 'basic':
        residual_path = tflearn.residual_block(
                        incoming, 1, out_channels, downsample=downsample,
                        downsample_strides=downsample_strides,
                        activation=activation, batch_norm=batch_norm, bias=bias,
                        weights_init=weights_init, bias_init=bias_init,
                        regularizer=regularizer, weight_decay=weight_decay,
                        trainable=trainable, restore=restore, reuse=reuse,
                        scope=scope)
    else:
        residual_path = tflearn.residual_bottleneck(
                        incoming, 1, bottleneck_size, out_channels,
                        downsample=downsample,
                        downsample_strides=downsample_strides,
                        activation=activation, batch_norm=batch_norm, bias=bias,
                        weights_init=weights_init, bias_init=bias_init,
                        regularizer=regularizer, weight_decay=weight_decay,
                        trainable=trainable, restore=restore, reuse=reuse,
                        scope=scope)
    if downsample:
        in_channels = incoming.get_shape().as_list()[-1]

        with tf.variable_op_scope([incoming], scope, 'Downsample', 
                                  reuse=reuse) as scope:
            name = scope.name
            # Downsampling
            inference = tflearn.avg_pool_2d(incoming, 1, downsample_strides)
            # Projection to new dimension
            if in_channels != out_channels:
                ch = (out_channels - in_channels)//2
                inference = tf.pad(inference, [[0, 0], [0, 0], [0, 0], [ch, ch]])
            # Track activations.
            tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, inference)
        # Add attributes to Tensor to easy access weights
        inference.scope = scope
        # Track output tensor.
        tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, inference)

        skip_path = inference
    else:
        skip_path = incoming

    p = tf.random_uniform([1])[0]

    return tf.cond(p > threshold, lambda: residual_path, lambda: skip_path)

def residual_net(inputs, nb_first_filter, reputation_list, downsample_strides=2,
                 activation='relu', batch_norm=True, bias=True,
                 weights_init='variance_scaling', bias_init='zeros',
                 regularizer='L2', weight_decay=0.0001, trainable=True,
                 restore=True, reuse=False, scope=None, residual_mode='basic',
                 stochastic_depth_mode='linear', stochastic_skip=0.0,
                 is_training=True):
    if not is_training:
        stochastic_depth_mode = 'none'
        stochastic_skip = 0.0
    side = inputs.get_shape().as_list()[1]
    first_kernel = side // 2**(len(reputation_list) + 1)

    net = tflearn.conv_2d(inputs, nb_first_filter, first_kernel, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

    net = tflearn.max_pool_2d(net, 3, strides=2)

    block_total = sum(reputation_list)
    block_current = 0
    for i, nb_block in enumerate(reputation_list):
        nb_filter = nb_first_filter * 2**i

        assert stochastic_depth_mode in ['linear', 'uniform', 'none'], 'Stochastic depth mode should be linear/uniform/none'
        assert residual_mode in ['basic', 'deep'], 'Residual mode should be basic/deep'
        for j in range(nb_block):
            block_current += 1

            if stochastic_depth_mode == 'linear':
                threshold = stochastic_skip * block_current / block_total
            else:
                threshold = stochastic_skip

            bottleneck_size = nb_filter
            if residual_mode == 'basic':
                out_channels = nb_filter
            else:
                out_channels = nb_filter * 4
            if i != 0 and j == 0:
                downsample = True
            else:
                downsample = False
            net = addBlock(net, bottleneck_size, out_channels,
                           downsample=downsample, threshold=threshold,
                           residual_mode=residual_mode,
                           downsample_strides=downsample_strides,
                           activation=activation, batch_norm=batch_norm,
                           bias=bias, weights_init=weights_init,
                           bias_init=bias_init, regularizer=regularizer,
                           weight_decay=weight_decay, trainable=trainable,
                           restore=restore, reuse=reuse, scope=scope)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

inputs = tflearn.input_data(shape=[None, 32, 32, 3],
                            data_preprocessing=img_prep,
                            data_augmentation=img_aug)
net = residual_net(inputs, nb_first_filter, reputation_list,
                   residual_mode=residual_mode,
                   stochastic_depth_mode=stochastic_depth_mode,
                   stochastic_skip=stochastic_skip)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True, run_id='resnet_cifar10')

`residual_network_with_stochastic_depth_test.py`


inputs = tflearn.input_data(shape=[None, 32, 32, 3])
net_test = residual_net(inputs, nb_first_filter, reputation_list,
                        residual_mode=residual_mode,
                        stochastic_depth_mode=stochastic_depth_mode,
                        is_training=False)
model_test = tflearn.DNN(net)
model_test.load('model_resent_cifar10-xxxxx') # set the latest number
print(model_test.evaluate(X_test, y_test))

tf.cond scheint beide unabhängig von der Wahrheit der Bedingung auszuführen, so dass es keinen Effekt zu haben scheint, den Rechenaufwand zu reduzieren Dieses Mal muss ich den Wert von is_training in Training und Validierungstest ändern, aber ich konnte keinen Weg finden, dies in derselben Schritt-Epoche zu realisieren, sodass der Test separat ausgeführt wird Es schien, dass ich lernen konnte, aber das Verhalten war seltsam, weil der GPU-Speicherbereich auf dem Weg seltsam wurde, also hielt ich an und versuchte den Test mit dem Modell auf halbem Weg. Standardmäßig wird TFLearn wahrscheinlich versuchen, alles in dasselbe Diagramm zu schreiben. Wenn Sie also ein Testmodell in derselben Datei erstellen, werden die Ebenen umbenannt und nicht gut geladen. Es sollte kein Problem geben, wenn Sie get_weight / set_weight oder tf.Graph verwenden. Wenn Sie dies jedoch tun, wird die Benutzerfreundlichkeit von TFLearn verringert, sodass die Dateien schnell getrennt und separat ausgeführt werden können. Als ich den obigen Testcode ausführte, bekam ich einen Fehler und konnte ihn nicht auswerten Predict schien gut zu funktionieren, ist es also ein TF Learn-Fehler? Gib diese Zeit vorerst auf

Serpentin

Wenn ich mir den Code anschaue, denke ich, dass es ein Bild ist, nach und nach notwendige Informationen an einer bestimmten Position zu sammeln (unnötige Dinge zu verwerfen) und sie mit 1x1-Pooling zu sammeln. Wenn ich dann das 1x1-Pooling auf die Filterrichtung ausweitete, fragte ich mich, ob notwendige Filter und unnötige Filter aussortiert und die Anzahl der Filter gespeichert werden könnten, also habe ich es versucht.

`residual_network_with_kernel_pooling.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def avg_1x1pool_2d_all(incoming, kernel_size, strides, padding='same',
                       name='Avg1x1Pool2DAll'):
    input_shape = tflearn.utils.get_incoming_shape(incoming)
    assert len(input_shape) == 4, "Incoming Tensor shape must be 4-D"

    if isinstance(kernel_size, int):
        kernel = [1, kernel_size, kernel_size, kernel_size]
    elif isinstance(kernel_size, (tuple, list)):
        if len(kernel_size) == 3:
            kernel = [1, strides[0], strides[1], strides[2]]
        elif len(kernel_size) == 4:
            kernel = [strides[0], strides[1], strides[2], strides[3]]
        else:
            raise Exception("strides length error: " + str(len(strides))
                            + ", only a length of 3 or 4 is supported.")
    if isinstance(strides, int):
        strides = [1, strides, strides, strides]
    elif isinstance(strides, (tuple, list)):
        if len(strides) == 3:
            strides = [1, strides[0], strides[1], strides[2]]
        elif len(strides) == 4:
            strides = [strides[0], strides[1], strides[2], strides[3]]
        else:
            raise Exception("strides length error: " + str(len(strides))
                            + ", only a length of 3 or 4 is supported.")
    else:
        raise Exception("strides format error: " + str(type(strides)))
    padding = tflearn.utils.autoformat_padding(padding)

    with tf.name_scope(name) as scope:
        inference = tf.nn.avg_pool(incoming, kernel, strides, padding)

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, inference)

    # Add attributes to Tensor to easy access weights
    inference.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, inference)

    return inference

def residual_block(incoming, nb_blocks, downsample=False, downsample_strides=2,
                   activation='relu', batch_norm=True, bias=True,
                   weights_init='variance_scaling', bias_init='zeros',
                   regularizer='L2', weight_decay=0.0001, trainable=True,
                   restore=True, reuse=False, scope=None, name="ResidualBlock"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, downsample_strides,
                                     'same', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            # Downsampling
            if downsample_strides > 1:
                identity = avg_1x1pool_2d_all(identity, 1, downsample_strides)

                # Projection to new dimension
                current_channels = identity.get_shape().as_list()[-1]
                ch = (in_channels - current_channels)//2
                identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])

            resnet = resnet + identity

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, resnet)

    # Add attributes to Tensor to easy access weights.
    resnet.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, resnet)

    return resnet

def residual_bottleneck(incoming, nb_blocks, out_channels, downsample=False,
                        downsample_strides=2, activation='relu',
                        batch_norm=True, bias=True,
                        weights_init='variance_scaling', bias_init='zeros',
                        regularizer='L2', weight_decay=0.0001, trainable=True,
                        restore=True, reuse=False, scope=None,
                        name="ResidualBottleneck"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 1, downsample_strides,
                                     'valid', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            resnet = tflearn.conv_2d(resnet, out_channels, 1, 1, 'valid',
                                     activation, bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            # Downsampling
            if downsample_strides > 1:
                identity = avg_1x1pool_2d_all(identity, 1, downsample_strides)

            # Projection to new dimension
            current_channels = identity.get_shape().as_list()[-1]
            ch = (out_channels - current_channels)//2
            identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])

            resnet = resnet + identity

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, resnet)

    # Add attributes to Tensor to easy access weights.
    resnet.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, resnet)

    return resnet

tflearn.residual_block = residual_block
tflearn.residual_bottleneck = residual_bottleneck

def residual_net(inputs, nb_filter, reputation_list, residual_mode='basic', activation='relu'):
    net = tflearn.conv_2d(inputs, nb_filter, 7, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.max_pool_2d(net, 3, strides=2)

    assert residual_mode in ['basic', 'deep'], 'Residual mode should be basic/deep'
    for i, nb_block in enumerate(reputation_list):
        for j in range(nb_block):
            downsample = True if i != 0 and j == 0 else False

            if residual_mode == 'basic':
                net = tflearn.residual_block(net, 1, activation=activation,
                                             downsample=downsample)
            else:
                net = tflearn.residual_bottleneck(net, 1, nb_filter * 4,
                                                  activation=activation,
                                                  downsample=downsample)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)

    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

net = tflearn.input_data(shape=[None, 32, 32, 3], data_preprocessing=img_prep, data_augmentation=img_aug)
net = residual_net(net, nb_filter, reputation_list, residual_mode=residual_mode)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, validation_set=(X_test, y_test),
          snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True,
          run_id='resnet_cifar10')

Ausführungsergebnis

ValueError: Current implementation does not support strides in the batch and depth dimensions.

Ich war wütend auf TensorFlow, weil ich etwas extra gemacht habe ... Ich weiß es nicht, aber kann ich mit Maxout etwas Ähnliches machen? Gib diese Zeit vorerst auf