[PYTHON] Writing a Residual Network with TFLearn

What is a Residual Network?

The model that won ILSVRC2015 (the ImageNet Large Scale Visual Recognition Challenge). Compared with architectures such as VGG Net, it needs less computation, and accuracy seems easier to obtain simply by making the network deeper. See the following for details.

Deep Residual Learning (ILSVRC2015 winner); [Survey]Deep Residual Learning for Image Recognition; keras-resnet

Installation

Requires TensorFlow 0.9 or later (TF Learn Installation)

$ pip install tflearn

To install the latest development version:

$ pip install git+https://github.com/tflearn/tflearn.git

Residual Network

The residual block and the residual bottleneck are already implemented as TFLearn layers, so you can simply use them.

2016/8/13: Fixed a mistake in how the residual bottleneck was written. ~~An error occurs when downsample=True. I will fix the code once the cause is found. (Looking at TFLearn's residual block implementation, when downsample=True the first Conv2D uses strides=2 and an AvgPool2D is applied at the end. Is that correct?)~~

2016/8/15: At first I only looked at the Keras and TFLearn implementations, but the cause became clear once I compared several implementations using [Deep learning framework to compare from the implementation of Deep Residual Learning (ResNet)](http://ttlg.hateblo.jp/entry/2016/06/06/180806).

When downsampling, there seem to be two ways to handle the original feature map (the shortcut):

1. Reduce the spatial size with kernel size = 1, strides = 2, and match the number of channels by zero padding.
2. Change the shape with a convolution that has kernel size = 1, strides = 2.

The Keras implementation uses the latter and the TFLearn implementation uses the former. However, at least in the TensorFlow version I was using (0.9.0), average pooling requires kernel size >= strides, and that restriction caused the error. With average pooling set to kernel size = strides, I implemented both variants and made them switchable as shown below. The one introduced as the TensorFlow implementation used max pooling with kernel size = strides.

That said, I think both 1 and 2 discard information rather crudely. Would it work to learn to shrink the size little by little in each block?

2016/8/20: The TensorFlow implementation used max pooling, which seems closer to the original intent, so I switched to max pooling and ran various experiments. I then upgraded to TensorFlow 0.10.
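To make option 1 concrete, here is a minimal pure-Python sketch of a zero-padding shortcut on a tiny feature map stored as nested lists (height x width x channels). The helper names `subsample` and `zero_pad_channels` are made up for this illustration and are not part of TFLearn; the actual code further down uses `tflearn.max_pool_2d` and `tf.pad` instead.

```python
def subsample(fmap, stride):
    """Kernel size 1 with the given stride: keep every `stride`-th row and column."""
    return [row[::stride] for row in fmap[::stride]]


def zero_pad_channels(fmap, out_channels):
    """Pad the channel axis symmetrically with zeros up to `out_channels`."""
    in_channels = len(fmap[0][0])
    ch = (out_channels - in_channels) // 2  # zeros added on each side
    return [[[0.0] * ch + list(pixel) + [0.0] * ch for pixel in row]
            for row in fmap]


# A 4x4 feature map with 2 channels; each pixel stores (row index, column index).
fmap = [[[float(r), float(c)] for c in range(4)] for r in range(4)]

# Shortcut path: halve the spatial size, then double the channel count.
shortcut = zero_pad_channels(subsample(fmap, 2), 4)

print(len(shortcut), len(shortcut[0]), len(shortcut[0][0]))
print(shortcut[1][1])
```

No weights are involved, which is why this variant adds no parameters to the shortcut; the symmetric padding only works as written when the channel difference is even.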

`cifar10.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_first_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'
# 'padding' => Zero Padding, 'shortcut' => Projection Shortcut,
# 'original' => shortcut taken from the output of the first convolution
downsample_mode = 'padding'

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def residual_net(inputs, nb_first_filter, reputation_list, residual_mode='basic', activation='relu'):
    net = tflearn.conv_2d(inputs, nb_first_filter, 7, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.max_pool_2d(net, 3, strides=2)
    for i, nb_shortcut in enumerate(reputation_list):
        if i == 0:
            if residual_mode == 'basic':
                net = tflearn.residual_block(net, nb_shortcut, nb_first_filter, activation=activation)
            elif residual_mode == 'deep':
                net = tflearn.residual_bottleneck(net, nb_shortcut, nb_first_filter, nb_first_filter * 4, activation=activation)
            else:
                raise Exception('Residual mode should be basic/deep')
        else:
            nb_filter = nb_first_filter * 2**i
            if residual_mode == 'basic':
                net = tflearn.residual_block(net, 1, nb_filter, activation=activation, downsample=True)
                net = tflearn.residual_block(net, nb_shortcut - 1, nb_filter, activation=activation)
            else:
                net = tflearn.residual_bottleneck(net, 1, nb_filter, nb_filter * 4, activation=activation, downsample=True)
                net = tflearn.residual_bottleneck(net, nb_shortcut - 1, nb_filter, nb_filter * 4, activation=activation)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

net = tflearn.input_data(shape=[None, 32, 32, 3], data_preprocessing=img_prep, data_augmentation=img_aug)
net = residual_net(net, nb_first_filter, reputation_list, residual_mode=residual_mode)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, validation_set=(X_test, y_test),
          snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True,
          run_id='resnet_cifar10')

# For TensorFlow 0.9
def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2, activation='relu', batch_norm=True,
                   bias=True, weights_init='variance_scaling', bias_init='zeros',
                   regularizer='L2', weight_decay=0.0001, trainable=True,
                   restore=True, reuse=False, scope=None, name='ResidualBlock'):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 3, downsample_strides,
                                     'same', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if downsample_mode == 'original':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = resnet
                    in_channels = out_channels

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if downsample_mode == 'padding':
                # Downsampling
                if downsample_strides > 1:
                    identity = tflearn.max_pool_2d(identity, downsample_strides, downsample_strides)

                # Projection to new dimension
                if in_channels != out_channels:
                    ch = (out_channels - in_channels)//2
                    identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])
                    in_channels = out_channels
            elif downsample_mode == 'shortcut':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = tflearn.conv_2d(identity, out_channels, 1, downsample_strides, 'same')
                    in_channels = out_channels
            elif downsample_mode == 'original':
                pass
            else:
                raise Exception('Downsample mode should be padding/shortcut')

            resnet = resnet + identity

    return resnet

def residual_bottleneck(incoming, nb_blocks, bottleneck_size, out_channels,
                        downsample=False, downsample_strides=2, activation='relu',
                        batch_norm=True, bias=True, weights_init='variance_scaling',
                        bias_init='zeros', regularizer='L2', weight_decay=0.0001,
                        trainable=True, restore=True, reuse=False, scope=None,
                        name="ResidualBottleneck"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, bottleneck_size, 1,
                                     downsample_strides, 'valid', 'linear', bias,
                                     weights_init, bias_init, regularizer,
                                     weight_decay, trainable, restore)

            if downsample_mode == 'original':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = resnet
                    in_channels = out_channels

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, bottleneck_size, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, out_channels, 1, 1, 'valid',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            if downsample_mode == 'padding':
                # Downsampling
                if downsample_strides > 1:
                    identity = tflearn.max_pool_2d(identity, downsample_strides, downsample_strides)

                # Projection to new dimension
                if in_channels != out_channels:
                    ch = (out_channels - in_channels)//2
                    identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])
                    in_channels = out_channels
            elif downsample_mode == 'shortcut':
                if downsample_strides > 1 or in_channels != out_channels:
                    identity = tflearn.conv_2d(identity, out_channels, 1, downsample_strides, 'same')
                    in_channels = out_channels
            elif downsample_mode == 'original':
                pass
            else:
                raise Exception('Downsample mode should be padding/shortcut')

            # Add the shortcut to the residual path (as in residual_block above)
            resnet = resnet + identity

    return resnet

tflearn.residual_block = residual_block
tflearn.residual_bottleneck = residual_bottleneck

Reference: residual_network_cifar10.py

Experiments

Some points were unclear to me, so I ran a few experiments to check them. The dataset is CIFAR10.

The baseline for comparison is the code above with `residual_mode = 'basic'` and `downsample_mode = 'padding'`.

Execution result: Accuracy = 0.8516

Kernel size of the first convolution

I checked empirically whether 7 is a good value or whether it depends on the image size. I changed the first kernel size to the size of the last feature map.

    # net = tflearn.conv_2d(inputs, nb_first_filter, 7, strides=2)
    side = inputs.get_shape().as_list()[1]
    first_kernel = side // 2**(len(reputation_list) + 1)
    net = tflearn.conv_2d(inputs, nb_first_filter, first_kernel, strides=2)

Execution result: Accuracy = 0.8506

So where does the 7 come from?
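As a rough check of the formula in the snippet above, the following sketch traces the feature-map side length through the network and compares the last value with `first_kernel`. It assumes 'same' padding everywhere (so each stride-2 layer gives `ceil(side / 2)`) and one downsampling per block group after the first; `feature_map_sides` and `first_kernel_size` are names invented for this illustration.

```python
import math


def feature_map_sides(side, nb_groups, stride=2):
    """Trace the spatial side length, assuming 'same' padding.

    One stride-2 convolution, one stride-2 max pooling, then one stride-2
    downsampling at the start of every block group except the first.
    """
    sides = [side]
    for _ in range(2 + (nb_groups - 1)):
        side = math.ceil(side / stride)
        sides.append(side)
    return sides


def first_kernel_size(side, nb_groups):
    """Same formula as in the snippet above."""
    return side // 2 ** (nb_groups + 1)


cifar_sides = feature_map_sides(32, 2)      # reputation_list = [8, 8]
imagenet_sides = feature_map_sides(224, 4)  # four groups, 224-pixel input

print(cifar_sides, first_kernel_size(32, 2))
print(imagenet_sides, first_kernel_size(224, 4))
```

Under these assumptions a 224-pixel input with four groups gives 7 for both the last feature map and the formula, while the CIFAR10 setup here gives 4. That is only an observation about the arithmetic, not a statement about why 7 was chosen in the original network.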

How to shrink the feature map before the first residual block

I wondered why max pooling is used only here and whether it could be replaced with a convolution, so I tried it.

    # net = tflearn.max_pool_2d(net, 3, strides=2)
    net = tflearn.conv_2d(net, nb_first_filter, 3, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

Execution result: Accuracy = 0.8634

It seems to help a little, but the amount of computation also increases a little.

How to implement downsampling

downsample_mode = 'shortcut'

According to the survey below, optimization becomes harder when the shortcut path used for downsampling is complicated. Since the original method cannot be implemented simply (in TensorFlow 0.9), I compare the convolution shortcut with max pooling here.

[Survey]Identity Mappings in Deep Residual Networks

Execution result: Accuracy = 0.8385

Difference between the residual block and the residual bottleneck

residual_mode = 'deep'

Execution result: Accuracy = 0.8333

Would the results change as the network gets deeper?
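For a sense of scale, here is a small sketch that counts convolution weights in one basic block and one bottleneck block. It assumes 64 filters for the basic block and a 256 -> 64 -> 256 bottleneck (matching `nb_first_filter = 64` and `nb_first_filter * 4` in the script), counts biases, and ignores batch normalization and any shortcut parameters. The function names are made up for the example.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Number of weights in a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def basic_block_params(channels):
    """Two 3x3 convolutions with the same channel count."""
    return 2 * conv_params(3, channels, channels)


def bottleneck_params(c_in, bottleneck, c_out):
    """1x1 reduce, 3x3 at the bottleneck width, 1x1 expand."""
    return (conv_params(1, c_in, bottleneck)
            + conv_params(3, bottleneck, bottleneck)
            + conv_params(1, bottleneck, c_out))


basic = basic_block_params(64)
deep = bottleneck_params(256, 64, 256)

print("basic block:", basic)
print("bottleneck :", deep)
```

With these numbers the two blocks have a similar parameter budget even though the bottleneck outputs four times as many channels, so at this depth a large accuracy gap would not necessarily be expected from capacity alone.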

What if the final layers are fully connected?

I replaced global average pooling with two fully connected layers of 512 nodes each.

    # net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, 512)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.fully_connected(net, 512)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

Execution result: Accuracy = 0.8412

Training accuracy was higher with this variant while test accuracy was lower, so global average pooling seems to be the better choice.
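One way to see why the fully connected variant might fit the training set more easily is to compare the number of weights in the two heads. The sketch below assumes the final feature map is 4 x 4 x 128 (32-pixel input, three stride-2 stages, `residual_mode = 'basic'`) and ignores batch normalization parameters; `dense_params` is a helper invented for this example.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


side, channels, nb_class = 4, 128, 10  # assumed final feature map: 4 x 4 x 128

# Global average pooling leaves `channels` values, followed by the softmax layer.
gap_head = dense_params(channels, nb_class)

# Flatten, two 512-node layers, then the softmax layer.
fc_head = (dense_params(side * side * channels, 512)
           + dense_params(512, 512)
           + dense_params(512, nb_class))

print("global average pooling head:", gap_head)
print("fully connected head       :", fc_head)
```

Under these assumptions the fully connected head has roughly a thousand times more weights, which is at least consistent with higher training accuracy and lower test accuracy.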

TensorFlow 0.10.0 (original Residual Net)

Execution result: Accuracy = 0.8520

Stochastic Depth

~~I also wanted to implement stochastic depth, which has a dropout-like effect on Residual Net, but it looked like I would have to implement various things in raw TensorFlow, so I will not do it this time~~

2016/8/20: Implemented stochastic depth

See the following for details on stochastic depth

[Survey]Deep Networks with Stochastic Depth; stochastic_depth_keras
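The script that follows sets each block's skip threshold to `stochastic_skip * block_current / block_total` in linear mode and runs the residual path when a uniform random number exceeds it. The sketch below tabulates those thresholds for 16 blocks and estimates how many blocks are active per training step. It is plain Python with made-up helper names, not the TensorFlow code itself.

```python
import random


def skip_thresholds(block_total, stochastic_skip, mode='linear'):
    """Per-block skip thresholds, following the rule used in the script."""
    if mode == 'linear':
        return [stochastic_skip * b / block_total
                for b in range(1, block_total + 1)]
    # The script uses the same constant for every block in the other modes.
    return [stochastic_skip] * block_total


def sample_active_blocks(thresholds, rng):
    """One training step: a block runs its residual path when p > threshold."""
    return sum(1 for t in thresholds if rng.random() > t)


thresholds = skip_thresholds(16, 0.5)          # reputation_list = [8, 8]
expected_active = sum(1.0 - t for t in thresholds)

rng = random.Random(0)
steps = 2000
average_active = sum(sample_active_blocks(thresholds, rng)
                     for _ in range(steps)) / steps

print(thresholds[0], thresholds[-1])
print(expected_active, average_active)
```

With `stochastic_skip = 0.5`, the first block is skipped about 3% of the time and the last about 50% of the time, so on average a little under 12 of the 16 blocks contribute in each step.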

`residual_network_with_stochastic_depth.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_first_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'
# 'linear' => Linear Decay, 'uniform' => Uniform, 'none' => None
stochastic_depth_mode = 'linear'
stochastic_skip = 0.5

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def addBlock(incoming, bottleneck_size, out_channels, threshold=0.0,
             residual_mode='basic', downsample=False, downsample_strides=2,
             activation='relu', batch_norm=True, bias=True,
             weights_init='variance_scaling', bias_init='zeros',
             regularizer='L2', weight_decay=0.0001, trainable=True,
             restore=True, reuse=False, scope=None):
    if residual_mode == 'basic':
        residual_path = tflearn.residual_block(
                        incoming, 1, out_channels, downsample=downsample,
                        downsample_strides=downsample_strides,
                        activation=activation, batch_norm=batch_norm, bias=bias,
                        weights_init=weights_init, bias_init=bias_init,
                        regularizer=regularizer, weight_decay=weight_decay,
                        trainable=trainable, restore=restore, reuse=reuse,
                        scope=scope)
    else:
        residual_path = tflearn.residual_bottleneck(
                        incoming, 1, bottleneck_size, out_channels,
                        downsample=downsample,
                        downsample_strides=downsample_strides,
                        activation=activation, batch_norm=batch_norm, bias=bias,
                        weights_init=weights_init, bias_init=bias_init,
                        regularizer=regularizer, weight_decay=weight_decay,
                        trainable=trainable, restore=restore, reuse=reuse,
                        scope=scope)
    if downsample:
        in_channels = incoming.get_shape().as_list()[-1]

        with tf.variable_op_scope([incoming], scope, 'Downsample', 
                                  reuse=reuse) as scope:
            name = scope.name
            # Downsampling
            inference = tflearn.avg_pool_2d(incoming, 1, downsample_strides)
            # Projection to new dimension
            if in_channels != out_channels:
                ch = (out_channels - in_channels)//2
                inference = tf.pad(inference, [[0, 0], [0, 0], [0, 0], [ch, ch]])
            # Track activations.
            tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, inference)
        # Add attributes to Tensor to easy access weights
        inference.scope = scope
        # Track output tensor.
        tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, inference)

        skip_path = inference
    else:
        skip_path = incoming

    p = tf.random_uniform([1])[0]

    return tf.cond(p > threshold, lambda: residual_path, lambda: skip_path)

def residual_net(inputs, nb_first_filter, reputation_list, downsample_strides=2,
                 activation='relu', batch_norm=True, bias=True,
                 weights_init='variance_scaling', bias_init='zeros',
                 regularizer='L2', weight_decay=0.0001, trainable=True,
                 restore=True, reuse=False, scope=None, residual_mode='basic',
                 stochastic_depth_mode='linear', stochastic_skip=0.0,
                 is_training=True):
    if not is_training:
        stochastic_depth_mode = 'none'
        stochastic_skip = 0.0
    side = inputs.get_shape().as_list()[1]
    first_kernel = side // 2**(len(reputation_list) + 1)

    net = tflearn.conv_2d(inputs, nb_first_filter, first_kernel, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)

    net = tflearn.max_pool_2d(net, 3, strides=2)

    block_total = sum(reputation_list)
    block_current = 0
    for i, nb_block in enumerate(reputation_list):
        nb_filter = nb_first_filter * 2**i

        assert stochastic_depth_mode in ['linear', 'uniform', 'none'], 'Stochastic depth mode should be linear/uniform/none'
        assert residual_mode in ['basic', 'deep'], 'Residual mode should be basic/deep'
        for j in range(nb_block):
            block_current += 1

            if stochastic_depth_mode == 'linear':
                threshold = stochastic_skip * block_current / block_total
            else:
                threshold = stochastic_skip

            bottleneck_size = nb_filter
            if residual_mode == 'basic':
                out_channels = nb_filter
            else:
                out_channels = nb_filter * 4
            if i != 0 and j == 0:
                downsample = True
            else:
                downsample = False
            net = addBlock(net, bottleneck_size, out_channels,
                           downsample=downsample, threshold=threshold,
                           residual_mode=residual_mode,
                           downsample_strides=downsample_strides,
                           activation=activation, batch_norm=batch_norm,
                           bias=bias, weights_init=weights_init,
                           bias_init=bias_init, regularizer=regularizer,
                           weight_decay=weight_decay, trainable=trainable,
                           restore=restore, reuse=reuse, scope=scope)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)
    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

inputs = tflearn.input_data(shape=[None, 32, 32, 3],
                            data_preprocessing=img_prep,
                            data_augmentation=img_aug)
net = residual_net(inputs, nb_first_filter, reputation_list,
                   residual_mode=residual_mode,
                   stochastic_depth_mode=stochastic_depth_mode,
                   stochastic_skip=stochastic_skip)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True, run_id='resnet_cifar10')

`residual_network_with_stochastic_depth_test.py`


inputs = tflearn.input_data(shape=[None, 32, 32, 3])
net_test = residual_net(inputs, nb_first_filter, reputation_list,
                        residual_mode=residual_mode,
                        stochastic_depth_mode=stochastic_depth_mode,
                        is_training=False)
model_test = tflearn.DNN(net_test)
model_test.load('model_resnet_cifar10-xxxxx') # set the latest number
print(model_test.evaluate(X_test, y_test))

tf.cond appears to evaluate both branches regardless of whether the condition is true, so there seems to be no saving in computation.

This time the value of is_training has to differ between training and validation/test, but I could not find a way to do that within the same step or epoch, so the test is run separately. Training seemed to work, but the behavior became odd partway through because GPU memory usage went strange, so I stopped it and ran the test with the partially trained model.

By default TFLearn apparently writes everything into the same graph, so building a test model in the same file renames the layers and the weights do not load correctly. Using get_weights / set_weights or tf.Graph should avoid the problem, but that gives up much of TFLearn's ease of use, so it is quicker to split the files and run them separately.

When I ran the test code above, I got an error and could not evaluate the model. predict seems to work correctly, so is this a TFLearn bug? I am giving up on this for now.

Aside

Looking at the code, my mental picture is that the network gradually gathers the information it needs at specific positions (discarding what is unnecessary) and collects it with 1x1 pooling. I then wondered whether extending the 1x1 pooling to the filter direction would sort the filters into necessary and unnecessary ones and save on the number of filters, so I tried it.

`residual_network_with_kernel_pooling.py`


# -*- coding: utf-8 -*-

from __future__ import (
    absolute_import,
    division,
    print_function
)

from six.moves import range

import tensorflow as tf
import tflearn
from tflearn.datasets import cifar10

nb_filter = 64
reputation_list = [8, 8]
# 'basic' => Residual Block, 'deep' => Residual Bottleneck
residual_mode = 'basic'

nb_class = 10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = tflearn.data_utils.to_categorical(y_train, 10)
y_test = tflearn.data_utils.to_categorical(y_test, 10)

# Real-time data preprocessing
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center(per_channel=True)

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_crop([32, 32], padding=4)

def avg_1x1pool_2d_all(incoming, kernel_size, strides, padding='same',
                       name='Avg1x1Pool2DAll'):
    input_shape = tflearn.utils.get_incoming_shape(incoming)
    assert len(input_shape) == 4, "Incoming Tensor shape must be 4-D"

    # Build the 4-D kernel from kernel_size (not from strides).
    if isinstance(kernel_size, int):
        kernel = [1, kernel_size, kernel_size, kernel_size]
    elif isinstance(kernel_size, (tuple, list)):
        if len(kernel_size) == 3:
            kernel = [1, kernel_size[0], kernel_size[1], kernel_size[2]]
        elif len(kernel_size) == 4:
            kernel = [kernel_size[0], kernel_size[1], kernel_size[2], kernel_size[3]]
        else:
            raise Exception("kernel_size length error: " + str(len(kernel_size))
                            + ", only a length of 3 or 4 is supported.")
    else:
        raise Exception("kernel_size format error: " + str(type(kernel_size)))
    if isinstance(strides, int):
        strides = [1, strides, strides, strides]
    elif isinstance(strides, (tuple, list)):
        if len(strides) == 3:
            strides = [1, strides[0], strides[1], strides[2]]
        elif len(strides) == 4:
            strides = [strides[0], strides[1], strides[2], strides[3]]
        else:
            raise Exception("strides length error: " + str(len(strides))
                            + ", only a length of 3 or 4 is supported.")
    else:
        raise Exception("strides format error: " + str(type(strides)))
    padding = tflearn.utils.autoformat_padding(padding)

    with tf.name_scope(name) as scope:
        inference = tf.nn.avg_pool(incoming, kernel, strides, padding)

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, inference)

    # Add attributes to Tensor to easy access weights
    inference.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, inference)

    return inference

def residual_block(incoming, nb_blocks, downsample=False, downsample_strides=2,
                   activation='relu', batch_norm=True, bias=True,
                   weights_init='variance_scaling', bias_init='zeros',
                   regularizer='L2', weight_decay=0.0001, trainable=True,
                   restore=True, reuse=False, scope=None, name="ResidualBlock"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, downsample_strides,
                                     'same', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            # Downsampling
            if downsample_strides > 1:
                identity = avg_1x1pool_2d_all(identity, 1, downsample_strides)

                # Projection to new dimension
                current_channels = identity.get_shape().as_list()[-1]
                ch = (in_channels - current_channels)//2
                identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])

            resnet = resnet + identity

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, resnet)

    # Add attributes to Tensor to easy access weights.
    resnet.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, resnet)

    return resnet

def residual_bottleneck(incoming, nb_blocks, out_channels, downsample=False,
                        downsample_strides=2, activation='relu',
                        batch_norm=True, bias=True,
                        weights_init='variance_scaling', bias_init='zeros',
                        regularizer='L2', weight_decay=0.0001, trainable=True,
                        restore=True, reuse=False, scope=None,
                        name="ResidualBottleneck"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name

        for i in range(nb_blocks):

            identity = resnet

            if not downsample:
                downsample_strides = 1

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 1, downsample_strides,
                                     'valid', 'linear', bias, weights_init,
                                     bias_init, regularizer, weight_decay,
                                     trainable, restore)

            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)

            resnet = tflearn.conv_2d(resnet, in_channels, 3, 1, 'same',
                                     'linear', bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            resnet = tflearn.conv_2d(resnet, out_channels, 1, 1, 'valid',
                                     activation, bias, weights_init, bias_init,
                                     regularizer, weight_decay, trainable,
                                     restore)

            # Downsampling
            if downsample_strides > 1:
                identity = avg_1x1pool_2d_all(identity, 1, downsample_strides)

            # Projection to new dimension
            current_channels = identity.get_shape().as_list()[-1]
            ch = (out_channels - current_channels)//2
            identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])

            resnet = resnet + identity

        # Track activations.
        tf.add_to_collection(tf.GraphKeys.ACTIVATIONS, resnet)

    # Add attributes to Tensor to easy access weights.
    resnet.scope = scope

    # Track output tensor.
    tf.add_to_collection(tf.GraphKeys.LAYER_TENSOR + '/' + name, resnet)

    return resnet

tflearn.residual_block = residual_block
tflearn.residual_bottleneck = residual_bottleneck

def residual_net(inputs, nb_filter, reputation_list, residual_mode='basic', activation='relu'):
    net = tflearn.conv_2d(inputs, nb_filter, 7, strides=2)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.max_pool_2d(net, 3, strides=2)

    assert residual_mode in ['basic', 'deep'], 'Residual mode should be basic/deep'
    for i, nb_block in enumerate(reputation_list):
        for j in range(nb_block):
            downsample = True if i != 0 and j == 0 else False

            if residual_mode == 'basic':
                net = tflearn.residual_block(net, 1, activation=activation,
                                             downsample=downsample)
            else:
                net = tflearn.residual_bottleneck(net, 1, nb_filter * 4,
                                                  activation=activation,
                                                  downsample=downsample)
    net = tflearn.batch_normalization(net)
    net = tflearn.activation(net, activation)
    net = tflearn.global_avg_pool(net)

    net = tflearn.fully_connected(net, nb_class, activation='softmax')
    return net

net = tflearn.input_data(shape=[None, 32, 32, 3], data_preprocessing=img_prep, data_augmentation=img_aug)
net = residual_net(net, nb_filter, reputation_list, residual_mode=residual_mode)
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
                    max_checkpoints=10, tensorboard_verbose=0)

model.fit(X_train, y_train, n_epoch=200, validation_set=(X_test, y_test),
          snapshot_epoch=False, snapshot_step=500,
          show_metric=True, batch_size=128, shuffle=True,
          run_id='resnet_cifar10')

Execution result

ValueError: Current implementation does not support strides in the batch and depth dimensions.

TensorFlow scolded me for trying something it does not support... I am not sure, but could something similar be done with Maxout? I am giving up on this for now.
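For reference, the idea of pooling along the filter (channel) axis can be sketched in plain Python on a single pixel's channel vector. `pool_channels` is a hypothetical helper, not a TensorFlow or TFLearn function. Taking the maximum over each group is the kind of reduction Maxout applies, while the mean corresponds to the average pooling attempted above.

```python
def pool_channels(pixel, group, reduce=max):
    """Reduce consecutive groups of `group` channels to one value each."""
    assert len(pixel) % group == 0, "channel count must be divisible by group"
    return [reduce(pixel[i:i + group]) for i in range(0, len(pixel), group)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


pixel = [1.0, 5.0, 2.0, 0.0]  # four channels at one spatial position

max_pooled = pool_channels(pixel, 2)               # Maxout-style reduction
avg_pooled = pool_channels(pixel, 2, reduce=mean)  # average over channel pairs

print(max_pooled, avg_pooled)
```

In a real network the same reduction would have to be applied at every spatial position, for example by reshaping the channel axis into groups and reducing over the new axis; whether that actually helps accuracy is untested here.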