([Addendum: the difference in variable-name control between tf.contrib.keras and tf.layers](http://qiita.com/TomokIshii/items/178938b6db1edc16b94e) has been added near the end of this article.)
As announced at the TensorFlow Dev Summit and elsewhere, the integration of TensorFlow and Keras is in progress. The following is quoted from the Keras Blog post "Introducing Keras 2":
Keras is best understood as an API specification, not as a specific codebase. In fact, going forward there will be two separate implementations of the Keras spec: the internal TensorFlow one, available as tf.keras, written in pure TensorFlow and deeply compatible with all TensorFlow functionality, and the external multi-backend one supporting both Theano and TensorFlow (and likely even more backends in the future).
In short, Keras will be split into two implementations: one integrated into TensorFlow (tf.keras) and one maintained as an independent multi-backend package (Theano, TensorFlow, etc.). I used to switch between the Theano and TensorFlow backends as appropriate, but recently I have settled on the TensorFlow backend, so of the two Keras packages I am more interested in the TensorFlow-integrated version.
The following roadmap was presented at the recent Dev Summit (excerpt from the YouTube video).
As shown there, the integration is planned to proceed in stages, first as "tf.contrib.keras" and then as "tf.keras". Since TensorFlow 1.1 (1.1.0-rc1) has just been released, I installed it right away to check the contents.
(The programming environment is Python 3.5.2, TensorFlow 1.1.0-rc1, Keras 2.0.2.)
Keras 2 has already been released, and there is an introductory article about it on Qiita (which I read). I also tried it in a TensorFlow 1.0 + Keras 2.0 environment. The API has not changed much from the final Keras 1.x release (1.2.2?), but the keywords (arguments, options, etc.) of some functions have changed in detail. The sample code for MNIST classification is as follows.
# Keras 2.0 + TensorFlow 1.0 backend
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import Adagrad
from keras.utils import np_utils
(Omitted)
model = Sequential()
model.add(Dense(512, input_dim=784))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
(Omitted)
I have omitted a lot, but note where the modules are imported from (from keras.xxx).
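As a concrete illustration of the keyword changes mentioned earlier, here is a rough before/after sketch based on my reading of the Keras 2 release notes (not exhaustive; the comments show the Keras 1.x spellings):

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Keras 1.x spelling                        Keras 2.x spelling
# Dense(output_dim=512, input_dim=784)  ->  Dense(units=512, input_dim=784)
# Dropout(p=0.2)                        ->  Dropout(rate=0.2)
# model.fit(X, Y, nb_epoch=20)          ->  model.fit(X, Y, epochs=20)
model = Sequential()
model.add(Dense(units=512, input_dim=784, activation='relu'))
model.add(Dropout(rate=0.2))
model.add(Dense(units=10, activation='softmax'))
```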
Now let's try the newly released **tf.contrib.keras**. I modified the above code (TF 1.0 + Keras 2.0) as follows.
import numpy as np
from tensorflow.contrib.keras.python.keras.datasets import mnist
from tensorflow.contrib.keras.python.keras.models import Sequential
from tensorflow.contrib.keras.python.keras.layers import Dense
from tensorflow.contrib.keras.python.keras.layers import Dropout, Activation
from tensorflow.contrib.keras.python.keras.optimizers import Adagrad
from tensorflow.contrib.keras.python.keras.utils import np_utils
from tensorflow.contrib.keras.python.keras import backend as K
def load_data(nb_classes=10):
    # the data, shuffled and split between train and test sets
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    X_train = X_train.reshape(60000, 784)
    X_test = X_test.reshape(10000, 784)
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')
    X_train /= 255
    X_test /= 255
    print(X_train.shape[0], 'train samples')
    print(X_test.shape[0], 'test samples')

    # convert class vectors to binary class matrices
    y_train = np_utils.to_categorical(y_train, nb_classes)
    y_test = np_utils.to_categorical(y_test, nb_classes)

    return X_train, y_train, X_test, y_test

def mk_model():
    model = Sequential()
    model.add(Dense(512, input_dim=784))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10))
    model.add(Activation('softmax'))

    return model
In the first half of the code, the import paths naturally change (and the directories are deep). The model-building part is unchanged in content (apart from wrapping it in the function "mk_model"). tf.contrib.keras looks fully compatible with the Keras 2.0 API.
Also, there were no major changes in the latter half of the code (below).
if __name__ == '__main__':
    np.random.seed(1337)    # for reproducibility

    batch_size = 128
    nb_epoch = 20

    X_train, y_train, X_test, y_test = load_data()

    model = mk_model()
    model.summary()         # check model configuration

    model.compile(loss='categorical_crossentropy',
                  optimizer=Adagrad(),
                  metrics=['accuracy'])

    model.fit(X_train, y_train,
              batch_size=batch_size, epochs=nb_epoch,
              verbose=1,
              validation_data=(X_test, y_test))

    score = model.evaluate(X_test, y_test, verbose=0)
    print('\nTest score : {:>.4f}'.format(score[0]))
    print('Test accuracy: {:>.4f}'.format(score[1]))

    K.clear_session()
    # This statement fixes the following error at program exit:
    #   Exception ignored in: <bound method BaseSession.__del__ of
    #   <tensorflow.python.client.session.Session object at 0x7fb79a3fa550>>
    #   ...
    #   AttributeError: 'NoneType' object has no attribute 'TF_NewStatus'
    #
    # TensorFlow issue: Exception ignored in BaseSession.__del__ #3388
It works without functional problems, so going forward it appears the Keras API can be used just by installing TensorFlow, without installing the Keras package separately (not that I have erased the Keras package from my disk yet). (Previously, with Keras 1.x.x, I had to pay attention to keeping the Keras and TensorFlow versions in sync.)
However, an error occurred when the program terminated, so I had to investigate a little. A problem can apparently occur in the cleanup of compute resources at exit. (Reproducibility is unclear, but in my environment the error appeared most of the time.) As noted in the code above, the countermeasure is to insert the statement K.clear_session().
(I do not fully understand the cause of this error/bug. If you are interested, please see the related issue listed in the references.)
So far this is just Keras itself, so next let's step away from the Keras Model framework and use Keras as a library of Layer classes. This usage has been supported for a long time and is nothing new, but I will try it with the latest library (tf.contrib.keras).
(Based on the Keras Blog post "Keras as a simplified interface to TensorFlow: tutorial".)
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.keras.python import keras
from tensorflow.contrib.keras.python.keras import backend as K
def load_data():
    dirn = '../MNIST_data'
    mnist = input_data.read_data_sets(dirn, one_hot=True)
    print(mnist.train.num_examples, 'train samples')
    print(mnist.test.num_examples, 'test samples')
    print(mnist.validation.num_examples, 'validation samples (not used)')

    return mnist

def mlp_model(input):
    # MLP network model
    with tf.variable_scope('mlp_model'):
        x = keras.layers.Dense(units=512, activation='relu')(input)
        x = keras.layers.Dropout(0.2)(x)
        x = keras.layers.Dense(units=512, activation='relu')(x)
        x = keras.layers.Dropout(0.2)(x)
        y_pred = keras.layers.Dense(units=10, activation='softmax')(x)

    return y_pred
if __name__ == '__main__':
    mnist = load_data()

    # tensorflow placeholders
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder(tf.float32, [None, 10])

    # define TF graph
    y_pred = mlp_model(x)
    loss = tf.losses.softmax_cross_entropy(y_, y_pred)
    train_step = tf.train.AdagradOptimizer(0.05).minimize(loss)
    correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        print('Training...')
        for i in range(10001):
            batch_xs, batch_ys = mnist.train.next_batch(100)
            train_fd = {x: batch_xs, y_: batch_ys, K.learning_phase(): 1}
            train_step.run(feed_dict=train_fd)
            if i % 1000 == 0:
                batch_xv, batch_yv = mnist.test.next_batch(200)
                val_accuracy = accuracy.eval(
                    {x: batch_xv, y_: batch_yv, K.learning_phase(): 0})
                print('  step, accuracy = %6d: %6.3f' % (i, val_accuracy))

        test_fd = {x: mnist.test.images, y_: mnist.test.labels,
                   K.learning_phase(): 0}
        test_accuracy = accuracy.eval(feed_dict=test_fd)
        print('Test accuracy:', test_accuracy)
With this, I was able to write the MNIST MLP (multi-layer perceptron) classification code quite "cleanly". ("Clean" compared with plain TensorFlow code that does not use a high-level API.)
One advantage of using Keras as a Layer class library is that it sets sensible default values. This point was emphasized in the TensorFlow Dev Summit presentation as "an accessible high-level API with good defaults". In the code above, no initializers are specified for the fully connected layer Dense(), yet the 'glorot_uniform' (Xavier uniform) initializer is applied to the weights and the 'zeros' initializer to the biases. (This parameter initialization scheme seems to be used in much recent neural network sample code.) (The feed_dict entry for K.learning_phase() controls the behavior of Dropout; for details, see the Keras Blog post "Keras as a simplified interface ...".)
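To make the defaults concrete, here is a small sketch that writes them out explicitly; as far as I can tell from the Keras documentation, the two Dense definitions below should behave the same (the explicit initializer names are the documented Keras 2 defaults):

```python
import tensorflow as tf
from tensorflow.contrib.keras.python import keras

inputs = tf.placeholder(tf.float32, [None, 784])

# short form, relying on the defaults
x1 = keras.layers.Dense(units=512, activation='relu')(inputs)

# the same layer with the (documented) defaults spelled out
x2 = keras.layers.Dense(units=512, activation='relu',
                        kernel_initializer='glorot_uniform',  # Xavier uniform for the weights
                        bias_initializer='zeros')(inputs)     # zeros for the biases
```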
For now, I am curious about how the Tensor variables are handled, so let's check their names.
# vars = tf.global_variables()
# print('variables:')
# for v in vars:
# print(v)
variables:
<tf.Variable 'mlp_model/dense_1/kernel:0' shape=(784, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_1/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_2/kernel:0' shape=(512, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_2/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_3/kernel:0' shape=(512, 10) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_3/bias:0' shape=(10,) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_1/kernel/Adagrad:0' shape=(784, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_1/bias/Adagrad:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_2/kernel/Adagrad:0' shape=(512, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_2/bias/Adagrad:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_3/kernel/Adagrad:0' shape=(512, 10) dtype=float32_ref>
<tf.Variable 'mlp_model/dense_3/bias/Adagrad:0' shape=(10,) dtype=float32_ref>
Of the above 12 tf.Variables, the first 6 are the weights (kernel) and biases of the Dense layers, and the remaining 6 are variables created by the optimizer. The root 'mlp_model' is a name I gave in the code, but 'dense_1/kernel', 'dense_1/bias', ... were named automatically by "tf.contrib.keras". I wanted to choose the names of these Tensor variables myself and searched the documentation, but user naming does not currently seem to be supported. (Perhaps such usage simply goes against the concept of hiding small details as much as possible to keep the API easy to use.)
If you want to access the Tensor variable names, or set the names yourself for purposes such as weight sharing, it seems better to move away from the Keras API.
import numpy as np
import tensorflow as tf
from tensorflow.python.layers import layers
from tensorflow.examples.tutorials.mnist import input_data
from sklearn.metrics import confusion_matrix
# Create n.n. model
def nn_model(images, drop_rate, vs, reuse=False):
    with tf.variable_scope(vs, reuse=reuse):
        net = tf.layers.dense(images, 512, activation=tf.nn.relu, name='dense1')
        net = tf.layers.dropout(net, rate=drop_rate)
        net = tf.layers.dense(net, 512, activation=tf.nn.relu, name='dense2')
        net = tf.layers.dropout(net, rate=drop_rate)
        net = tf.layers.dense(net, 10, activation=None, name='dense3')

    return net
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
drop_rate = 1 - keep_prob
mlp1_pred = nn_model(x, drop_rate, 'mlp1')
mlp2_pred = nn_model(x, drop_rate, 'mlp1', reuse=True)
loss = tf.losses.softmax_cross_entropy(y_, mlp1_pred)
train_step = tf.train.AdagradOptimizer(0.05).minimize(loss)
correct_prediction = tf.equal(tf.argmax(mlp1_pred, 1), tf.argmax(y_, 1))
accuracy1 = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
The above uses "tf.layers", which is supported from TensorFlow 1.0 onward. Here you can control the variable names and variable scopes yourself, and you can freely set up shared variables with the reuse flag.
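A quick way to convince yourself that the reuse flag really shares the weights is to count the trainable variables after building both models. This check is not part of the original program, just an illustration:

```python
# after mlp1_pred and mlp2_pred have been built in the same scope 'mlp1'
train_vars = tf.trainable_variables()
print(len(train_vars))      # expect 6 (3 kernels + 3 biases), not 12
for v in train_vars:
    print(v.name)           # every name should start with 'mlp1/'
```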
(Reference: my earlier Qiita posts)
- Understanding the TensorFlow namespace and mastering shared variables
- I thought a little about the increasing API of TensorFlow
By the way, do you notice anything when you look at the "tf.layers"-based code above?
Yes, the API of "tf.layers" and the API of "tf.contrib.keras" are very similar.
- The fully connected layer is named "dense" (rather than "fully_connected" etc.).
- The Layer constructors take options such as kernel_initializer, bias_initializer, kernel_regularizer, and bias_regularizer.
- The Dropout parameter specifies the fraction of units to drop, not the fraction to keep.

There are likely to be other similarities; a small side-by-side sketch follows below.
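As a rough illustration of the similarity, here are the two styles next to each other (just a sketch; x is assumed to be a placeholder of shape [None, 784]):

```python
import tensorflow as tf
from tensorflow.contrib.keras.python import keras

x = tf.placeholder(tf.float32, [None, 784])

# tf.contrib.keras (layer objects)
h = keras.layers.Dense(units=512, activation='relu')(x)   # also takes kernel_initializer=..., kernel_regularizer=...
h = keras.layers.Dropout(rate=0.2)(h)                     # rate = fraction of units to drop

# tf.layers (layer functions)
h = tf.layers.dense(x, 512, activation=tf.nn.relu)        # also takes kernel_initializer=..., kernel_regularizer=...
h = tf.layers.dropout(h, rate=0.2)                        # rate = fraction of units to drop
```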
In my earlier post "I thought a little about the increasing API of TensorFlow", I wrote rather negatively that the "tf.layers" API seemed poorly organized, but in reality integration with the Keras API was already being considered. The keyword changes in Keras 2.0 were presumably also made with "tf.layers" in mind (I regret my thoughtless comment).
Having looked into all this, we can expect further integration of TensorFlow and Keras, especially in the next version, TensorFlow 1.2 (around May 2017?). There is not much information yet on what the API will look like, but given the relationship between the current standalone Keras 2 package and "tf.contrib.keras", I do not expect major problems with API continuity. For now, I would like to deepen my understanding of the current version while looking forward to more functionality in TensorFlow and better usability with "tf.keras".
(Addendum) Difference in variable-name control: tf.contrib.keras vs. tf.layers
When I introduced tf.contrib.keras above, I wrote that "user naming does not currently seem to be supported", but this was incorrect. You can specify the "name" option in a Layer definition. (It is not explained in the documentation, but the test code on GitHub uses "name" in places.) However, variables do not appear to be reused by specifying "reuse". See the code below.
import numpy as np
import tensorflow as tf
from tensorflow.contrib.keras.python import keras
from tensorflow.contrib.keras.python.keras import backend as K
from tensorflow.python.layers import layers
def mlp_model_keras1(input):
    # MLP network model
    with tf.variable_scope('mlp_model'):
        x = keras.layers.Dense(units=512, activation='relu', name='my_dense1')(input)
        x = keras.layers.Dense(units=512, activation='relu', name='my_dense2')(x)
        y_pred = keras.layers.Dense(units=10, activation='softmax', name='my_softmax')(x)

    return y_pred

def mlp_model_keras2(input):
    # MLP network model
    with tf.variable_scope('mlp_model', reuse=True):
        x = keras.layers.Dense(units=512, activation='relu', name='my_dense1')(input)
        x = keras.layers.Dense(units=512, activation='relu', name='my_dense2')(x)
        y_pred = keras.layers.Dense(units=10, activation='softmax', name='my_softmax')(x)

    return y_pred

# Create the model
def mlp_model_by_layers1(input):
    with tf.variable_scope('mlp_by_tflayers'):
        net = tf.layers.dense(input, 512, activation=tf.nn.relu, name='his_dense1')
        net = tf.layers.dense(net, 512, activation=tf.nn.relu, name='his_dense2')
        net = tf.layers.dense(net, 10, activation=None, name='his_dense3')

    return net

def mlp_model_by_layers2(input):
    with tf.variable_scope('mlp_by_tflayers', reuse=True):
        net = tf.layers.dense(input, 512, activation=tf.nn.relu, name='his_dense1')
        net = tf.layers.dense(net, 512, activation=tf.nn.relu, name='his_dense2')
        net = tf.layers.dense(net, 10, activation=None, name='his_dense3')

    return net
if __name__ == '__main__':
    fake_data = np.ones([10, 784], dtype=np.float32) * 0.5

    x = tf.placeholder(tf.float32, [None, 784])

    # define TF graph
    y_pred1 = mlp_model_keras1(x)
    y_pred2 = mlp_model_keras2(x)
    y_pred3 = mlp_model_by_layers1(x)
    y_pred4 = mlp_model_by_layers2(x)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        vars = tf.global_variables()
        print('variables:')
        for v in vars:
            print(v)
'''
variables:
# Variables defined in the first function "mlp_model_keras1"
<tf.Variable 'mlp_model/my_dense1/kernel:0' shape=(784, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/my_dense1/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/my_dense2/kernel:0' shape=(512, 512) dtype=float32_ref>
<tf.Variable 'mlp_model/my_dense2/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model/my_softmax/kernel:0' shape=(512, 10) dtype=float32_ref>
<tf.Variable 'mlp_model/my_softmax/bias:0' shape=(10,) dtype=float32_ref>

# Variables defined in the second function "mlp_model_keras2"
<tf.Variable 'mlp_model_1/my_dense1/kernel:0' shape=(784, 512) dtype=float32_ref>
<tf.Variable 'mlp_model_1/my_dense1/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model_1/my_dense2/kernel:0' shape=(512, 512) dtype=float32_ref>
<tf.Variable 'mlp_model_1/my_dense2/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_model_1/my_softmax/kernel:0' shape=(512, 10) dtype=float32_ref>
<tf.Variable 'mlp_model_1/my_softmax/bias:0' shape=(10,) dtype=float32_ref>

# Variables defined in the third function "mlp_model_by_layers1"
<tf.Variable 'mlp_by_tflayers/his_dense1/kernel:0' shape=(784, 512) dtype=float32_ref>
<tf.Variable 'mlp_by_tflayers/his_dense1/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_by_tflayers/his_dense2/kernel:0' shape=(512, 512) dtype=float32_ref>
<tf.Variable 'mlp_by_tflayers/his_dense2/bias:0' shape=(512,) dtype=float32_ref>
<tf.Variable 'mlp_by_tflayers/his_dense3/kernel:0' shape=(512, 10) dtype=float32_ref>
<tf.Variable 'mlp_by_tflayers/his_dense3/bias:0' shape=(10,) dtype=float32_ref>

# No variables from the fourth function "mlp_model_by_layers2" appear (no new variables were created).
'''
Of the four models, the first two use tf.contrib.keras. When "name" is specified in "keras.layers.Dense", it is reflected in the variable names. In the second model I used the same variable scope (and the same layer names) and passed reuse=True to tf.variable_scope, but this was ignored: the scope name was automatically changed from 'mlp_model' to 'mlp_model_1' and a new set of variables was created under it.
On the other hand, doing the same with the third and fourth tf.layers-based models does not cause the scope name to change, and the result is as shown in the comment block in the code above: variables are allocated only by the third function, and no new variables are allocated by the fourth. I have not verified the contents in detail, but presumably the existing variables are reused, i.e. shared variables have been set up as expected.
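One simple way to confirm the sharing (not included in the run above, so treat it as a hypothetical check) would be to feed the unused fake_data to both tf.layers models; since these models contain no Dropout, shared weights should give identical outputs:

```python
# hypothetical check, to be appended inside the tf.Session() block above
out3, out4 = sess.run([y_pred3, y_pred4], feed_dict={x: fake_data})
print(np.allclose(out3, out4))   # True if the 'mlp_by_tflayers' variables are really shared
```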
I doubt anyone will use the tf.contrib.keras library and tf.layers together, but since their behavior differs, there is a real **"danger of mixing"** them. (The behavior observed above may also change in future versions.)
(References)
- Exception ignored in BaseSession.__del__ (TensorFlow issue #3388): https://github.com/tensorflow/tensorflow/issues/3388
- Keras has been updated to 2.0 - Qiita: http://qiita.com/cvusk/items/aa6270301ff2d14fb989
- Miscellaneous commentary on TensorFlow's High Level API - Qiita: http://qiita.com/rindai87/items/72651c702e9265595047
- Understanding the TensorFlow namespace and mastering shared variables - Qiita: http://qiita.com/TomokIshii/items/ffe999b3e1a506c396c8
- I thought a little about the increasing API of TensorFlow - Qiita: http://qiita.com/TomokIshii/items/554b0bd5f328b1bd5210