After my talk at DevFest Tokyo 2016, I received requests to see the whole sample code, so I'm publishing it here.
I wanted something that showed, side by side, how the same data and the same method play out across raw TensorFlow, TensorFlow's high-level API tf.contrib.learn, and Keras, which runs on TensorFlow as a backend and lets you describe a network in a DSL-like way. Every tutorial I looked at did something subtly different, which only added to my confusion.

Since I couldn't find a comparison done under similar conditions, I decided to record one myself so that others would be less confused. For now, everything is based on the tf.contrib.learn tutorial.
- Use the iris dataset, the flower data that often appears as an example in statistics and machine learning
-- Obtained via `tf.contrib.learn.datasets.base.load_iris()`
- Three hidden layers, with 10, 20, and 10 units respectively
- ReLU as the activation function
- Softmax in the output layer
- Splitting into training and test data is done with sklearn's `cross_validation.train_test_split()`
- Not aligned between the versions: the optimizer and the initial values of the network (see the note after this list)

There may be other details, but I think the conditions are mostly matched. There also seemed to be demand for covering Slim, but I didn't get to it due to time constraints.
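One side note of my own, since the initial values are not aligned: accuracy will fluctuate from run to run. Fixing the random seeds (a general technique, not something the original setup does) at least makes a single version reproducible:

```python
import numpy as np
import tensorflow as tf

# Stabilizes reruns of one script; it does not make the three
# frameworks initialize identically
np.random.seed(0)
tf.set_random_seed(0)
```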
Now, let's get right to the comparison.
First, raw TensorFlow.

- Since you do every step of the processing yourself, it's good for understanding what is actually going on.
- The MNIST tutorial code, by contrast, leans on too many helper functions provided by TensorFlow itself; it's hard to tell what they are doing, and when you try to write your own code, many of them turn out to be tutorial-only, which is rather sad.
- So I wrote similar helper functions myself; they appear at the top of the code below.
```python
import tensorflow as tf
import numpy as np
from sklearn import cross_validation

# Helper to turn class labels (0, 1, 2) into one-hot vectors
def one_hot_labels(labels):
    return np.array([
        np.where(labels == 0, [1], [0]),
        np.where(labels == 1, [1], [0]),
        np.where(labels == 2, [1], [0])
    ]).T

# Fetch a random batch of the specified size
def next_batch(data, label, batch_size):
    perm = np.arange(data.shape[0])
    np.random.shuffle(perm)
    return data[perm][:batch_size], label[perm][:batch_size]

# Prepare the training data
iris = tf.contrib.learn.datasets.base.load_iris()
train_x, test_x, train_y, test_y = cross_validation.train_test_split(
    iris.data, iris.target, test_size=0.2
)

# Input layer
x = tf.placeholder(tf.float32, [None, 4], name='input')

# 1st layer
W1 = tf.Variable(tf.truncated_normal([4, 10], stddev=0.5, name='weight1'))
b1 = tf.Variable(tf.constant(0.0, shape=[10], name='bias1'))
h1 = tf.nn.relu(tf.matmul(x, W1) + b1)

# 2nd layer
W2 = tf.Variable(tf.truncated_normal([10, 20], stddev=0.5, name='weight2'))
b2 = tf.Variable(tf.constant(0.0, shape=[20], name='bias2'))
h2 = tf.nn.relu(tf.matmul(h1, W2) + b2)

# 3rd layer
W3 = tf.Variable(tf.truncated_normal([20, 10], stddev=0.5, name='weight3'))
b3 = tf.Variable(tf.constant(0.0, shape=[10], name='bias3'))
h3 = tf.nn.relu(tf.matmul(h2, W3) + b3)

# Output layer
W4 = tf.Variable(tf.truncated_normal([10, 3], stddev=0.5, name='weight4'))
b4 = tf.Variable(tf.constant(0.0, shape=[3], name='bias4'))
y = tf.nn.softmax(tf.matmul(h3, W4) + b4)

# Placeholder for the target (teacher) signal
y_ = tf.placeholder(tf.float32, [None, 3], name='teacher_signal')

# Cross-entropy loss against the target
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# The training step
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for i in range(2000):
        # Training
        batch_size = 100
        batch_train_x, batch_train_y = next_batch(train_x, train_y, batch_size)
        sess.run(train_step, feed_dict={x: batch_train_x, y_: one_hot_labels(batch_train_y)})
    # Evaluate the trained model
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: test_x, y_: one_hot_labels(test_y)}))
```
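As a quick sanity check (my addition, not in the original post), here's what the `one_hot_labels` helper from the block above produces for a small label array:

```python
import numpy as np

# Assumes one_hot_labels() from the block above is in scope
labels = np.array([0, 1, 2, 1])
print(one_hot_labels(labels))
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]
#  [0 1 0]]
```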
Next, tf.contrib.learn.

- Compared to raw TensorFlow, it's so easy it's almost anticlimactic.
- Reading the API docs, it seems you can also set the `optimizer`, `dropout`, and various other things, so I suspect it can do more than I expected (see the sketch after the code below).
```python
import tensorflow as tf
from sklearn import cross_validation

# Prepare the training data
iris = tf.contrib.learn.datasets.base.load_iris()
train_x, test_x, train_y, test_y = cross_validation.train_test_split(
    iris.data, iris.target, test_size=0.2
)

# Declare that all features are real-valued
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]

# 3-layer DNN
# If nothing is specified, ReLU seems to be chosen as the activation function
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="./iris_model")

# Fit the model
classifier.fit(x=train_x,
               y=train_y,
               steps=2000,
               batch_size=50)

# Evaluate accuracy
print(classifier.evaluate(x=test_x, y=test_y)["accuracy"])
```
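To illustrate the configurability mentioned above: a sketch based on the `optimizer` and `dropout` arguments documented for `tf.contrib.learn.DNNClassifier` at the time, reusing the `feature_columns` defined above. This variant is my illustration, not part of the measured comparison:

```python
# Hypothetical, more heavily configured classifier: a custom optimizer
# and dropout, passed via DNNClassifier's documented arguments
classifier_configured = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 20, 10],
    n_classes=3,
    optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
    dropout=0.2,
    model_dir="./iris_model_configured")
```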
Finally, Keras, with TensorFlow as the backend.

- The way you write it is completely different from the other two.
- But if you're building networks experimentally, it looks far more pleasant than raw TensorFlow.
```python
import tensorflow as tf
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from sklearn import cross_validation

# Prepare the input data
iris = tf.contrib.learn.datasets.base.load_iris()
train_x, test_x, train_y, test_y = cross_validation.train_test_split(
    iris.data, iris.target, test_size=0.2
)

# Define the model
model = Sequential()

# Define the network
model.add(Dense(input_dim=4, output_dim=10))
model.add(Activation('relu'))
model.add(Dense(input_dim=10, output_dim=20))
model.add(Activation('relu'))
model.add(Dense(input_dim=20, output_dim=10))
model.add(Activation('relu'))
model.add(Dense(output_dim=3))
model.add(Activation('softmax'))

# Compile the network
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

# Training
model.fit(train_x, train_y, nb_epoch=2000, batch_size=100)

# Evaluate the result
loss, metrics = model.evaluate(test_x, test_y)
```
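The block above computes but never prints the evaluation result. A small follow-up of my own, assuming the Keras 1.x Sequential API (where `predict_classes` is available):

```python
# Print the evaluation result and spot-check a few predictions
print('loss: {}, accuracy: {}'.format(loss, metrics))
print(model.predict_classes(test_x[:5]))  # predicted class ids
print(test_y[:5])                         # ground-truth class ids
```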
At a glance, my take is as follows.

- If you're experimenting, use Keras.
- If you've settled on what to do and the provided API can realize it, use tf.learn.
- If you can't reproduce with tf.learn what you built with Keras, or if you already have a clear picture of what you want to implement, use raw TensorFlow (though debugging may be hard).
Left for the future: digging deeper into each approach, and a comparison with Slim.