Please point out any mistakes.
AutoEncoder
An AutoEncoder is a form of unsupervised learning in which a neural network (NN) is trained so that its output reproduces its own input data. As shown in the figure above, the model narrows toward a bottleneck in the middle, and in practice only this middle layer's representation is used. Roughly speaking, the purpose of this model is to extract features from the input.
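To make the idea concrete, here is a minimal sketch of a basic (dense) AutoEncoder in the same TF 1.x style as the code below. The names `input_dim` and `hidden_dim` and the sizes are illustrative assumptions, not part of the original.
import tensorflow as tf

input_dim, hidden_dim = 784, 32  # illustrative sizes, e.g. flattened MNIST

inputs = tf.placeholder(tf.float32, [None, input_dim])
# Encoder: compress the input down to the narrow middle (bottleneck) layer.
encoded = tf.layers.dense(inputs, hidden_dim, activation=tf.nn.relu)
# Decoder: try to reconstruct the original input from the bottleneck.
reconstructed = tf.layers.dense(encoded, input_dim)
# Train the NN so that its output matches its own input.
loss = tf.reduce_mean(tf.square(reconstructed - inputs))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
# After training, `encoded` is the extracted feature representation.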
RNN AutoEncoder
The RNN AutoEncoder is modeled as shown above, and it uses the final state of the Encoder's RNN cell as the extracted feature. Unlike the basic AutoEncoder, the RNN AutoEncoder aims to extract features from sequential data, so it can be applied to data such as video and text.
model.py
import tensorflow as tf
import tensorflow.contrib.seq2seq as seq2seq
Here, the decoder part uses the seq2seq module.
model.py
with tf.variable_scope('LSTM_ENCODER'):
    # BasicLSTMCell for the encoder; reuse follows the enclosing variable scope.
    cell_ = tf.contrib.rnn.BasicLSTMCell(args.rnn_size, reuse=tf.get_variable_scope().reuse)
    encoder_outputs, encoder_final_state = tf.nn.dynamic_rnn(
        cell_, encoder_inputs,
        initial_state=cell_.zero_state(batch_size=args.batch_size, dtype=tf.float32),
        dtype=tf.float32)
The encoder part looks like this. `encoder_final_state` is the feature vector we are after.
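As a hedged usage sketch, the feature can be pulled out at inference time like this. The names `model`, `sess`, and `batch` are assumptions from a typical training script, not part of the original code.
# encoder_final_state is an LSTMStateTuple (c, h); the hidden state h is
# commonly used as the feature vector for the whole sequence.
feature = sess.run(model.encoder_final_state.h,
                   feed_dict={model.encoder_inputs: batch})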
model.py
with tf.variable_scope('LSTM_DECODER'):
    cell_ = tf.contrib.rnn.BasicLSTMCell(args.rnn_size, reuse=tf.get_variable_scope().reuse)
    # TrainingHelper feeds the ground-truth decoder inputs at every time step.
    helper = seq2seq.TrainingHelper(decoder_input,
                                    tf.cast([args.max_time_step] * args.batch_size, tf.int32))
    # The decoder starts from the encoder's final state, i.e. the extracted feature.
    decoder = seq2seq.BasicDecoder(cell_, helper=helper, initial_state=encoder_final_state)
    basic_decoder_out, _, _ = seq2seq.dynamic_decode(decoder=decoder)
    # BasicDecoderOutput is a (rnn_output, sample_id) pair; we only need rnn_output.
    decoder_output, _ = basic_decoder_out
The decoder part can be written like this. `decoder_output` is the `rnn_output` field of the `BasicDecoderOutput`, with shape [batch_size, max_time_step, rnn_size].
model.py
with tf.variable_scope('LSTM_DECODER_LOSS'):
    # Output projection from rnn_size to vocab_size.
    out_w = tf.get_variable('decoder_out_w', shape=[args.rnn_size, args.vocab_size],
                            dtype=tf.float32, initializer=tf.random_normal_initializer())
    out_b = tf.get_variable('decoder_out_b', shape=[args.vocab_size],
                            dtype=tf.float32, initializer=tf.random_normal_initializer())
    outs = []
    # Transpose to time-major [max_time_step, batch_size, rnn_size] and
    # apply the projection to each time step.
    time_step_major = tf.unstack(tf.transpose(decoder_output, [1, 0, 2]))
    for t_out in time_step_major:
        outs.append(tf.nn.relu(tf.nn.xw_plus_b(t_out, out_w, out_b, 'dense')))
    # Back to batch-major [batch_size, max_time_step, vocab_size].
    logits = tf.transpose(tf.convert_to_tensor(outs), [1, 0, 2])
    self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.reshape(self.decoder_labels, [-1, args.vocab_size]),
        logits=tf.reshape(logits, [-1, args.vocab_size])))
The loss can be written as above. After that, just feed each placeholder through a Session and optimize as usual.
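As a rough sketch of that last step, assuming the graph above is wrapped in a `model` object, and that `next_batch`, `args.learning_rate`, and `args.num_steps` are hypothetical helpers not in the original:
train_op = tf.train.AdamOptimizer(args.learning_rate).minimize(model.loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(args.num_steps):
        enc_in, dec_in, dec_labels = next_batch(args.batch_size)
        _, loss_val = sess.run([train_op, model.loss],
                               feed_dict={model.encoder_inputs: enc_in,
                                          model.decoder_input: dec_in,
                                          model.decoder_labels: dec_labels})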
References:
- LSTM AE paper
- Learn Autoencoder with Keras