[PYTHON] RNN AutoEncoder

Please point out any mistakes.

What is AutoEncoder in the first place?

A type of unsupervised learning in which a neural network is trained so that its output reproduces the input data.

(Figure: autoencoder model, from Wikipedia)

As the figure above shows, the network narrows toward the middle, and only that middle (bottleneck) layer is actually used. Roughly speaking, the purpose of this model is to extract features from the input.
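
To make the idea concrete, here is a minimal sketch of a plain (fully connected) autoencoder in TensorFlow. The input dimension of 784 and the hidden size of 32 are illustrative assumptions, not values from the original model.


# Minimal fully connected autoencoder (sketch).
# input_dim=784 and hidden_dim=32 are illustrative assumptions.
input_dim, hidden_dim = 784, 32

x = tf.placeholder(tf.float32, [None, input_dim])
# Encoder: compress the input into the bottleneck ("middle layer").
hidden = tf.layers.dense(x, hidden_dim, activation=tf.nn.relu, name='encoder')
# Decoder: try to reconstruct the original input from the bottleneck.
reconstruction = tf.layers.dense(hidden, input_dim, name='decoder')
# Train so that the output matches the input.
loss = tf.reduce_mean(tf.square(reconstruction - x))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)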

RNN AutoEncoder

(Figure: RNN AutoEncoder, Screenshot from 2017-07-16 23-16-44.png)

The RNN AutoEncoder is modeled as above and uses the last state of the encoder's RNN cell as the feature. Unlike the basic AutoEncoder, the RNN AutoEncoder aims to extract features from sequential data, so it can extract features from things like videos and text.

Implementation in Tensorflow

model.py


import tensorflow as tf
import tensorflow.contrib.seq2seq as seq2seq

This time, the decoder part uses the seq2seq module.
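
The later snippets refer to `encoder_inputs`, `decoder_input`, `self.decoder_labels`, and an `args` object (hyperparameters) defined elsewhere in model.py. As a rough sketch of what they might look like (the exact shapes are assumptions inferred from how the tensors are used below):


# Hyperparameters come from args: args.batch_size, args.max_time_step,
# args.vocab_size, args.rnn_size (the values are whatever you configure).

# Input sequences fed to the encoder, assumed here to be vocab_size-dimensional vectors per step.
encoder_inputs = tf.placeholder(tf.float32, [args.batch_size, args.max_time_step, args.vocab_size])
# Inputs fed to the decoder at each step during training.
decoder_input = tf.placeholder(tf.float32, [args.batch_size, args.max_time_step, args.vocab_size])
# Reconstruction targets for the decoder.
self.decoder_labels = tf.placeholder(tf.float32, [args.batch_size, args.max_time_step, args.vocab_size])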

model.py


with tf.variable_scope('LSTM_ENCODER'):
    cell_ = tf.contrib.rnn.BasicLSTMCell(args.rnn_size, reuse=tf.get_variable_scope().reuse)
    # Run the encoder over the input sequence; the final state is the feature we want.
    encoder_outputs, encoder_final_state = tf.nn.dynamic_rnn(
        cell_, encoder_inputs,
        initial_state=cell_.zero_state(batch_size=args.batch_size, dtype=tf.float32),
        dtype=tf.float32)

The encoder part looks like this. `encoder_final_state` is the feature representation we are after.
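
Since `cell_` is an LSTM cell, `encoder_final_state` is an `LSTMStateTuple` of (c, h). Once the model is trained, the features can be pulled out with an ordinary `Session.run`; a rough sketch (the input batch below is a hypothetical placeholder for your own data):


# encoder_final_state.h has shape [batch_size, rnn_size] and is a common
# choice for the sequence feature.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    features = sess.run(encoder_final_state.h,
                        feed_dict={encoder_inputs: some_input_batch})  # some_input_batch is hypothetical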

model.py


with tf.variable_scope('LSTM_DECODER'):
    cell_ = tf.contrib.rnn.BasicLSTMCell(args.rnn_size, reuse=tf.get_variable_scope().reuse)
    # During training, TrainingHelper feeds the ground-truth decoder inputs at every step.
    helper = seq2seq.TrainingHelper(decoder_input, tf.cast([args.max_time_step] * args.batch_size, tf.int32))
    # The decoder starts from the encoder's final state.
    decoder = seq2seq.BasicDecoder(cell_, helper=helper, initial_state=encoder_final_state)
    basic_decoder_out, _, _ = seq2seq.dynamic_decode(decoder=decoder)
    # BasicDecoderOutput is (rnn_output, sample_id); we only need the RNN outputs.
    decoder_output, _ = basic_decoder_out

The decoder part can be written like this.

model.py


with tf.variable_scope('LSTM_DECODER_LOSS'):
    # Output projection from rnn_size to vocab_size.
    out_w = tf.get_variable('decoder_out_w', shape=[args.rnn_size, args.vocab_size], dtype=tf.float32, initializer=tf.random_normal_initializer)
    out_b = tf.get_variable('decoder_out_b', shape=[args.vocab_size], dtype=tf.float32, initializer=tf.random_normal_initializer)

    outs = []
    # Transpose to time-major ([max_time_step, batch_size, rnn_size]) and project each time step.
    # unstack needs a static length; every sequence here has length max_time_step.
    time_step_major = tf.unstack(tf.transpose(decoder_output, [1, 0, 2]), num=args.max_time_step)
    for i, t_out in enumerate(time_step_major):
        outs.append(tf.nn.relu(tf.nn.xw_plus_b(t_out, out_w, out_b, 'dense')))
    # Back to batch-major: [batch_size, max_time_step, vocab_size].
    logits = tf.transpose(tf.convert_to_tensor(outs), [1, 0, 2])
    self.loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.reshape(self.decoder_labels, [-1, args.vocab_size]),
            logits=tf.reshape(logits, [-1, args.vocab_size])))

The loss can be written as above. After that, just feed each placeholder through a Session and optimize as usual.
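
For completeness, a minimal training loop might look like the following sketch. The optimizer choice, learning rate, `num_steps`, and the `next_batch()` helper are assumptions for illustration, not part of the original code.


# Illustrative training loop (sketch); next_batch() and num_steps are hypothetical.
train_op = tf.train.AdamOptimizer(1e-3).minimize(self.loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_steps):
        enc_batch, dec_batch, label_batch = next_batch(args.batch_size)
        _, loss_val = sess.run([train_op, self.loss],
                               feed_dict={encoder_inputs: enc_batch,
                                          decoder_input: dec_batch,
                                          self.decoder_labels: label_batch})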

Reference

Click here for the paper: LSTM AE paper
Learn AutoEncoder with Keras
