I tried to break down TensorFlow's mechanism for reading data from files into a series of steps. I hope it serves as a reference for how to take in the binary image data of CIFAR-10, pass it through a Queue, and feed it to the graph as a tensor inside a Session.
What I found out:

- As stated in the official "Reading data" guide, reading from a file proceeds in seven steps.
- A FilenameQueue is inserted so that the data can be shuffled and the reading can be executed in multiple threads.
- The following structure is used:
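The role of the FilenameQueue can be sketched outside of TensorFlow with Python's standard `queue` and `threading` modules. This is an illustrative analogy, not a TensorFlow API: several reader threads pull file names from one shared queue, which is why reads can run in parallel, and shuffling the queue's contents shuffles the order in which files are consumed.

```python
import queue
import random
import threading

# Illustrative stand-in for a FilenameQueue: a shared queue of file
# names that several reader threads consume in parallel.
filenames = ['data_batch_%d.bin' % i for i in range(1, 6)]
random.seed(0)
random.shuffle(filenames)  # shuffling happens at the filename level

filename_queue = queue.Queue()
for name in filenames:
    filename_queue.put(name)

results = []
lock = threading.Lock()

def reader():
    # Each thread keeps dequeuing file names until the queue is empty,
    # mimicking a Reader that opens whatever file the queue hands it.
    while True:
        try:
            name = filename_queue.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append(name)  # a real reader would parse records here

threads = [threading.Thread(target=reader) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every file is read exactly once, no matter which thread got it.
print(sorted(results))
```

Because each file name is dequeued exactly once, adding threads changes only who reads which file, not what gets read.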
```py
tf.Graph().as_default()
sess = tf.Session()
tf.train.start_queue_runners(sess=sess)
for i in range():
    sess.run([ .. ])
```

Etc. Also, [Try using TensorFlow's Reader class](http://qiita.com/knok/items/2dd15189cbca5f9890c5) explains the most important part, how to handle jpeg images, so please refer to it as well.

It is assumed that the CIFAR-10 data is saved under /tmp/cifar10_data/. If you run the following code, the image data will be output as a tensor. This script extracts just the basic data-loading and preprocessing parts from the large number of functions in the cifar10 tutorial. See cifar10_input.py for further processing.

#### **`tensorflow_experiment3.py`**

```py
# coding:utf-8
# Reads the CIFAR-10 image files and converts them to tensors.
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_integer('max_steps', 1,
                            """Number of batches to run.""")
tf.app.flags.DEFINE_integer('batch_size', 128,
                            """Number of images to process in a batch.""")
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000

with tf.Graph().as_default():
    # 1. List of file names
    filenames = ['/tmp/cifar10_data/cifar-10-batches-bin/data_batch_1.bin',
                 '/tmp/cifar10_data/cifar-10-batches-bin/data_batch_2.bin',
                 '/tmp/cifar10_data/cifar-10-batches-bin/data_batch_3.bin',
                 '/tmp/cifar10_data/cifar-10-batches-bin/data_batch_4.bin',
                 '/tmp/cifar10_data/cifar-10-batches-bin/data_batch_5.bin']

    # 2. No filename shuffling
    # 3. No epoch limit

    # 4. Create the queue of file names
    filename_queue = tf.train.string_input_producer(filenames)

    # 5. Create a reader that matches the data format
    class CIFAR10Record(object):
        pass
    result = CIFAR10Record()
    label_bytes = 1
    result.height = 32
    result.width = 32
    result.depth = 3
    image_bytes = result.height * result.width * result.depth
    record_bytes = label_bytes + image_bytes
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    # Open the files by passing the queue to the reader
    result.key, value = reader.read(filename_queue)

    # 6. Decode the data from the read result
    record_bytes = tf.decode_raw(value, tf.uint8)

    # 7. Shape the data
    # 7-1. Basic reshaping
    result.label = tf.cast(tf.slice(record_bytes, [0], [label_bytes]), tf.int32)
    depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                             [result.depth, result.height, result.width])
    result.uint8image = tf.transpose(depth_major, [1, 2, 0])
    read_input = result
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    float_image = reshaped_image

    # 7-2. Prepare to shuffle the data
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                             min_fraction_of_examples_in_queue)
    print('Filling queue with %d CIFAR images before starting to train. '
          'This will take a few minutes.' % min_queue_examples)

    # 7-3. Create a batch (with shuffling)
    batch_size = FLAGS.batch_size
    num_preprocess_threads = 16
    images, label_batch = tf.train.shuffle_batch(
        [float_image, read_input.label],
        batch_size=batch_size,
        num_threads=num_preprocess_threads,
        capacity=min_queue_examples + 3 * batch_size,
        min_after_dequeue=min_queue_examples)
    labels = tf.reshape(label_batch, [batch_size])

    # 8. Run
    sess = tf.Session()
    tf.train.start_queue_runners(sess=sess)
    for step in range(FLAGS.max_steps):
        img_label = sess.run([images, labels])
        print(img_label)

print("FIN.")
```
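The decode-and-reshape logic of steps 6 and 7-1 can be checked on a synthetic record without TensorFlow. Each CIFAR-10 binary record is 1 label byte followed by 3072 image bytes stored depth-major (3 × 32 × 32); transposing to (32, 32, 3) yields the usual height × width × channel layout. A minimal sketch using NumPy in place of the TensorFlow ops (the record contents here are made up for illustration):

```python
import numpy as np

label_bytes, height, width, depth = 1, 32, 32, 3
image_bytes = height * width * depth          # 3072
record_bytes = label_bytes + image_bytes      # 3073

# Synthetic record: label 7, then pixel bytes 0..255 repeating.
record = bytes([7]) + bytes(i % 256 for i in range(image_bytes))
raw = np.frombuffer(record, dtype=np.uint8)   # plays the role of tf.decode_raw

label = int(raw[0])                           # tf.slice(..., [0], [label_bytes])
depth_major = raw[label_bytes:].reshape(depth, height, width)
image = depth_major.transpose(1, 2, 0)        # tf.transpose(..., [1, 2, 0])

print(label, image.shape)                     # 7 (32, 32, 3)
```

In the full script, with batch_size=128, each `sess.run([images, labels])` should accordingly yield arrays of shape (128, 32, 32, 3) and (128,).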