[PYTHON] Reading data with TensorFlow

I broke down TensorFlow's mechanism for reading data from files into a single sequence of steps. It should serve as a reference for how to read the CIFAR-10 binary image data, make use of queues, and feed the data to the graph as tensors in a Session.

What I found:

- As the official "Reading data" guide states, reading from a file is performed in 7 steps.
- A filename queue is placed in between so that the data can be shuffled and the processing can run in multiple threads.
- The following structure is used:



    for i in range():
        sess.run([ .. ])
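
The role of the filename queue can be sketched with the Python standard library alone. This is only an analogy, not TensorFlow's implementation: `run_readers`, `num_threads`, and `seed` are hypothetical names introduced for illustration. File names are shuffled, placed in a queue, and then consumed by several reader threads, so each file is handled exactly once regardless of which thread picks it up.

```python
import queue
import random
import threading


def run_readers(filenames, num_threads=2, seed=0):
    """Shuffle file names into a queue and let several threads drain it.

    Hypothetical helper: an analogy for a filename queue, not TensorFlow code.
    """
    names = list(filenames)
    random.Random(seed).shuffle(names)  # optional filename shuffle

    filename_queue = queue.Queue()
    for name in names:
        filename_queue.put(name)

    processed = []
    lock = threading.Lock()

    def reader():
        # Each thread keeps taking the next file name until the queue is empty.
        while True:
            try:
                name = filename_queue.get_nowait()
            except queue.Empty:
                return
            with lock:
                processed.append(name)  # stand-in for "read and decode the file"

    threads = [threading.Thread(target=reader) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


result = run_readers(['a.bin', 'b.bin', 'c.bin'], num_threads=3)
```

The order of `result` depends on the shuffle and on thread scheduling, but every file name appears once.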


Also, [Try using TensorFlow's Reader class](http://qiita.com/knok/items/2dd15189cbca5f9890c5) covers the key points of handling JPEG images, so please refer to it as well.

It is assumed that the CIFAR-10 data is saved under /tmp/cifar10_data/ .. Running the following code outputs the image data as a tensor.
This script extracts only the basic data loading and preprocessing parts from the many functions in the cifar10 tutorial. See cifar10_input.py for further processing.
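
To make the record format concrete, here is a minimal pure-Python sketch of the layout the script decodes: 1 label byte followed by 32 x 32 x 3 image bytes stored depth-major (all of channel 0, then channel 1, then channel 2). The function name `parse_record` and the synthetic record are assumptions for illustration. The script itself does the same work with `tf.slice`, `tf.reshape`, and `tf.transpose`.

```python
LABEL_BYTES = 1
HEIGHT, WIDTH, DEPTH = 32, 32, 3
IMAGE_BYTES = HEIGHT * WIDTH * DEPTH       # 3072
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES   # 3073


def parse_record(record):
    """Split one raw record into (label, image[h][w][c]). Hypothetical helper."""
    if len(record) != RECORD_BYTES:
        raise ValueError('expected %d bytes, got %d' % (RECORD_BYTES, len(record)))
    label = record[0]
    pixels = record[LABEL_BYTES:]
    # Stored depth-major: index = c * H * W + h * W + w.
    # Re-index to [height][width][depth], like tf.transpose(depth_major, [1, 2, 0]).
    image = [[[pixels[c * HEIGHT * WIDTH + h * WIDTH + w] for c in range(DEPTH)]
              for w in range(WIDTH)]
             for h in range(HEIGHT)]
    return label, image


# Synthetic record: label 7, channel planes filled with 10, 20 and 30.
plane = HEIGHT * WIDTH
record = bytes([7]) + bytes([10]) * plane + bytes([20]) * plane + bytes([30]) * plane
label, image = parse_record(record)
```

After parsing, every pixel of the synthetic image holds one value from each channel plane, which is the height-width-depth layout the rest of the pipeline expects.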

#### **`tensorflow_experiment3.py`**

# Read a CIFAR-10 image file and convert it to a tensor.
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_integer('max_steps', 1,
                            """Number of batches to run.""")
tf.app.flags.DEFINE_integer('batch_size', 128,
                            """Number of images to process in a batch.""")

with tf.Graph().as_default(): 
	# 1.List of file names
	filenames = ['/tmp/cifar10_data/cifar-10-batches-bin/data_batch_1.bin']
    # 2.No filename shuffle
    # 3.No epoch limit setting

    # 4.Creating a "filename list" queue
	filename_queue = tf.train.string_input_producer(filenames)

	# 5.Creating a reader that matches the data format
	class CIFAR10Record(object):
		pass
	result = CIFAR10Record()

	label_bytes = 1 
	result.height = 32
	result.width = 32
	result.depth = 3
	image_bytes = result.height * result.width * result.depth
	record_bytes = label_bytes + image_bytes

	reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)

	##Open the file by passing the queue to the reader
	result.key, value = reader.read(filename_queue)

	# 6.decode the data from the read result
	record_bytes = tf.decode_raw(value, tf.uint8)

	# 7.Data shaping
	# 7-1.Basic shaping
	result.label = tf.cast(tf.slice(record_bytes, [0], [label_bytes]), tf.int32)
	depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                                [result.depth, result.height, result.width])
	result.uint8image = tf.transpose(depth_major, [1, 2, 0])

	read_input = result
	reshaped_image = tf.cast(read_input.uint8image, tf.float32)
	float_image = reshaped_image

	# 7-2.Preparing to shuffle data
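	# Assumed: 50000 is the value defined in cifar10_input.py (size of the CIFAR-10 training set)
	NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000
	min_fraction_of_examples_in_queue = 0.4
	min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
	                         min_fraction_of_examples_in_queue)
	print ('Filling queue with %d CIFAR images before starting to train. '
	       'This will take a few minutes.' % min_queue_examples)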

    # 7-3.Creating a batch(With shuffle)
	batch_size = FLAGS.batch_size
	num_preprocess_threads = 16
	images, label_batch = tf.train.shuffle_batch(
		[float_image, read_input.label],
		batch_size=batch_size,
		num_threads=num_preprocess_threads,
		capacity=min_queue_examples + 3 * batch_size,
		min_after_dequeue=min_queue_examples)

	labels = tf.reshape(label_batch, [batch_size])

	# 8.Run
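	sess = tf.Session()
	# Start the queue-runner threads; without them sess.run() blocks on the empty queues
	tf.train.start_queue_runners(sess=sess)
	for step in range(FLAGS.max_steps):
		img_label = sess.run([images, labels])
		print(img_label)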
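
The batching step with shuffle can be pictured as a bounded buffer from which random elements are drawn. The following is a simplified single-threaded sketch in plain Python, not how `tf.train.shuffle_batch` is implemented. `shuffle_batches` and `seed` are hypothetical, and dropping a final partial batch is an assumption of this sketch. The capacity formula mirrors the one used in the script.

```python
import random


def shuffle_batches(examples, batch_size, min_after_dequeue, seed=0):
    """Yield shuffled batches drawn from a bounded buffer (illustrative only)."""
    rng = random.Random(seed)
    capacity = min_after_dequeue + 3 * batch_size  # same formula as the script
    buffer = []
    source = iter(examples)
    exhausted = False
    while True:
        # Refill the buffer up to capacity while the source still has data.
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if len(buffer) < batch_size:
            return  # leftovers smaller than one batch are dropped in this sketch
        batch = []
        for _ in range(batch_size):
            # Swap a random element to the end and pop it: a uniform draw from the buffer.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            batch.append(buffer.pop())
        yield batch


batches = list(shuffle_batches(range(100), batch_size=10, min_after_dequeue=20))
```

A larger `min_after_dequeue` keeps more elements in the buffer between draws, which mixes the data more thoroughly at the cost of memory and a longer initial fill.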
