[PYTHON] TensorFlow To learn from a large number of images ... (Unsolved problem) → 12/18 Solved

Introduction

Looking back over my posts so far, they are quite hard to read... I need to gradually work out a better way of writing. Hi, this is ikki. The blog post about "teaching a robot to pick up randomly piled parts from scratch with deep learning" (https://research.preferred.jp/2015/12/robot_binpick_deep_learning/), the project Preferred Networks developed jointly with FANUC and which I saw at the Robot Exhibition the other day, has been updated. Or rather, it had already been posted when I wrote my previous article... I heard the story at the workshop, and I was surprised that this robot was built in three months. President Nishikawa said, "It would have taken an ordinary person three years (laughs)," but I couldn't do it even if I took three years.

The introduction has gotten long, but this time, continuing from the previous article, the topic is the "Out of GPU Memory" problem.

Problem

Thanks to the advice from ReoAoki and MATS on my previous article, I now have a program that learns from images and recognizes them without throwing an error, but the number of images I can feed it is small and the recognition accuracy is not good enough. Should I work on the training images (the dataset) themselves? Even so, I would like to solve this within the program if possible. Eventually I want to be able to train without errors no matter how many images I feed in (within reason, of course).

Program

The current program is as follows. This time as well, I am referring to the kivantium activity diary's "Identifying the production company of the anime Yuru Yuri with TensorFlow". Thank you very much.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import cv2
import numpy as np
import tensorflow as tf
import tensorflow.python.platform

NUM_CLASSES = 2
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('save_model', 'models/model.ckpt', 'File name of model data')
flags.DEFINE_string('train', 'training/train_.txt', 'File name of train data')
flags.DEFINE_string('test', 'training/test_.txt', 'File name of test data')
flags.DEFINE_string('train_dir', '/tmp/pict_data', 'Directory to put the training data.')
flags.DEFINE_integer('max_steps', 200, 'Number of steps to run trainer.')
flags.DEFINE_integer('batch_size', 128, 'Batch size. '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')

def inference(images_placeholder, keep_prob):
    ####################################################################
    # Builds the prediction model.
    # Arguments:
    #   images_placeholder: placeholder for the input images
    #   keep_prob: placeholder for the dropout keep probability
    # Returns:
    #   y_conv: probability of each class (something like it)
    ####################################################################

    # Initialize weights from a truncated normal distribution with standard deviation 0.1
    def weight_variable(shape):
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)

    # Initialize biases with the constant 0.1
    def bias_variable(shape):
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)

    #Creating a convolution layer
    def conv2d(x, W):
      return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    #Creating a pooling layer
    def max_pool_2x2(x):
      return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')
    
    #Transform input to 28x28x3
    x_images = tf.reshape(images_placeholder, [-1, IMAGE_SIZE, IMAGE_SIZE, 3])

    #Creation of convolution layer 1
    with tf.name_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 3, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_images, W_conv1) + b_conv1)

    #Creation of pooling layer 1
    with tf.name_scope('pool1') as scope:
        h_pool1 = max_pool_2x2(h_conv1)
    
    #Creation of convolution layer 2
    with tf.name_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    #Creation of pooling layer 2
    with tf.name_scope('pool2') as scope:
        h_pool2 = max_pool_2x2(h_conv2)

    #Creation of fully connected layer 1
    with tf.name_scope('fc1') as scope:
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        #dropout settings
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    #Creation of fully connected layer 2
    with tf.name_scope('fc2') as scope:
        W_fc2 = weight_variable([1024, NUM_CLASSES])
        b_fc2 = bias_variable([NUM_CLASSES])

    #Normalization with softmax function
    with tf.name_scope('softmax') as scope:
        y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    #Returns something like the probability of each label
    return y_conv

def loss(logits, labels):
    ####################################################################
    # Computes the loss.
    # Arguments:
    #   logits: logits tensor, float - [batch_size, NUM_CLASSES]
    #   labels: labels tensor, int32 - [batch_size, NUM_CLASSES]
    # Returns:
    #   cross_entropy: cross-entropy tensor, float
    ####################################################################

    #Calculation of cross entropy
    cross_entropy = -tf.reduce_sum(labels*tf.log(logits))
    #Specify to display in TensorBoard
    tf.scalar_summary("cross_entropy", cross_entropy)
    return cross_entropy

def training(loss, learning_rate):
    ####################################################################
    # Defines the training op.
    # Arguments:
    #   loss: loss tensor, the result of loss()
    #   learning_rate: learning rate
    # Returns:
    #   train_step: training op
    ####################################################################

    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return train_step

def accuracy(logits, labels):
    ####################################################################
    # Computes the accuracy.
    # Arguments:
    #   logits: the result of inference()
    #   labels: labels tensor, int32 - [batch_size, NUM_CLASSES]
    # Returns:
    #   accuracy: accuracy (float)
    ####################################################################
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    tf.scalar_summary("accuracy", accuracy)
    return accuracy

if __name__ == '__main__':
    #Open file
    with open(FLAGS.train, 'r') as f: # train.txt
        train_image = []
        train_label = []
        for line in f:
            line = line.rstrip()
            l = line.split()
            img = cv2.imread(l[0])
            img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
            train_image.append(img.flatten().astype(np.float32)/255.0)
            tmp = np.zeros(NUM_CLASSES)
            tmp[int(l[1])] = 1
            train_label.append(tmp)
        train_image = np.asarray(train_image)
        train_label = np.asarray(train_label)
        print len(train_image)

    with open(FLAGS.test, 'r') as f: # test.txt
        test_image = []
        test_label = []
        for line in f:
            line = line.rstrip()
            l = line.split()
            img = cv2.imread(l[0])
            img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
            test_image.append(img.flatten().astype(np.float32)/255.0)
            tmp = np.zeros(NUM_CLASSES)
            tmp[int(l[1])] = 1
            test_label.append(tmp)
        test_image = np.asarray(test_image)
        test_label = np.asarray(test_label)
    
    with tf.Graph().as_default():
        images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
        labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
        keep_prob = tf.placeholder("float")

        logits = inference(images_placeholder, keep_prob)
        loss_value = loss(logits, labels_placeholder)
        train_op = training(loss_value, FLAGS.learning_rate)
        acc = accuracy(logits, labels_placeholder)

        #Ready to save
        saver = tf.train.Saver()
        sess = tf.Session()
        sess.run(tf.initialize_all_variables())
        #Setting the value to be displayed on TensorBoard
        summary_op = tf.merge_all_summaries()
        summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph_def)
        
        for step in range(FLAGS.max_steps):
            for i in range(len(train_image)/FLAGS.batch_size):
                batch = FLAGS.batch_size*i
                sess.run(train_op, feed_dict={
                  images_placeholder: train_image[batch:batch+FLAGS.batch_size],
                  labels_placeholder: train_label[batch:batch+FLAGS.batch_size],
                  keep_prob: 0.5})

            train_accuracy = sess.run(acc, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            print "step %d, training accuracy %g"%(step, train_accuracy)

            summary_str = sess.run(summary_op, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            summary_writer.add_summary(summary_str, step)

    print "test accuracy %g"%sess.run(acc, feed_dict={
        images_placeholder: test_image,
        labels_placeholder: test_label,
        keep_prob: 1.0})

    #Save the final model
    save_path = saver.save(sess, FLAGS.save_model)

As for the contents of the program:
・ It reads two classes of images from train.txt and test.txt and learns from them.
・ The image sizes vary, so each image is resized to 28x28 with cv2.resize.
・ The learned result is written out to a file called model.ckpt.
The GPU is a "GeForce GTX TITAN X 12GB".
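The contents of train_.txt and test_.txt are not shown above, but from the way each line is parsed (l[0] is an image path, l[1] is a class index), a file along the following lines should work; the paths here are only hypothetical examples:

training/data/class0_001.jpg 0
training/data/class0_002.jpg 0
training/data/class1_001.jpg 1
training/data/class1_002.jpg 1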

I also saw a thread saying that this is a memory allocator problem and can be fixed by updating TensorFlow to the latest version, but in the end that did not solve it either...
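For reference, here is a rough estimate of my own (an assumption, not something taken from that thread). Because the accuracy and summary steps feed the whole training set in one sess.run call, the activations alone for tens of thousands of 28x28x3 images add up to several gigabytes, even on a 12 GB card:

# Rough back-of-the-envelope estimate (my assumption): activation memory for
# one forward pass when N images are fed to the network above in a single run.
N = 30000                          # number of images fed at once
BYTES = 4                          # float32
conv1 = N * 28 * 28 * 32 * BYTES   # h_conv1
pool1 = N * 14 * 14 * 32 * BYTES   # h_pool1
conv2 = N * 14 * 14 * 64 * BYTES   # h_conv2
pool2 = N * 7 * 7 * 64 * BYTES     # h_pool2
fc1   = N * 1024 * BYTES           # h_fc1
total = conv1 + pool1 + conv2 + pool2 + fc1
print("about %.1f GB of activations alone" % (total / 1e9))  # roughly 5.8 GB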

How can I feed it 20,000 or 30,000 images? I will keep investigating...
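One direction I am thinking about (a sketch of my own, not necessarily the fix described in the article linked below): the accuracy and summary evaluations above run on the whole training set in a single sess.run call, so computing them in mini-batches would keep each run's memory bounded. Reusing the names from the program above:

# A minimal sketch (assumption, not necessarily the final fix): compute the
# accuracy over the data in mini-batches instead of one huge sess.run call.
def batched_accuracy(sess, acc_op, images, labels, batch_size=128):
    total = 0.0
    for i in range(0, len(images), batch_size):
        feed = {images_placeholder: images[i:i+batch_size],
                labels_placeholder: labels[i:i+batch_size],
                keep_prob: 1.0}
        # acc_op is a per-batch mean, so weight it by the batch length
        total += sess.run(acc_op, feed_dict=feed) * len(images[i:i+batch_size])
    return total / len(images)

# e.g. print "training accuracy %g" % batched_accuracy(sess, acc, train_image, train_label)

The training loop itself already feeds batches of FLAGS.batch_size, so it is mainly these full-dataset evaluation steps that grow with the number of images.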

**Probably solved!!** → TensorFlow To learn from a large number of images ... ~ (almost) solution ~
