[PYTHON] Explain the code of Tensorflow_in_ROS

Preface

Result

Before getting into the detailed code description, here is what you can do with it.

![Screenshot from 2016-11-23 15_44_07.png](https://qiita-image-store.s3.amazonaws.com/0/134368/b426cc2e-e963-0897-f2d0-f83a5ec7a3d0.png)

A handwritten 9 is fed in from the camera, shown on the right. On the left is the estimation result from the trained CNN: it sees the 9 and correctly returns 9. This article explains the code that lets a CNN like this run on ROS. For details on how to run it and how to prepare the environment, please refer to the previous Qiita article.

Outline of processing

To review the outline of the process:

  1. Build the CNN and load the trained parameter file
  2. Subscribe to image messages from the camera node
  3. Binarize the image to black and white and shrink it to 28 x 28 so that it fits the MNIST CNN
  4. Feed the image to the CNN and estimate the digit
  5. Publish the result

That is the overall flow.

Whole code

The whole code looks like this:

tensorflow_in_ros_mnist.py


import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import Int16
from cv_bridge import CvBridge
import cv2
import numpy as np
import tensorflow as tf


def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], 
                      padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')


def makeCNN(x,keep_prob):
    # --- define CNN model
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)

    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
 
    return y_conv



class RosTensorFlow():
    def __init__(self):
        self._cv_bridge = CvBridge()
        self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1)
        self._pub = rospy.Publisher('result', Int16, queue_size=1)

        self.x = tf.placeholder(tf.float32, [None,28,28,1], name="x")
        self.keep_prob = tf.placeholder("float")
        self.y_conv = makeCNN(self.x,self.keep_prob)

        self._saver = tf.train.Saver()
        self._session = tf.InteractiveSession()
        
        init_op = tf.initialize_all_variables()
        self._session.run(init_op)

        self._saver.restore(self._session, "model.ckpt")


    def callback(self, image_msg):
        cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
        cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
        ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV)
        cv_image_28 = cv2.resize(cv_image_binary,(28,28))
        np_image = np.reshape(cv_image_28,(1,28,28,1))
        predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0})
        answer = int(np.argmax(predict_num,1)[0])
        rospy.loginfo('%d' % answer)
        self._pub.publish(answer)

    def main(self):
        rospy.spin()

if __name__ == '__main__':
    rospy.init_node('rostensorflow')
    tensor = RosTensorFlow()
    tensor.main()

Code commentary

**import part**

tensorflow_in_ros_mnist.py


import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import Int16
from cv_bridge import CvBridge
import cv2
import numpy as np
import tensorflow as tf

I wanted to write the ROS node in Python, so I import rospy. I also import Image for receiving images, Int16 for publishing the result, cv_bridge for converting ROS image messages into OpenCV images, OpenCV (cv2), NumPy, and TensorFlow for machine learning.

**CNN definition part**

tensorflow_in_ros_mnist.py


def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], 
                      padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')


def makeCNN(x,keep_prob):
    # --- define CNN model
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)

    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
 
    return y_conv

I wrote a function that defines the CNN, following the configuration in the TensorFlow tutorial (Deep MNIST for Experts). The model applies convolution and pooling twice, then a fully connected layer, and finally outputs the probability of each digit through a softmax.
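The 7 * 7 * 64 that appears in W_fc1 and in the reshape follows from the layer shapes. As a rough sketch (the helper names same_padding_out and trace_shapes are made up for this illustration and are not part of the node), the shape can be traced by hand, assuming the rule that padding='SAME' gives an output size of ceil(input / stride):

```python
import math

def same_padding_out(size, stride):
    """Output length along one spatial axis for padding='SAME': ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))

def trace_shapes(height=28, width=28):
    """Follow the (height, width, channels) shape through the layers of makeCNN."""
    shapes = [("input", (height, width, 1))]
    # (layer name, stride, output channels), taken from the code above
    for name, stride, channels in [("conv1", 1, 32), ("pool1", 2, 32),
                                   ("conv2", 1, 64), ("pool2", 2, 64)]:
        height = same_padding_out(height, stride)
        width = same_padding_out(width, stride)
        shapes.append((name, (height, width, channels)))
    return shapes

shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)

h, w, c = shapes[-1][1]
flat_size = h * w * c  # length of each row of h_pool2_flat
print("flattened size:", flat_size)
```

Each 2x2 pooling with stride 2 halves the image (28 to 14 to 7), while the stride-1 convolutions keep the size, so the input to the fully connected layer has 7 * 7 * 64 = 3136 values per image.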

**First half of the init part of the class**

tensorflow_in_ros_mnist.py


class RosTensorFlow():
    def __init__(self):
        self._cv_bridge = CvBridge()
        self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1)
        self._pub = rospy.Publisher('result', Int16, queue_size=1)

The first half is the ROS setup. Here a CvBridge object is created, and the Subscriber and Publisher are defined. The Subscriber receives Image messages on the image topic, and the Publisher sends Int16 messages on the result topic, basically as in the ROS examples. Because self.callback is passed as an argument to the Subscriber, the callback function is called every time an Image message is received.

**Second half of the init part of the class**

tensorflow_in_ros_mnist.py


        self.x = tf.placeholder(tf.float32, [None,28,28,1], name="x")
        self.keep_prob = tf.placeholder("float")
        self.y_conv = makeCNN(self.x,self.keep_prob)

        self._saver = tf.train.Saver()
        self._session = tf.InteractiveSession()
        
        init_op = tf.initialize_all_variables()
        self._session.run(init_op)

        self._saver.restore(self._session, "model.ckpt")

The second half is the TensorFlow setup. First, we define x, the placeholder that holds the image, and keep_prob, the placeholder that holds the dropout keep probability (the probability that a unit is kept, not the rate at which units are dropped). A placeholder is like an entrance for data: values are fed into it at run time.
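To see why the node later feeds keep_prob as 1.0, here is a small NumPy sketch of what a dropout layer with a keep probability does. It is an illustration only, not the TensorFlow implementation: the function name dropout, the sample values, and the random seed are made up here. It assumes the behavior documented for tf.nn.dropout, where each element is kept with probability keep_prob and scaled by 1 / keep_prob, and is set to 0 otherwise.

```python
import numpy as np

def dropout(h, keep_prob, rng):
    """Keep each unit with probability keep_prob and scale survivors by 1 / keep_prob."""
    mask = rng.uniform(size=h.shape) < keep_prob  # uniform() is in [0, 1)
    return np.where(mask, h / keep_prob, 0.0)

rng = np.random.RandomState(0)           # arbitrary seed for the illustration
h = np.array([[0.5, 1.0, 1.5, 2.0]])     # made-up activations

h_infer = dropout(h, 1.0, rng)  # keep_prob = 1.0: every unit is kept unchanged
h_train = dropout(h, 0.5, rng)  # keep_prob = 0.5: each unit is either 0 or doubled

print(h_infer)
print(h_train)
```

With keep_prob set to 1.0 the layer passes its input through unchanged, which is what you want when only estimating with an already trained network.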

Next, the CNN used this time is defined as y_conv. Conceptually, this defines the path along which data flows from the entrances x and keep_prob, through the CNN, to the exit named y_conv.

After defining the path, create a session so that data can actually flow through it. The CNN weights W and biases b are initialized once with tf.initialize_all_variables (renamed tf.global_variables_initializer in later TensorFlow releases).

Then load the trained parameters. To do so, create a tf.train.Saver and call its restore method with the session and the checkpoint file (model.ckpt here).

**callback part**

tensorflow_in_ros_mnist.py


    def callback(self, image_msg):
        cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
        cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
        ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV)
        cv_image_28 = cv2.resize(cv_image_binary,(28,28))
        np_image = np.reshape(cv_image_28,(1,28,28,1))
        predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0})
        answer = int(np.argmax(predict_num,1)[0])
        rospy.loginfo('%d' % answer)
        self._pub.publish(answer)

This is called every time an image message arrives. After cv_bridge converts the message to an OpenCV image, the image is converted to grayscale, binarized with black and white inverted, and resized to 28 x 28 before being fed to the CNN. The digit with the highest probability in the returned estimation result predict_num is stored in answer and published.
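The array handling in the callback can be tried without ROS, OpenCV, or TensorFlow. The following is a minimal NumPy sketch under several assumptions: binarize_inv is a hypothetical stand-in that mimics cv2.threshold with THRESH_BINARY_INV, the input frame is synthetic and already 28 x 28 (so the resize step is skipped), and predict_num is a made-up probability vector rather than real CNN output.

```python
import numpy as np

def binarize_inv(gray, thresh=128, maxval=255):
    """Mimic cv2.THRESH_BINARY_INV: pixels above thresh become 0, the rest maxval."""
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)

# Synthetic 28x28 grayscale frame: white paper (255) with a dark vertical stroke (20).
gray = np.full((28, 28), 255, dtype=np.uint8)
gray[4:24, 13:15] = 20

binary = binarize_inv(gray)              # stroke -> 255, paper -> 0, like MNIST digits
np_image = binary.reshape(1, 28, 28, 1)  # batch of 1, height, width, 1 channel

# Made-up softmax output standing in for the result of session.run: shape (1, 10).
predict_num = np.array([[0.01, 0.02, 0.01, 0.01, 0.02,
                         0.01, 0.01, 0.05, 0.06, 0.80]])
answer = int(np.argmax(predict_num, 1)[0])  # index of the largest probability

print(np_image.shape, answer)
```

The inversion matters because MNIST digits are bright strokes on a dark background, while handwriting in front of a camera is usually dark ink on light paper.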

The explanation of the main function is omitted; it only calls rospy.spin().

Afterword

With this, a trained TensorFlow model can be used from ROS. I think object recognition and face recognition should be possible with Faster R-CNN, so I'd like to try that next.
