I saw an article on TechCrunch saying that TensorFlow had released an API that makes object detection easier, so I tried it right away. You can read the detailed explanation on GitHub.
OS: macOS Sierra 10.12.5
Python environment: anaconda3-4.2.0 (Python 3.5.2)
TensorFlow: v1.2.0 (installed in advance)
I had always used Keras models and had never used the TensorFlow models repository, so I cloned it. Following the tutorial, I created ~/tensorflow and cloned models into it. The repository contains various models, so if I have time I'd like to try the others too.
$ cd
$ mkdir tensorflow
$ cd tensorflow
$ git clone https://github.com/tensorflow/models.git
Setup: the instructions list several steps, so I worked through them one by one.
First, I installed the packages that pip was missing. lxml was not in my environment, so I added it. You can check what is already installed with the command below.
$ pip freeze
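Instead of eyeballing the `pip freeze` output, you can also check for specific packages programmatically. A minimal sketch; the package list below is my assumption of what the tutorial needs, so check the official install guide for the authoritative set:

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Assumed dependency list (PIL is the import name for pillow).
print(missing_packages(["PIL", "lxml", "matplotlib", "jupyter"]))
```

Anything printed by this sketch is a package you still need to `pip install`.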
Next, I had to install Protobuf. The tutorial's installation instructions are written for Linux, so I used Homebrew instead of apt-get.
$ brew install protobuf
Then compile the protos, set the path, and run the test. Execute the following commands in ~/tensorflow/models/research.
$ protoc object_detection/protos/*.proto --python_out=.
$ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
$ python object_detection/builders/model_builder_test.py
If you get an OK message, the installation is probably complete.
Run jupyter notebook in ~/tensorflow/models/research/object_detection and open `object_detection_tutorial.ipynb`. The test images are in ~/tensorflow/models/research/object_detection/test_images. Execute the cells from the top; when you reach the last one, detection runs on the test images. If you want to try your own image right away, either replace `image1.jpg` or `image2.jpg` in `test_images` and re-run, or rewrite the penultimate cell:
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
# TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
TEST_IMAGE_PATHS = ['File name of your favorite image']
# Size, in inches, of the output images.
IMAGE_SIZE = (12,8)
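If you have many images, you can collect every .jpg in the directory instead of listing file names by hand. A small sketch; the glob pattern and helper name are my own, not from the tutorial:

```python
import glob
import os

def list_test_images(directory):
    """Collect all .jpg files in a directory, sorted for a reproducible order."""
    return sorted(glob.glob(os.path.join(directory, '*.jpg')))

# e.g. TEST_IMAGE_PATHS = list_test_images('test_images')
```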
Then run it. To save the output image, rewrite the last cell as follows.
with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # The array-based representation of the image will be used later to
      # prepare the result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represents the confidence for the corresponding object.
      # The score is shown on the result image together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      print(image_path.split('.')[0] + '_labeled.jpg')  # For confirmation
      plt.figure(figsize=IMAGE_SIZE, dpi=300)  # Increase dpi to make the labels readable
      plt.imshow(image_np)
      plt.savefig(image_path.split('.')[0] + '_labeled.jpg')  # Added line: saves the output image
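One caveat: `image_path.split('.')[0]` breaks if the path contains a dot anywhere other than the extension (for example in a directory name). A safer way to build the output name, using only the standard library; the helper name is my own:

```python
import os

def labeled_output_path(image_path):
    """Derive the '<name>_labeled.jpg' output path from an input image path."""
    root, _ext = os.path.splitext(image_path)  # splits off only the final extension
    return root + '_labeled.jpg'

# labeled_output_path('test_images/image1.jpg') -> 'test_images/image1_labeled.jpg'
```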
A few of the detections aren't actually cows, but the results are fair.
It seems this sort of thing has already been done in Keras, so I'd like to try that next. I'd also like to try it on video; this article looks like a good reference.