I tried using the Google Cloud Vision API

What is Google Cloud Vision?

Google Cloud Vision is an image analysis service provided by Google. In this article, I use it to read the text in an image (OCR).

Register with Google Cloud Platform

Register for Google Cloud Platform from the link below. A credit card is required for registration. https://console.cloud.google.com/getting-started?hl=ja

Enable Cloud Vision API

Enter "Cloud Vision API" in the search box at the top of the console.

On the page that opens, enable the Cloud Vision API. That completes this step.

Create a service account

What is a service account? The official documentation (https://cloud.google.com/iam/docs/service-accounts?hl=ja) describes it as follows:

A service account is a special account used by an application or virtual machine (VM) instance, not a user. The application uses the service account to make authorized API calls.

Now let's create a service account. In the console menu, open "IAM & Admin" and click "Service Accounts".

On the page that opens, click "Create Service Account".

Enter a suitable service account name and click "Create".

Click "Continue".

Click "Done".

Next, create a private key for authentication. After the steps above, the list of service accounts should be displayed. Open the "Actions" menu for the new account and choose to create a key.

In the dialog that appears, select JSON and create the key. Save the downloaded key file in any folder. Its path goes into the GOOGLE_APPLICATION_CREDENTIALS environment variable, as described later.

Install the gem

Add the following line to your Gemfile and run bundle install.

gem 'google-cloud-vision'

Set the path to the private key in an environment variable, and the setup is complete.

export GOOGLE_APPLICATION_CREDENTIALS="/hoge/fuga.json"
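A wrong path here usually only shows up as an authentication error at the first API call, so a quick local check can save some time. The following is a minimal sketch using only the Ruby standard library. The helper name credential_problems is hypothetical (it is not part of the gem), and the check only confirms that the file looks like a service account key. It does not confirm that Google accepts the key.

```ruby
require "json"

# Hypothetical helper (not part of google-cloud-vision): returns a list of
# problems found with the GOOGLE_APPLICATION_CREDENTIALS setup.
# An empty list means the key file looks like a service account key.
def credential_problems(env = ENV)
  path = env["GOOGLE_APPLICATION_CREDENTIALS"]
  return ["GOOGLE_APPLICATION_CREDENTIALS is not set"] if path.nil? || path.empty?
  return ["no file at #{path}"] unless File.file?(path)

  begin
    key = JSON.parse(File.read(path))
  rescue JSON::ParserError
    return ["#{path} is not valid JSON"]
  end
  return ["#{path} does not contain a JSON object"] unless key.is_a?(Hash)

  problems = []
  problems << "key type is not service_account" unless key["type"] == "service_account"
  %w[client_email private_key].each do |field|
    problems << "missing field: #{field}" if key[field].to_s.empty?
  end
  problems
end

problems = credential_problems
if problems.empty?
  puts "Credentials file looks usable"
else
  problems.each { |problem| warn problem }
end
```

Running this in the same shell where you ran export shows whether the variable is actually visible to Ruby.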

Implementation

Following the official documentation, the implementation is shown below.

Official documentation: https://cloud.google.com/vision/docs/libraries?hl=ja#client-libraries-usage-ruby

The biggest difference from the official example is that this code uses text_detection instead of label_detection. label_detection detects what is in an image. For example, for a photo taken inside a station, it detects things like ticket gates, railroad tracks, and ticket vending machines. This time I used text_detection because I wanted to detect the characters in the image.
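As an aside, for comparison with the text_detection code that follows, here is a rough sketch of how label results could be summarized. Each label annotation carries a description and a confidence score. The helper name label_summaries and the 0.5 threshold are my own assumptions for illustration, and the API call itself is left commented out because it needs the gem, credentials, and an image of your own.

```ruby
# Hypothetical helper: turns one response into "label (score)" strings,
# keeping only labels whose confidence score reaches min_score.
# The 0.5 default is an arbitrary choice for this sketch.
def label_summaries(res, min_score: 0.5)
  res.label_annotations
     .select { |label| label.score >= min_score }
     .map { |label| format("%s (%.2f)", label.description, label.score) }
end

# Usage with the real client (requires the gem and credentials):
#
#   require "google/cloud/vision"
#   image_annotator = Google::Cloud::Vision.image_annotator
#   response = image_annotator.label_detection image: "path/to/your_image.jpg"
#   response.responses.each { |res| puts label_summaries(res) }
```

The only change on the calling side is the method name. What differs is which field of the response you read: label_annotations here, text_annotations for text.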

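require "google/cloud/vision"

# Image is assumed to be the app's own model with an OCR attribute
@image = Image.new

image_annotator = Google::Cloud::Vision.image_annotator

# Specify the image to read ("~" is not expanded automatically, so expand it here)
file_name = File.expand_path("~/hoge.jpg")

# Analyze the image with Cloud Vision, passing file_name as an argument
response = image_annotator.text_detection image: file_name

response.responses.each do |res|
  # The first annotation holds the full detected text; skip images with no text
  annotation = res.text_annotations.first
  @image.OCR = annotation.description if annotation
end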
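The code above reads only the first text annotation. For text detection, that first entry holds the whole detected text, and the entries after it hold the individual words. The sketch below separates the two and returns nil or an empty list when the image contains no text. The helper names detected_text and detected_words are hypothetical, introduced only for this example.

```ruby
# Hypothetical helpers for one entry of response.responses.

# Full detected text, or nil when no text was found in the image.
def detected_text(res)
  res.text_annotations.first&.description
end

# Individual words: every annotation after the first one.
def detected_words(res)
  res.text_annotations.drop(1).map(&:description)
end

# Usage with the response from text_detection:
#
#   response.responses.each do |res|
#     puts detected_text(res) || "(no text found)"
#     p detected_words(res)
#   end
```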

Summary

Besides the text detection shown here, the Google Cloud Vision API can also analyze PDF files and detect faces in images. You can analyze up to 1,000 images a month for free, so try out the different features and incorporate them into your own app!