[PYTHON] Extract characters from images using docomo's character recognition API

Goal

I want to extract characters from an image using docomo's API.

Addendum (2016/02/16): The program has been uploaded to GitHub. → source

Preparation

--Register with docomo Developer support to obtain an API key.
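Once you have the key, it may be convenient to keep it out of the source file. The following is a minimal sketch, not part of the original program: the helper names `build_url` and `load_api_key` and the environment variable `DOCOMO_API_KEY` are assumptions made for illustration. Only the base URL and the `APIKEY` query parameter come from the program in this article.

```python
import os

# Base URL taken from the program in this article.
BASE_URL = 'https://api.apigw.smt.docomo.ne.jp/characterRecognition/v1/document'


def build_url(api_key, job_id=None):
    """Return the request URL; append the job ID when asking for a result."""
    url = BASE_URL
    if job_id is not None:
        url += '/' + job_id
    return url + '?APIKEY=' + api_key


def load_api_key(env=os.environ):
    """Read the key from the environment instead of hard-coding it."""
    # DOCOMO_API_KEY is a hypothetical variable name chosen for this sketch.
    key = env.get('DOCOMO_API_KEY')
    if not key:
        raise RuntimeError('Set the DOCOMO_API_KEY environment variable')
    return key
```

With this, the two URLs used by the program could be built as `build_url(key)` for the upload and `build_url(key, img_id)` for the result.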

Image file to recognize (test.jpg)

test.jpg

Development environment

--Python 2.7

Program

characterRecognition.py


#coding: utf-8
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2
import json
import time
import sys

#Upload the image and get the job ID from the JSON response (character recognition request)
def getImageID(fname):
    register_openers()
    url = 'https://api.apigw.smt.docomo.ne.jp/characterRecognition/v1/document?APIKEY=(Please enter the API key)'
    
    #Open in binary mode so the image bytes are not altered (matters on Windows)
    f = open(fname, 'rb')

    datagen, headers = multipart_encode({"image": f, 'lang': 'jpn'})
    request = urllib2.Request(url, datagen, headers)
    response = urllib2.urlopen(request)
    
    res_dat = response.read()
    f.close()
    return json.loads(res_dat)['job']['@id'] #Returns the job ID assigned to the image

#Extract only the recognized text of each line from the acquired JSON.
def makeWordList(result):
    
    word_list = []
    count  = int(result['lines']['@count'])

    for i in range(count):
        word = result['lines']['line'][i]['@text']
        word_list.append(word)

    return word_list

#Poll for the character recognition result
def getWordList(img_id):

    register_openers()
    url = 'https://api.apigw.smt.docomo.ne.jp/characterRecognition/v1/document/' + img_id + '?APIKEY=(Please enter the API key)'
    
    request = urllib2.Request(url)
    
    recog_result = {}
    for i in range(5):
        response = urllib2.urlopen(request)
        res_dat = response.read()
        
        recog_result = json.loads(res_dat)
        
        status = recog_result['job']['@status']
        
        if status == 'queue':
            print 'Accepting...'
        elif status == 'process':
            print 'Recognizing...'
        elif status == 'success':
            print 'Successful recognition' #, recog_result
            word_list = makeWordList(recog_result)
            return word_list
        elif status == 'failure':
            print 'Recognition failure'
            return None

        time.sleep(3) #wait a little before polling again

    #Still queued or processing after 5 tries: give up
    return None


if __name__ == '__main__':
    
    #Get image ID
    img_id = getImageID(sys.argv[1])
    
    #Get word list
    word_list = getWordList(img_id)
    
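    #Display the recognized character strings (None means failure or timeout)
    if word_list is None:
        print 'No recognition result was obtained'
    else:
        for word in word_list:
            print word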
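
The polling loop in getWordList can be separated from the HTTP call, which makes it possible to try the logic without contacting the API. The following is a minimal sketch under assumptions: `poll_job` and its parameters are hypothetical names, and `fetch_status` stands for any function that returns the parsed JSON of one status request. The status values ('queue', 'process', 'success', 'failure') are the ones handled in the program above. It is written so that it also runs on Python 3.

```python
import time


def poll_job(fetch_status, max_tries=5, interval=3.0, sleep=time.sleep):
    """Call fetch_status() until the job finishes or max_tries is reached.

    fetch_status is a hypothetical callable returning the parsed JSON dict.
    Returns the final dict on success, None on failure or timeout.
    """
    for attempt in range(max_tries):
        result = fetch_status()
        status = result['job']['@status']
        if status == 'success':
            return result
        if status == 'failure':
            return None
        # 'queue' or 'process': wait before asking again,
        # but do not sleep after the last attempt.
        if attempt < max_tries - 1:
            sleep(interval)
    return None  # still not finished after max_tries requests
```

Because both failure and timeout return None, the caller only needs a single check before iterating over the result.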

Execution result

>python characterRecognition.py test.jpg
Recognizing...
Successful recognition
Character recognition test

Supplement (example of the returned JSON)

{
  "job": {
    "@status": "success",
    "@id": "(Image ID)", #Image ID
    "@queue-time": "2016/02/13 17:03:07"
  },
  "lines": {
    "line": [
      {
        "@text": "\u6587\u5b57\u8a8d\u8b58\u306e\u30c6\u30b9\u30c8", #Recognized string
        "shape": {
          "@count": "4",
          "point": [       #Coordinates on the image of the text(Upper left, lower left, lower right, upper right)
            {
              "@x": "35",
              "@y": "33"
            },
            {
              "@x": "35",
              "@y": "67"
            },
            {
              "@x": "293",
              "@y": "67"
            },
            {
              "@x": "293",
              "@y": "33"
            }
          ]
        }
      }
    ],
    "@count": "1"
  },
  "message": null
}
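
Besides the text, each line carries the four corner points of its region. As a sketch of how this response could be used, the code below turns every line into its text plus a bounding box. `line_boxes` is a hypothetical helper, the sample is a copy of the JSON above with the `#` annotations removed (they are not valid JSON), and it assumes that `line` is always a list, as it is in this sample.

```python
import json

# Copy of the response above without the explanatory "#" annotations.
SAMPLE = r'''
{
  "job": {"@status": "success", "@id": "(Image ID)",
          "@queue-time": "2016/02/13 17:03:07"},
  "lines": {
    "line": [
      {
        "@text": "\u6587\u5b57\u8a8d\u8b58\u306e\u30c6\u30b9\u30c8",
        "shape": {
          "@count": "4",
          "point": [
            {"@x": "35", "@y": "33"},
            {"@x": "35", "@y": "67"},
            {"@x": "293", "@y": "67"},
            {"@x": "293", "@y": "33"}
          ]
        }
      }
    ],
    "@count": "1"
  },
  "message": null
}
'''


def line_boxes(result):
    """Return (text, (left, top, right, bottom)) for every recognized line."""
    boxes = []
    for line in result['lines']['line']:
        # Coordinates arrive as strings, so convert them to integers.
        xs = [int(p['@x']) for p in line['shape']['point']]
        ys = [int(p['@y']) for p in line['shape']['point']]
        boxes.append((line['@text'], (min(xs), min(ys), max(xs), max(ys))))
    return boxes


boxes = line_boxes(json.loads(SAMPLE))
```

For the sample, the single entry in `boxes` holds the decoded string and the rectangle (35, 33, 293, 67), which could be used, for example, to crop or highlight the recognized text in the image.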

Afterword

I wrote this up mainly as a note to myself. If you would like to know more, please leave a comment.
