[PYTHON] Transcription of images with GCP's Vision API

The 'type' field accepts two values for text detection: "TEXT_DETECTION" and "DOCUMENT_TEXT_DETECTION". This example uses the latter.
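
To make the role of the 'type' value concrete, here is a minimal sketch of a request body with a selectable feature type. The helper name build_request_body, the VALID_TYPES tuple, and the dummy bytes are assumptions made for illustration and are not part of the original code; the body layout mirrors the one used in the script below.

```python
import base64
import json

# Hypothetical helper (not from the original post): builds the JSON body
# for an images:annotate request with a selectable text-detection type.
VALID_TYPES = ("TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION")


def build_request_body(image_bytes, feature_type="DOCUMENT_TEXT_DETECTION"):
    if feature_type not in VALID_TYPES:
        raise ValueError("unsupported feature type: " + feature_type)
    # The API expects the image content as a base64-encoded string
    content = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [{"type": feature_type}],
            }
        ]
    }


# Dummy bytes stand in for a real image file here
body = build_request_body(b"dummy image bytes", "TEXT_DETECTION")
print(json.dumps(body, indent=2))
```

Switching between the two modes then only means changing the second argument, so it is easy to try both on the same image and compare the results.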

# coding: utf-8
import base64
import json
from requests import Request, Session
from io import BytesIO
from PIL import Image

# Send the image at img_path to the Cloud Vision API and return the detected text
def recognize_image(img_path):
        # Open the image with PIL, re-encode it as PNG, and return it as a base64 string
        def pil_image_to_base64(img_path):
            pil_image = Image.open(img_path)
            buffered = BytesIO()
            pil_image.save(buffered, format="PNG")
            str_encode_file = base64.b64encode(buffered.getvalue()).decode("utf-8")
            return str_encode_file

        # Extract the full transcription from the JSON response body
        def get_fullTextAnnotation(json_data):
            text_dict = json.loads(json_data)
            try:
                text = text_dict["responses"][0]["fullTextAnnotation"]["text"]
                return text
            except (KeyError, IndexError):
                # The response contains no fullTextAnnotation (for example, no text was found)
                print(None)
                return None

        str_encode_file = pil_image_to_base64(img_path)
        str_url = "https://vision.googleapis.com/v1/images:annotate?key="
        str_api_key = "API key"  # Replace with your own Cloud Vision API key
        str_headers = {'Content-Type': 'application/json'}
        str_json_data = {
            'requests': [
                {
                    'image': {
                        'content': str_encode_file
                    },
                    'features': [
                        {
                            'type': "DOCUMENT_TEXT_DETECTION",
                            'maxResults': 10
                        }
                    ]
                }
            ]
        }

        obj_session = Session()
        obj_request = Request("POST",
                              str_url + str_api_key,
                              data=json.dumps(str_json_data),
                              headers=str_headers
                              )
        obj_prepped = obj_session.prepare_request(obj_request)
        obj_response = obj_session.send(obj_prepped,
                                        verify=True,
                                        timeout=60
                                        )

        if obj_response.status_code == 200:
            text = get_fullTextAnnotation(obj_response.text)
            return text

        # Any non-200 status is treated as a failure
        return None

recognize_image("image path")  # Replace "image path" with the path to an image file
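
The response handling above can be tried without calling the API. The following is a minimal sketch, assuming the response layout that the script reads ("responses" → first element → "fullTextAnnotation" → "text") and an optional per-image "error" object with a "message" field. The function name extract_text and both sample payloads are made up for illustration; they are not real API output.

```python
import json


# Hypothetical helper (not from the original post): pulls the transcription
# out of a response body, raising if the per-image result carries an error.
def extract_text(response_body):
    data = json.loads(response_body)
    responses = data.get("responses", [])
    if not responses:
        return None
    first = responses[0]
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "unknown error"))
    annotation = first.get("fullTextAnnotation")
    return annotation["text"] if annotation else None


# Made-up sample payloads, for illustration only
sample_ok = json.dumps(
    {"responses": [{"fullTextAnnotation": {"text": "Hello\nWorld\n"}}]}
)
sample_empty = json.dumps({"responses": [{}]})

print(repr(extract_text(sample_ok)))
print(extract_text(sample_empty))
```

Raising on the "error" object instead of returning None makes it possible to tell a failed request apart from an image that simply contains no text.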
