[PYTHON] Transcribing images with the GCP Vision API


'type': there are two options for text, "TEXT_DETECTION" and "DOCUMENT_TEXT_DETECTION". The latter, which is tuned for dense text such as documents, is the one selected here.
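As a minimal sketch of where the feature type goes in the request body, the helper below builds the JSON payload for either type. `build_annotate_request` and `TEXT_FEATURE_TYPES` are hypothetical names that do not appear in the script, and the sketch assumes the image is already available as raw bytes.

```python
import base64
import json

# The two text feature types mentioned above.
TEXT_FEATURE_TYPES = ("TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION")


def build_annotate_request(image_bytes, feature_type="DOCUMENT_TEXT_DETECTION"):
    """Build the body of an images:annotate call (hypothetical helper)."""
    if feature_type not in TEXT_FEATURE_TYPES:
        raise ValueError("unsupported feature type: " + feature_type)
    # The API expects the image content as a base64 string.
    content = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [{"type": feature_type}],
            }
        ]
    }


# Placeholder bytes stand in for a real image file here.
body = build_annotate_request(b"fake image bytes", "TEXT_DETECTION")
print(json.dumps(body, indent=2))
```

Switching between the two modes then only means changing the second argument.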

# coding: utf-8
import base64
import json
from requests import Request, Session
from io import BytesIO
from PIL import Image

# Send an image file to the Cloud Vision API and return the detected text
def recognize_image(img_path):
        # Open the image with PIL, re-encode it as PNG and return it as a base64 string
        def pil_image_to_base64(img_path):
            pil_image = Image.open(img_path)
            buffered = BytesIO()
            pil_image.save(buffered, format="PNG")
            str_encode_file = base64.b64encode(buffered.getvalue()).decode("utf-8")
            return str_encode_file

        # Pull the full transcription out of the JSON response, or None if no text was found
        def get_fullTextAnnotation(json_data):
            text_dict = json.loads(json_data)
            try:
                text = text_dict["responses"][0]["fullTextAnnotation"]["text"]
                return text
            except (KeyError, IndexError):
                return None

        str_encode_file = pil_image_to_base64(img_path)
        str_url = "https://vision.googleapis.com/v1/images:annotate?key="
        str_api_key = "API key"  # replace with your own API key
        str_headers = {'Content-Type': 'application/json'}
        str_json_data = {
            'requests': [
                {
                    'image': {
                        'content': str_encode_file
                    },
                    'features': [
                        {
                            'type': "DOCUMENT_TEXT_DETECTION",
                            'maxResults': 10
                        }
                    ]
                }
            ]
        }

        obj_session = Session()
        obj_request = Request("POST",
                              str_url + str_api_key,
                              data=json.dumps(str_json_data),
                              headers=str_headers
                              )
        obj_prepped = obj_session.prepare_request(obj_request)
        obj_response = obj_session.send(obj_prepped,
                                        verify=True,
                                        timeout=60
                                        )

        # Only parse the body on success; any other status code yields None
        if obj_response.status_code == 200:
            text = get_fullTextAnnotation(obj_response.text)
            return text
        return None

print(recognize_image("path to the image"))
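The script returns None both when the HTTP call fails and when the response holds no text, so the caller cannot tell the two apart. The sketch below shows one way to separate them, and it is not the method the script uses. `extract_text` and `VisionError` are hypothetical names, and it assumes that a failed per-image request carries an `error` object with a `message` field inside its `responses` entry.

```python
import json


class VisionError(Exception):
    """Raised when the API reports an error for the image (hypothetical)."""


def extract_text(json_data):
    """Return the detected text, or "" when the image contains no text."""
    payload = json.loads(json_data)
    responses = payload.get("responses", [])
    if not responses:
        return ""
    first = responses[0]
    # Assumption: per-image failures are reported in an "error" object.
    if "error" in first:
        raise VisionError(first["error"].get("message", "unknown error"))
    return first.get("fullTextAnnotation", {}).get("text", "")


# Hand-written sample response, not real API output.
sample = json.dumps(
    {"responses": [{"fullTextAnnotation": {"text": "Hello\nWorld\n"}}]}
)
print(extract_text(sample))
```

With this shape, an empty string means "no text found", while an API-side failure surfaces as an exception instead of a silent None.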
