[PYTHON] Streaming speech recognition with Google Cloud Speech API

Try streaming speech recognition from microphone input with Google Cloud Speech API.

Previously I tried recognizing recorded files with the REST version of the API, so this time I will try streaming recognition with the gRPC version.

Procedure

Follow the procedure in the README of the official Google sample.

This time I will try streaming recognition with transcribe_streaming.py.

The procedure is the same as for the REST version up to obtaining the Service Account JSON file.

  1. Sign up for Google Cloud Platform.
  2. Create a project in the Developer Console, enable the Speech API, and get the Service Account JSON file for authentication.
  3. Set the path of the downloaded JSON file in the environment variable GOOGLE_APPLICATION_CREDENTIALS.
  4. Run the sample script.
     - Enable PortAudio.
     - Install the required pip modules (virtualenv recommended).
     - Set transcribe_streaming.py to recognize Japanese.
       - Change the language_code of recognition_config from en-US to ja-JP.
       - Adjust the sampling rate etc. to suit your environment. The device-related settings are in record_audio, which uses pyaudio.
     - Run the sample with $ python transcribe_streaming.py and speak into the microphone.
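
Before running the sample, it can help to confirm that step 3 took effect. The following is a minimal sketch using only the standard library; check_credentials is a hypothetical helper of mine and is not part of the Google sample. It assumes the key file is an ordinary service account JSON with the usual "type" and "client_email" fields.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the service account email if the credentials file looks usable."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"credentials file not found: {path}")
    with open(path, encoding="utf-8") as f:
        info = json.load(f)
    # Assumption: service account key files carry type == "service_account".
    if info.get("type") != "service_account":
        raise RuntimeError("not a service account key file")
    return info.get("client_email")
```

Calling check_credentials() with no arguments reads the real environment; passing a dict instead makes it easy to try out without touching your shell settings.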

Once started, recognition continues for as long as service.StreamingRecognize keeps returning values to listen_print_loop. (It ends with a timeout once DEADLINE_SECS seconds have elapsed.)

This sample stops processing when an utterance contains the word exit or quit (see the latter half of listen_print_loop). If you change these words, for example to words meaning "stop" or "end", you can do the same in Japanese.
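
As a rough illustration of swapping the stop words, here is a minimal sketch of a stop-word check. It is not the sample's actual code: make_stop_matcher is a hypothetical helper, and the Japanese words 終了 and 停止 are only my example choices. One point to watch is that a \b word boundary does not occur between adjacent Japanese characters, so the sketch makes the boundary check optional.

```python
import re


def make_stop_matcher(words, use_word_boundaries=True):
    """Build a function that reports whether a transcript contains a stop word."""
    alternatives = "|".join(re.escape(w) for w in words)
    if use_word_boundaries:
        # Suits space-separated languages such as English.
        pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    else:
        # Japanese has no spaces between words, so match anywhere in the text.
        pattern = re.compile(alternatives, re.IGNORECASE)

    def should_stop(transcript):
        return pattern.search(transcript) is not None

    return should_stop


english_stop = make_stop_matcher(["exit", "quit"])
# Hypothetical Japanese stop words, chosen only for illustration.
japanese_stop = make_stop_matcher(["終了", "停止"], use_word_boundaries=False)
```

With word boundaries enabled, "quite good" does not trigger a stop, while "please quit now" does.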

Recognition behavior

  - Until there is silence for a certain period of time, speech is recognized as one continuous utterance even if it contains short pauses.
  - Once an utterance is recognized, is_final = True and confidence are returned together with the resulting text.
  - If you specify interim_results = True in streaming_config, you can also get recognition results while the utterance is still in progress.

Recognition in the middle of an utterance seems to be done at the word level, and it is so fast that it is hard to believe it goes over the network. However, interim results may be wrong, so unless you are in a hurry it is better to wait until the utterance has finished.
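
The advice to wait for the final result can be sketched as a small filter. This is only an illustration under stated assumptions: FakeResult is a flattened stand-in that I made up, not the real API response type, and final_transcripts is a hypothetical helper. It shows the idea of ignoring interim results and keeping only those marked is_final.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class FakeResult:
    """Stand-in for one streaming result; not the real API type."""
    transcript: str
    is_final: bool = False
    confidence: float = 0.0


def final_transcripts(results: Iterable[FakeResult]) -> Iterator[Tuple[str, float]]:
    """Yield (transcript, confidence) only for results marked final."""
    for result in results:
        # Interim results are skipped because they may still change.
        if result.is_final:
            yield result.transcript, result.confidence


stream = [
    FakeResult("hello"),
    FakeResult("hello whirl"),
    FakeResult("hello world", is_final=True, confidence=0.93),
]
print(list(final_transcripts(stream)))  # [('hello world', 0.93)]
```

If responsiveness matters more than accuracy, the same loop could display interim transcripts as they arrive and overwrite them when the final one comes in.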

See the gRPC API Manual (https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#google.cloud.speech.v1beta1.Speech.StreamingRecognize) for other options.

The GitHub code is updated quite often, so it is worth checking it frequently.

Bug

I tried it on both Mac and Linux, with a laptop's built-in microphone and with an external USB microphone, but in every case recognition stops after about 3-10 utterances or 15-30 seconds without raising any error. Investigation required.

Impressions

Since it is v1beta1, it seems to be still in the testing stage. It also seems difficult to use correctly unless you are used to gRPC (and to handling it from Python).
