[PYTHON] Creating a voice transcription web application

conversion

I wanted to transcribe audio recorded on an iPhone, so I converted the m4a file to a wav file. First, I installed pydub:

!pip install pydub

When I tried to run pydub after installing it, I got the following error:

 [Errno 2] No such file or directory: 'ffprobe': 'ffprobe'

pydub relies on ffprobe, which is distributed with FFmpeg, so I downloaded ffmpeg.exe from the Download FFmpeg site. It came as a compressed file called ffmpeg-97026-gea46b45e9c.7z.

Reference: Problems with AudioSegment.from_mp3

I also moved the file to a directory on my PATH. I checked my PATH with either of the following commands:

$ printenv

or

$ echo $PATH
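
The same check can be done from Python. The following is a minimal sketch using only the standard library; the helper name `find_missing_tools` is my own and not part of pydub or ffmpeg. It reports which of the required executables cannot be found on the current PATH.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are both available.")
```

If ffprobe shows up as missing here, pydub will most likely fail with the same `[Errno 2]` error as above.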

Once ffmpeg is available, I don't think pydub is needed. Instead, install ffmpeg-python:

pip install ffmpeg-python

With it, I was able to convert the file as follows.

import ffmpeg
stream = ffmpeg.input("sample.m4a")
stream = ffmpeg.output(stream, 'output.wav')
ffmpeg.run(stream)
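
As an alternative to ffmpeg-python, the ffmpeg command can also be called directly. The sketch below is not what I used above; it builds the command line with the standard `subprocess` module and additionally fixes the output to mono at 16000 Hz. Those two values are my assumption of a reasonable input for speech recognition, and the function names are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000, channels=1):
    """Build an ffmpeg command line that converts src to dst."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file
        "-ac", str(channels),     # number of audio channels
        "-ar", str(sample_rate),  # sample rate in Hz
        dst,
    ]


def convert(src, dst, **kwargs):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst, **kwargs), check=True)
```

Calling `convert("sample.m4a", "output.wav")` would then write the wav file, provided ffmpeg is on the PATH.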

web app

server.py


from flask import Flask, render_template, request, make_response, jsonify
import os
import ffmpeg
import wave

app = Flask(__name__)

UPLOAD_DIR = './uploads'
ALLOWED_EXTENSIONS = {'m4a', 'mp3', 'wav'}
app.config['UPLOAD_FOLDER'] = UPLOAD_DIR


@app.route('/')
def hello():
    return render_template('index.html')


def allwed_file(filename):
    # Check that the name contains a '.' and that the extension is allowed.
    # Returns True if OK, False if not.
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS



def transcribe_file(speech_file, num):
    """Transcribe the given audio file.

    speech_file is the path to the audio file and num is its sample rate in Hz.
    """
    # Written against google-cloud-speech 1.x, which provides enums and types.
    from google.cloud import speech
    from google.cloud.speech import enums
    from google.cloud.speech import types
    client = speech.SpeechClient()

    with open(speech_file, 'rb') as audio_file:
        content = audio_file.read()

    # Choose the encoding from the file extension.
    extension = os.path.splitext(speech_file)[1].lower()
    if extension == '.flac':
        encode = speech.enums.RecognitionConfig.AudioEncoding.FLAC
    elif extension == '.wav':
        encode = speech.enums.RecognitionConfig.AudioEncoding.LINEAR16
    elif extension == '.ogg':
        encode = speech.enums.RecognitionConfig.AudioEncoding.OGG_OPUS
    elif extension == '.amr':
        encode = speech.enums.RecognitionConfig.AudioEncoding.AMR
    elif extension == '.awb':
        encode = speech.enums.RecognitionConfig.AudioEncoding.AMR_WB
    else:
        # Fall back to 16-bit linear PCM.
        encode = speech.enums.RecognitionConfig.AudioEncoding.LINEAR16

    audio = types.RecognitionAudio(content=content)
    config = types.RecognitionConfig(
        encoding=encode,
        sample_rate_hertz=num,
        language_code='ja-JP')

    response = client.recognize(config, audio)

    # Keep the most likely alternative of each result.
    result_list = []
    for result in response.results:
        result_list.append(result.alternatives[0].transcript)

    return result_list
    
@app.route('/result', methods=['POST'])
def uploads_file():
    # The route only accepts POST, so no method check is needed.

    # What to do if the file does not exist
    if 'file' not in request.files:
        return make_response(jsonify({'result': 'uploadFile is required.'}), 400)

    # Data retrieval
    file = request.files['file']

    # Processing when there is no file name
    if file.filename == '':
        return make_response(jsonify({'result': 'filename must not be empty.'}), 400)

    # File check
    if not allwed_file(file.filename):
        return make_response(jsonify({'result': 'file type is not allowed.'}), 400)

    # Save file
    filename = file.filename
    upload_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
    file.save(upload_path)

    # Convert the uploaded file to wav
    stream = ffmpeg.input(upload_path)
    stream = ffmpeg.output(stream, 'output1.wav')
    ffmpeg.run(stream, overwrite_output=True)

    try:
        # Read the sample rate from the wav header
        with wave.open('output1.wav', 'rb') as wfile:
            frame_rate = wfile.getframerate()
        print(frame_rate)
        result_list = transcribe_file('output1.wav', frame_rate)
    finally:
        # Delete the temporary wav file even if transcription fails
        os.remove('output1.wav')

    return render_template('result.html', result_list=result_list)



if __name__ == "__main__":
    app.run(debug=True)
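
server.py reads only the sample rate from the wav header, but the same `wave` module can also report the channel count and duration. The sketch below is an assumption-laden addition rather than part of the app: the helper names are hypothetical, and the 60-second limit reflects my understanding that synchronous recognition accepts roughly one minute of audio, so check the current Speech-to-Text quotas before relying on it.

```python
import wave

MAX_SYNC_SECONDS = 60  # assumed limit for one synchronous request


def inspect_wav(path):
    """Return (sample_rate, channels, duration_seconds) read from a WAV header."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


def fits_sync_request(path, max_seconds=MAX_SYNC_SECONDS):
    """True if the file is mono and short enough for one synchronous request."""
    _, channels, duration = inspect_wav(path)
    return channels == 1 and duration <= max_seconds


def write_silence(path, seconds, sample_rate=16000, channels=1):
    """Write a silent 16-bit PCM WAV file, useful only as test input."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * channels * int(sample_rate * seconds))
```

A check like `fits_sync_request('output1.wav')` could run before `transcribe_file` so that long or stereo recordings are rejected with a clear message instead of an API error.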

index.html


<!DOCTYPE html>
<html lang="ja">
  <head>
    
  </head>
  <body>
  <div class="index">
    <form method="post" action="/result" enctype="multipart/form-data" class="index">
      <div>Audio file upload</div>
      <label>
      
      <div class="inputindex">Select files</div>
      <input type="file" name="file" size="30" class="index">
      </label>
     <div>
      <button type="submit" formmethod="POST" class="index">Send</button>
      </div>
    </form>
    </div>
 </body>
</html>

result.html


<!DOCTYPE html>
<html lang="ja">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width,initial-scale=1.0,minimum-scale=1.0">
    <link rel="stylesheet" href="{{url_for('static', filename='index.css')}}">
    
  </head>
  <body>
  
  <ul>
  {% for result in result_list %}
    <li>{{result}}</li>
  {% endfor %}
  </ul>
</body>
</html>
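
Before starting the server, set the environment variable GOOGLE_APPLICATION_CREDENTIALS, replacing [PATH] with the path of the JSON key file used for authentication:

export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"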
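
A small check like the following can catch a missing or mistyped credentials path before the first request reaches the Speech API. It is only a sketch: the function name `credentials_problem` is hypothetical, and it verifies only that the variable is set and points to an existing file, not that the key itself is valid.

```python
import os


def credentials_problem(env=os.environ):
    """Return a message describing what is wrong, or None if the setup looks fine."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set."
    if not os.path.isfile(path):
        return "Credentials file not found: " + path
    return None


if __name__ == "__main__":
    problem = credentials_problem()
    print(problem or "Credentials file found.")
```

The `env` parameter exists so the function can be exercised with a plain dict instead of the real environment.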

Reference site

Cloud Speech-to-Text

Let's make an original voice assistant (3) -- Cloud Speech API

Active engineers explain how to recognize speech with Python [for beginners]

This article is recommended: [Investigate various Google Cloud Speech-to-Text APIs](https://tech-blog.optim.co.jp/entry/2020/02/21/163000#%E5%AE%9F%E9%9A%9B%E3%81%AB%E8%A9%A6%E3%81%97%E3%81%A6%E3%81%BF%E3%82%8B)

[Python] I compared using two types of libraries of Google Cloud Speech-to-Text API

Sites with audio suitable for testing
