[PYTHON] [Introduction to AWS] I tried porting the conversation app and playing with text2speech @ AWS ♪

Recently I have been busy playing with Shogi AI and neglected writing articles, but from Friday to Saturday I took part in the "Own AWS Hackathon (tentative)". I managed to do what the title says and make my AWS debut, so I am leaving a record here. All the participants were true beginners who leaned heavily on the organizer, but it was a very fulfilling 24 hours.

What I did in 24 hours

Below is a chronological history.
・ Connect to AWS
・ Check the types of AI solutions
・ Confirm speech2text
・ Shogi AI learning
・ Confirm pykakasi
・ Install Mecab @ AWS
・ Modify QA_conversation for AWS
・ Record to mp3 file with text2speech @ AWS
・ Batch download of mp3 files @ TeraTerm
・ Publish mp3 files to S3 (the confidential file was in the way)
・ Automatic generation of S3 public audio files

The following could not be done before time ran out.
・ Transfer and resend mp3 files as the conversation progresses
・ Voice-to-text conversion with speech2text
・ Conversation on a smartphone
・ Completion

What this article covers

・ Learn Shogi AI in the p2.xlarge GPU environment
・ Port the conversation app
・ Play the converted voice on the Web with Polly's text2speech

・ Learn Shogi AI in the p2.xlarge GPU environment

The environment was mostly provided by the organizer, and participants wrote their progress on a Wiki hosted on ec2. In other words, each participant creates an instance suited to their own app. I chose the GPU instance type p2.xlarge for machine learning, which makes all the machine learning frameworks available. The procedure was almost as in the reference below, except that the request to raise the instance limit was unnecessary because the organizer had arranged it in advance. 【reference】 If you want to run Chainer on GPU, it's easy to use Deep Learning Base AMI (Ubuntu).

Once you also create a key pair, the instance launches successfully

The flow is as in the article above, with a few supplementary notes. The key pair is important: you need it when connecting over SSH, so store it in a safe place. In addition, the following information is set when the instance is created and can be downloaded as a file. It is essential when building an app that uses AWS solutions, so it also needs to be kept in a safe place.

aws_access_key_id                     
aws_secret_access_key

Connect with Teraterm

・ Install TeraTerm.
・ Enter a host name such as [ec2-**-**-***-**.us-west-2.compute.amazonaws.com] in the Host field and click OK. You are then prompted for more details: enter the user name ubuntu, select the private key of the key pair, and connect with OK.
・ Once connected, copy [source activate chainer_p36] from the following site and paste it into the terminal to activate the chainer_p36 environment. 【reference】 Chainer@aws
・ Zip the set of files, send the archive with TeraTerm's transfer function, and expand it in a suitable directory on ec2.
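The zip-and-expand step in the last item can be sketched in Python as follows. This is only a minimal illustration using the standard library; the function names zip_directory and unzip_to are hypothetical and not part of the original app, and the transfer itself still happens through TeraTerm.

```python
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Pack every file under src_dir into zip_path, keeping relative paths."""
    src = Path(src_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and the archive itself if it lives inside src_dir
            if path.is_file() and path.resolve() != zip_path.resolve():
                zf.write(path, path.relative_to(src).as_posix())
    return zip_path


def unzip_to(zip_path, dest_dir):
    """Expand the archive into dest_dir (created if it does not exist)."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
```

On the local machine you would call zip_directory before the transfer, and on ec2 call unzip_to (or simply use the unzip command) afterwards.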

This time I ran train_policy_value_resnet.py for Shogi AI, but it took much longer than in my local environment. The GPU was being used, but CPU usage stayed at 100%. I use cuDNN locally, and I regret that I did not check whether it was installed on the instance.
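A check like the one I skipped could look roughly like this. It is a sketch that assumes the chainer_p36 environment is active; gpu_runtime_status is a hypothetical helper name, and it only reads the flags that Chainer exposes in chainer.backends.cuda.

```python
def gpu_runtime_status():
    """Report whether Chainer can see CUDA and cuDNN in this environment."""
    try:
        import chainer
    except Exception:
        # Chainer is missing or cannot be imported in this environment
        return {"chainer": False, "cuda": False, "cudnn": False}
    return {
        "chainer": True,
        "cuda": bool(chainer.backends.cuda.available),
        "cudnn": bool(chainer.backends.cuda.cudnn_enabled),
    }


if __name__ == "__main__":
    print(gpu_runtime_status())
```

If cudnn comes back False while cuda is True, that would be one candidate explanation for the slow training, although I did not confirm this on the instance.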

Conversation app port

For the files, I used the nano @ ubuntu version of the conversation app from the other day and transferred that set. Since the app does not use deep learning at all, I created a new t2.large instance running ubuntu18.04 and ran it there. As shown below, its price is an order of magnitude lower.

Instance    vCPU  ECU       Memory  Storage   Price
p2.xlarge   4     12        61 GiB  EBS only  0.9 USD/hour
t2.large    2     variable  8 GiB   EBS only  0.0928 USD/hour

【reference】 Amazon EC2 Pricing
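To make the price difference concrete, here is a small calculation using only the two hourly prices quoted above. The 24-hour duration is the length of the hackathon; actual billing may differ (for example, storage is charged separately), so treat this as a rough sketch.

```python
# Hourly on-demand prices quoted above (USD per hour)
PRICE_PER_HOUR = {"p2.xlarge": 0.9, "t2.large": 0.0928}


def cost(instance, hours):
    """Cost in USD of running one instance for the given number of hours."""
    return PRICE_PER_HOUR[instance] * hours


if __name__ == "__main__":
    hours = 24  # length of the hackathon
    for name in PRICE_PER_HOUR:
        print(f"{name}: {cost(name, hours):.2f} USD for {hours} h")
    ratio = PRICE_PER_HOUR["p2.xlarge"] / PRICE_PER_HOUR["t2.large"]
    print(f"p2.xlarge costs about {ratio:.1f}x as much as t2.large")
```

Over 24 hours this comes to about 21.60 USD for p2.xlarge against about 2.23 USD for t2.large, a ratio of roughly 9.7.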

・ Mecab installation @ AWS

The biggest challenge in porting is installing Mecab. It gives me trouble in every environment, and t2.large with ubuntu 18.04 on AWS was no exception. The basic steps are the same as for nano, as in the reference below. 【reference】 Install mecab on ubuntu 18.10 However, this time too I got a No module named 'Mecab' error. The cause is unknown, but reinstalling with the command below made it work.

sudo pip3 install --no-cache-dir mecab-python3

The dictionary also installed without any trouble using the method above. I did not install pyaudio because I had not settled on how to generate sound, but I installed pykakasi and confirmed that it works. With this, the app runs once the text2speak() function call in qa_conversation_nano.py is removed.
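A quick way to confirm that the reinstall took effect is to import MeCab and split one sentence. The sketch below assumes the mecab-python3 package from the command above; wakati is a hypothetical helper name, and it returns None instead of raising when the module or its dictionary is unusable.

```python
def wakati(text):
    """Split text into space-separated words with MeCab.

    Returns None when MeCab (or its dictionary) is not usable.
    """
    try:
        import MeCab  # provided by the mecab-python3 package
        tagger = MeCab.Tagger("-Owakati")
    except (ImportError, RuntimeError) as exc:
        print(f"MeCab is not usable: {exc}")
        return None
    return tagger.parse(text).strip()


if __name__ == "__main__":
    print(wakati("今日は良い天気です"))
```

If this prints the sentence split into words, both the Python binding and the dictionary are in place.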

・ Play the converted voice on the Web with text2speech of amazon polly

Eventually I want the app to hold a voice conversation. I wanted to use both speech2text and text2speech if they were available, but for Japanese AWS only supported text2speech, so I decided to use that.

Voice conversion with Polly's text2speech

AWS has plenty of documentation, but it takes some practice to get to the point. I found the following code. With it, the string passed as Text is converted to audio, and the mp3 file is placed in the specified directory on the ec2 server. Enter the keys generated when the instance above was created on the right-hand side of aws_access_key_id= and aws_secret_access_key=.

SynthesizeSpeech@AWS

import boto3

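polly_client = boto3.Session(
    aws_access_key_id='YOUR_ACCESS_KEY_ID',          # placeholder: your access key
    aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',  # placeholder: your secret key
    region_name='us-west-2').client('polly')

# Convert the text to speech in mp3 format
response = polly_client.synthesize_speech(VoiceId='Joanna',
                OutputFormat='mp3',
                Text='This is a sample text to be synthesized.')

# Save the returned audio stream as an mp3 file
with open('speech.mp3', 'wb') as file:
    file.write(response['AudioStream'].read())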

For Japanese, set VoiceId='Mizuki' and put Japanese text in Text. Various adjustments seem to be possible, but because of time constraints I used the defaults this time. Amazon Polly Voice
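The Japanese case can be sketched as below. This is an illustration rather than the code I ran at the hackathon: the helper names synthesize_to_mp3, save_answers and main are hypothetical, the numbered file names follow the speech0.mp3 pattern used later in this article, and the session here relies on boto3's default credential lookup instead of explicit keys.

```python
def synthesize_to_mp3(polly_client, text, path, voice_id="Mizuki"):
    """Ask Polly to speak `text` and save the mp3 bytes to `path`."""
    response = polly_client.synthesize_speech(
        VoiceId=voice_id, OutputFormat="mp3", Text=text)
    audio = response["AudioStream"].read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)


def save_answers(polly_client, answers, prefix="speech"):
    """Write speech0.mp3, speech1.mp3, ... for a list of answer texts."""
    paths = []
    for i, text in enumerate(answers):
        path = f"{prefix}{i}.mp3"
        synthesize_to_mp3(polly_client, text, path)
        paths.append(path)
    return paths


def main():
    import boto3  # imported here so the helpers can be tried without AWS
    client = boto3.Session(region_name="us-west-2").client("polly")
    print(save_answers(client, ["こんにちは", "今日は良い天気ですね"]))
```

Calling main() on the instance should leave one mp3 file per answer in the current directory.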

Transfer from ec2 to s3

I was able to do it with the following code, although I hit a problem here: an error saying that the keys did not match. Authentication takes place during this transfer, and it automatically looked up the confidential (credentials) file and read an old key from it. I deleted that file, and the upload then worked normally.

# -*- coding: utf-8 -*-
import boto3

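s3 = boto3.resource('s3')  # Get the S3 resource object

# Method 1: upload through the Bucket resource
bucket = s3.Bucket('bucket-name')  # name of the S3 bucket
# Arguments: path of the mp3 file on ec2, then the destination key in the bucket
bucket.upload_file('speech.mp3', 'mp3/speech.mp3')

# Method 2: upload through the low-level client
s3_client = boto3.client('s3')
# Arguments: local file, bucket name, destination key
s3_client.upload_file('test.txt', 'bucket-name', 'test-remote.txt')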
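One way to avoid the stale-key problem described above, and to publish the file in the same step, is sketched below. It rests on several assumptions: the keys are placeholders, upload_public_mp3 and main are hypothetical names, the bucket must allow public ACLs, and the returned address uses the virtual-hosted URL style, which may not match the URL style of your bucket.

```python
def upload_public_mp3(s3_client, local_path, bucket, key, region="us-west-2"):
    """Upload an mp3 with a public-read ACL and return its URL."""
    s3_client.upload_file(
        local_path, bucket, key,
        ExtraArgs={"ACL": "public-read", "ContentType": "audio/mpeg"})
    # Virtual-hosted-style URL; the bucket must permit public ACLs
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def main():
    import boto3
    # Passing the keys explicitly keeps boto3 from falling back to an
    # old shared credentials file such as ~/.aws/credentials.
    session = boto3.Session(
        aws_access_key_id="YOUR_ACCESS_KEY_ID",          # placeholder
        aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",  # placeholder
        region_name="us-west-2")
    url = upload_public_mp3(session.client("s3"), "speech0.mp3",
                            "bucket-name", "mp3/speech0.mp3")
    print(url)
```

Keeping the upload in a function that takes the client as an argument also makes it easy to try the logic with a stand-in object before touching a real bucket.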

Play the audio from the public S3 server

At this point, the audio can be played from plain HTML.

【reference】

<html>
<body>
<figure>
    <figcaption>Listen to the Answer:</figcaption>
    <audio
        autoplay
        controls
        src="https://******.s3-****.amazonaws.com/mp3/speech0.mp3">
            Your browser does not support the
            <code>audio</code> element.
    </audio>
</figure>
</body>
</html>
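The page above can also be generated from Python, which is one possible reading of the "automatic generation" item in the task list. This is a sketch only: audio_page and main are hypothetical names, and the URL in main is a made-up example to be replaced with the address of your own public object.

```python
from html import escape


def audio_page(mp3_url, caption="Listen to the Answer:"):
    """Return a small HTML page that autoplays the given mp3 URL."""
    return f"""<html>
<body>
<figure>
    <figcaption>{escape(caption)}</figcaption>
    <audio autoplay controls src="{escape(mp3_url, quote=True)}">
        Your browser does not support the <code>audio</code> element.
    </audio>
</figure>
</body>
</html>
"""


def main():
    # Hypothetical URL; substitute the address of your own public object
    page = audio_page(
        "https://bucket-name.s3.us-west-2.amazonaws.com/mp3/speech0.mp3")
    with open("answer.html", "w", encoding="utf-8") as f:
        f.write(page)
```

The URL and caption are escaped before being placed in the markup, so characters such as & in a query string do not break the page.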

Summary

・ I was able to make my AWS debut by participating in the "Own AWS Hackathon (tentative)"
・ I tried running the conversation app on AWS
・ I got text2speech working

Remaining tasks:
・ Speech2text did not support Japanese on AWS, so countermeasures need to be considered
・ Conversation on a smartphone
