Recently I had been tinkering with Shogi AI and neglecting this blog, but from Friday to Saturday I took part in the "Own AWS Hackathon (tentative)", managed to do what the title says, and made my AWS debut, so I am leaving a record here. All of the participants, myself included, were complete AWS beginners, but it was a very fulfilling 24 hours.
**Below is a chronological log of what I did.**
- Connect to AWS
- Check the types of AI solutions available
- Check speech2text
- Train the Shogi AI
- Check pykakasi
- Install MeCab on AWS
- Modify QA_conversation for AWS
- Record to an mp3 file with text2speech on AWS
- Batch-download the mp3 files with TeraTerm
- Publish the mp3 files to S3 (the credentials file got in the way)
- Automatically generate the S3 public audio files
**The following could not be done before time ran out.**
- Transfer and replace mp3 files as the conversation progresses
- Voice-to-text conversion with speech2text
- Conversation on a smartphone
- Completion
- Train the Shogi AI in a GPU environment on ec2
- Port the conversation app
- Play the voice converted with Polly's text2speech on the Web
The environment was mostly provided by the organizer, and the style was to write up your progress on a Wiki created on ec2. In other words, you create an instance to suit your app. For machine learning I chose the GPU environment p2.xlarge, which makes all the major machine-learning frameworks available. The setup was almost exactly as in the article below, except that the request to relax the instance limit was unnecessary because the organizer had arranged it in advance.
【Reference】If you want to run Chainer on GPU, it's easy to use Deep Learning Base AMI (Ubuntu).
The flow is as in the article above, but let me add a few notes. The key pair is important: you will need it when connecting over SSH, so store it somewhere safe. Also, when you create an instance, the following credentials are issued and can be downloaded as a file. They are essential when building an app on AWS services, so keep them in a safe place as well.
```
aws_access_key_id
aws_secret_access_key
```
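For reference, boto3 and the AWS CLI can also read these two values from the standard shared credentials file instead of hard-coding them in the script; a minimal sketch of that file (the key values here are placeholders):

```
# ~/.aws/credentials — read automatically by boto3 / the AWS CLI
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```

Note that if an old key is left in this file it will be picked up silently, which comes back to bite later in this article.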
- Install TeraTerm.
- Enter a host name such as `ec2-**-**-***-**.us-west-2.compute.amazonaws.com` in the Host field and click OK.
- When connecting you will be asked for more information: enter the user name `ubuntu`, specify the private key from the key pair, and click OK.
- Once connected, copy and paste `source activate chainer_p36` from the site below to activate the chainer_p36 environment.
【Reference】Chainer@aws
- Zip up a set of files, transfer them with TeraTerm's file-transfer function, and extract them to a suitable directory on ec2.
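As a side note, the "zip up a set of files" step can also be scripted; the following is a minimal sketch using only the Python standard library (the function and file names are hypothetical, not from the original app):

```python
import zipfile
from pathlib import Path

def zip_app_files(src_dir, archive_path):
    """Bundle every file under src_dir into a zip archive for transfer to ec2."""
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for path in Path(src_dir).rglob('*'):
            if path.is_file():
                # store paths relative to src_dir so the archive extracts cleanly
                zf.write(path, path.relative_to(src_dir))
    return archive_path
```

After transferring the archive, `unzip` on the ec2 side expands it into the chosen directory.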
This time I tried running the Shogi AI's train_policy_value_resnet.py, but it took far longer than in my local environment. I had specified the GPU, yet the CPU sat at 100%; locally I rely on cuDNN, so my regret is that I never checked whether cuDNN was actually installed on the instance.
The file used is the nano/Ubuntu version of the conversation app from the other day, transferred as described above. Above all, since this app does not use deep learning, I recreated a new t2.large instance of Ubuntu 18.04 and ran it there. Price-wise it is an order of magnitude cheaper, as shown below.
| Instance | vCPU | ECU | Memory | Storage | Price |
|---|---|---|---|---|---|
| p2.xlarge | 4 | 12 | 61 GiB | EBS only | 0.90 USD/hour |
| t2.large | 2 | variable | 8 GiB | EBS only | 0.0928 USD/hour |

【Reference】Amazon EC2 Pricing
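As a sanity check on "an order of magnitude cheaper", here is the arithmetic for a 24-hour hackathon, using the hourly rates from the table above:

```python
P2_RATE = 0.9      # USD per hour, p2.xlarge
T2_RATE = 0.0928   # USD per hour, t2.large
HOURS = 24         # length of the hackathon

p2_cost = P2_RATE * HOURS   # 21.6 USD for the whole event
t2_cost = T2_RATE * HOURS   # about 2.23 USD for the whole event
ratio = P2_RATE / T2_RATE   # roughly 9.7x cheaper

print(f'p2.xlarge: {p2_cost:.2f} USD, t2.large: {t2_cost:.2f} USD, ratio: {ratio:.1f}x')
```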
The biggest hurdle in porting was installing MeCab. It gives me trouble in every environment, and t2.large Ubuntu 18.04 on AWS was no exception. The basics are the same as for the nano, as shown below.
【Reference】Install mecab on ubuntu 18.10
However, this time again I got a "No module named 'MeCab'" error. The cause is unknown, but reinstalling as follows made it work.
```
sudo pip3 install --no-cache-dir mecab-python3
```
The dictionary also installed without any trouble by the method above. I did not install pyaudio, since I wasn't going to sort out local sound output, but I did install and verify pykakasi. With that, qa_conversation_nano.py runs once the text2speak() function call is removed.
Ultimately I want the app to hold a voice conversation, so I hoped to use both speech2text and text2speech if they existed; but of the two, AWS supported only text2speech for Japanese, so I decided to use that.
AWS has plenty of documentation, but it seems to take some getting used to before you can find the relevant part. Eventually I found the following code. With it, the text given in Text is converted to speech and the mp3 file is written to the specified directory on the ec2 server. Enter the keys generated when the instance above was created on the right-hand side of `aws_access_key_id=` and `aws_secret_access_key=`.
```python
import boto3

polly_client = boto3.Session(
    aws_access_key_id='',      # key issued when the instance was created
    aws_secret_access_key='',  # secret issued when the instance was created
    region_name='us-west-2').client('polly')

response = polly_client.synthesize_speech(VoiceId='Joanna',
                                          OutputFormat='mp3',
                                          Text='This is a sample text to be synthesized.')

file = open('speech.mp3', 'wb')
file.write(response['AudioStream'].read())
file.close()
```
For Japanese, set VoiceId='Mizuki' and put Japanese text in Text. Various adjustments seem possible, but due to time constraints I stuck with the defaults this time.
【Reference】Amazon Polly Voice
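To keep the voice choice in one place, the request parameters can be built by a small helper and unpacked into synthesize_speech(); this is just a sketch (build_speech_request is a hypothetical name, not part of the Polly API):

```python
def build_speech_request(text, voice_id='Mizuki', output_format='mp3'):
    """Build keyword arguments for polly_client.synthesize_speech().

    voice_id defaults to 'Mizuki' (Japanese); pass 'Joanna' etc. for English.
    """
    return {'VoiceId': voice_id, 'OutputFormat': output_format, 'Text': text}

params = build_speech_request('こんにちは')
# then: response = polly_client.synthesize_speech(**params)
```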
Uploading worked with the code below. Actually, there was a snag here: I got an error saying the keys did not match. This was because authentication during the transfer automatically consulted the credentials file and read an old key from it. For this occasion I simply deleted that file, and the upload then worked normally.
```python
# -*- coding: utf-8 -*-
import boto3

s3 = boto3.resource('s3')  # get the S3 resource
bucket = s3.Bucket('s3 bucket-name')
bucket.upload_file('path to the mp3 file on ec2', 'destination key in the s3 bucket')

s3_client = boto3.client('s3')
# Upload the file to S3
s3_client.upload_file('test.txt', 'bucket-name', 'test-remote.txt')
```
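Once an object is public, its URL follows a predictable pattern, so the audio src attributes can be generated rather than hand-written. A minimal sketch, assuming the regional `s3-<region>` endpoint form that appears later in this article and a bucket that already allows public reads (public_s3_url is a hypothetical helper):

```python
def public_s3_url(bucket, region, key):
    """Build the public URL of an S3 object, e.g. for an <audio> src attribute."""
    return f'https://{bucket}.s3-{region}.amazonaws.com/{key}'

url = public_s3_url('my-bucket', 'us-west-2', 'mp3/speech0.mp3')
# → 'https://my-bucket.s3-us-west-2.amazonaws.com/mp3/speech0.mp3'
```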
At this point, the audio can be played from plain HTML.
```html
<html>
<body>
<figure>
    <figcaption>Listen to the Answer:</figcaption>
    <audio
        autoplay
        controls
        src="https://******.s3-****.amazonaws.com/mp3/speech0.mp3">
        Your browser does not support the
        <code>audio</code> element.
    </audio>
</figure>
</body>
</html>
```
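The "automatic generation of the S3 public audio files" step boils down to templating the snippet above; a minimal standard-library sketch, assuming a list of public mp3 URLs (make_audio_page is a hypothetical name):

```python
def make_audio_page(urls):
    """Render a minimal HTML page with one <audio> element per mp3 URL."""
    figures = '\n'.join(
        '<figure>\n'
        '<figcaption>Listen to the Answer:</figcaption>\n'
        f'<audio autoplay controls src="{url}">\n'
        'Your browser does not support the <code>audio</code> element.\n'
        '</audio>\n'
        '</figure>'
        for url in urls)
    return f'<html>\n<body>\n{figures}\n</body>\n</html>'

html = make_audio_page(['https://example.com/mp3/speech0.mp3'])
```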
- I made my AWS debut by participating in the "Own AWS Hackathon (tentative)"
- I got the conversation app running on AWS
- Some points about text2speech remained unclear
- Since speech2text did not support Japanese on AWS, a workaround needs to be considered
- Conversation on a smartphone is still to be done