[PYTHON] [Introduction to AWS] The first Lambda is Transcribe ♪

This is a memo of my attempt to follow the reference article. The reference is neatly written and easy for anyone who already knows AWS, but it was hard going for an AWS beginner like me, so here I explain the parts that tripped me up, step by step and without skipping the messy details. What I ended up building is exactly what the reference article describes.

【reference】
・ S3 → Lambda → Transcribe → Create a transcription pipeline with S3

What I did

・ 1 Create an input bucket in S3
・ 2 Open Lambda
・ 3 Definition and meaning of a Lambda function
・ 4 How to read CloudWatch Logs
・ 5 Create an output bucket in S3
・ 6 Lambda correction
・ 7 How to change the execution role
・ 8 Edit the Lambda function
・ 9 How to check the transcription

・ 1 Create an input bucket in S3
・ 5 Create an output bucket in S3

If you click Services at the upper left of the AWS console, all services are listed and you can pick from the various menus. Select S3 under Storage to jump to the page where you can create S3 buckets, and create them there. I think it works even with all public access blocked (objects are not made public). Create one bucket for input and one for output, with bucket names of your choice.
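The same two buckets can also be created from code instead of the console. The following is only a minimal sketch using boto3 and is not taken from the reference article. The helper names `bucket_request` and `create_buckets` are made up for illustration, and the region is assumed to be ap-northeast-1 because that is the region appearing in the Transcribe call later in this memo.

```python
def bucket_request(name, region="ap-northeast-1"):
    """Build the keyword arguments for s3.create_bucket (no AWS call is made here)."""
    kwargs = {"Bucket": name}
    # us-east-1 is the default location and must not be passed as a LocationConstraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_buckets(names, region="ap-northeast-1"):
    """Create each bucket; this part needs boto3 and valid AWS credentials."""
    import boto3  # imported here so bucket_request can be tried without boto3

    s3 = boto3.client("s3", region_name=region)
    for name in names:
        s3.create_bucket(**bucket_request(name, region))
```

Calling `create_buckets(["my-transcribe-input", "my-transcribe-output"])` would create both buckets. Those two names are hypothetical; bucket names are unique across all of AWS, so you need to choose your own.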

・ 2 Open Lambda

Open the list of all services as above, and this time choose Lambda under Compute. The function creation page opens; if it does not, click Create Function. You then land on the page shown in the screenshot in reference ①. There, select [Use Blueprint]-[s3-get-object-python]-[Settings] to move to the next screen.

・ 3 Definition and meaning of Lambda function

To begin with, the basic premise: **a Lambda function is a serverless function that runs when it picks up some trigger, and it is a pay-as-you-go service that charges only for what actually runs.** So each function is defined to do essentially one job.

Now let's define the behavior of the function on the settings screen.
・ Function name: anything unique will do
・ Role name: this must be unique. The role does not disappear when you delete the function, so you have to delete it separately.
・ S3 trigger: the name of the input bucket
・ Enable trigger: check it

The following skeleton is generated, and it turns out that this function already works as is: a log entry is recorded whenever something is placed in (transferred to) the input bucket. Looking at the contents, the imports come first:

import json
import urllib.parse
import boto3

Next, it creates the S3 client object:

print('Loading function')
s3 = boto3.client('s3')

The lambda_handler function does what its # comment says: it gets the object named in event and shows its content type. bucket holds the name of the S3 bucket set as the trigger above, and key holds the file name. get_object fetches the file placed in the bucket into response, and the function returns its content type. The except Exception block at the end is the error handling.

def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))
    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
        return response['ContentType']
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
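To see what the handler actually pulls out of event, it can help to run the same lookups locally on a hand-written event. The sketch below uses a trimmed-down S3 notification with a made-up bucket name and key; real events carry many more fields. `bucket_and_key` is a hypothetical helper that mirrors the two lines at the top of lambda_handler.

```python
import urllib.parse

# Trimmed-down S3 notification; the bucket name and key are invented for illustration.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-transcribe-input"},
                "object": {"key": "voice+memo+%E3%83%86%E3%82%B9%E3%83%88.mp3"},
            }
        }
    ]
}


def bucket_and_key(event):
    """Return (bucket, key) of the first record, decoding the URL-encoded key."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Keys arrive URL-encoded: '+' stands for a space, %XX for other bytes.
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")
    return bucket, key


print(bucket_and_key(sample_event))
```

This is why the skeleton calls unquote_plus: without it, a file name containing spaces or Japanese characters would not match the real object key, and get_object would fail.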

・ 4 How to read CloudWatch Logs

Once the function is created, it has its own function page. On that page there are [Settings], [Access privileges] and [Monitoring] tabs, with the code shown below them. Select [Monitoring]. There are several graphs, and below them you can see CloudWatch Logs Insights. Now try placing (transferring) some file into the input bucket above, for example from EC2. The Lambda function runs, and CloudWatch Logs shows a record of each invocation as it happens. Errors show up there too, so at the very least you can confirm that the function is running.
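The same logs can be read from code as well as from the console. Lambda writes them to a log group named /aws/lambda/ followed by the function name. The sketch below is my own illustration rather than part of the reference article; `log_group_for` and `recent_messages` are hypothetical helper names, and the network part assumes boto3 and credentials that are allowed to read CloudWatch Logs.

```python
import time


def log_group_for(function_name):
    """Lambda writes its logs to a log group named after the function."""
    return "/aws/lambda/" + function_name


def recent_messages(function_name, minutes=10):
    """Fetch log lines from the last few minutes; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so log_group_for works without boto3

    logs = boto3.client("logs")
    # CloudWatch Logs expects timestamps as milliseconds since the epoch.
    start_ms = int((time.time() - minutes * 60) * 1000)
    response = logs.filter_log_events(
        logGroupName=log_group_for(function_name),
        startTime=start_ms,
    )
    return [event["message"] for event in response["events"]]
```

After uploading a file to the input bucket, `recent_messages("my-function")` (with your own function name in place of the made-up one) should include the CONTENT TYPE line printed by the skeleton.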

・ 6 Lambda correction

Finally, turn the Lambda function into the real thing. For this, you can simply copy and paste the code from reference ① and change the input and output bucket names. Let's look at what happens inside the try block when the trigger fires. The clever part is that TranscriptionJobName is generated from the time at which the trigger fired, which keeps it unique; as a result, the name of the generated JSON file also tells you the time. In the code below, bucket is the input bucket and key is the file name, so MediaFileUri is easy to follow. The result is written to the bucket given by OutputBucketName.

        transcribe.start_transcription_job(
            TranscriptionJobName= datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '_Transcription',
            LanguageCode='ja-JP',
            Media={
                'MediaFileUri': 'https://s3.ap-northeast-1.amazonaws.com/' + bucket + '/' + key
            },
            OutputBucketName='lamoutput'
        )
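For context, here is how that call might sit inside a complete handler. This is a sketch assembled from the skeleton and the fragment above, not the exact code of reference ①. `build_job_params` is a hypothetical helper I split out so the arguments can be inspected without calling AWS, and REGION is assumed from the URL in the fragment.

```python
import datetime
import urllib.parse

OUTPUT_BUCKET = "lamoutput"  # output bucket name used in the fragment above
REGION = "ap-northeast-1"    # region assumed from the MediaFileUri above


def build_job_params(bucket, key, now=None):
    """Assemble the arguments for start_transcription_job without calling AWS."""
    now = now or datetime.datetime.now()
    return {
        # The timestamp keeps names unique as long as uploads are at least a second apart.
        "TranscriptionJobName": now.strftime("%Y%m%d%H%M%S") + "_Transcription",
        "LanguageCode": "ja-JP",
        "Media": {
            "MediaFileUri": "https://s3." + REGION + ".amazonaws.com/" + bucket + "/" + key
        },
        "OutputBucketName": OUTPUT_BUCKET,
    }


def lambda_handler(event, context):
    import boto3  # provided by the Lambda Python runtime

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")
    try:
        transcribe = boto3.client("transcribe")
        transcribe.start_transcription_job(**build_job_params(bucket, key))
    except Exception as e:
        print(e)
        print("Error starting transcription for {} in bucket {}.".format(key, bucket))
        raise e
```

Because the job name only has one-second resolution, two files uploaded within the same second would collide; for a single test upload that is not a problem. Transcribe also accepts MediaFileUri in the form s3://bucket/key, which may be the simpler choice when keys contain spaces or non-ASCII characters.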

・ 7 How to change the execution role

Select [Access privileges]. The execution role is shown; clicking it opens the role's page with its permissions policies. Add the following two policies here:
・ AmazonS3FullAccess
・ AmazonTranscribeFullAccess
To add them, click [Attach Policy], type each name above into the search box, tick the checkbox on the left, and attach it.
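If you prefer to attach the two policies from code, something like the following should do the same thing as the console clicks. It is only a sketch: `policy_arn` and `attach_policies` are hypothetical helper names, the role name is whatever you entered when creating the function, and the caller is assumed to have IAM permissions to modify roles.

```python
MANAGED_POLICIES = ["AmazonS3FullAccess", "AmazonTranscribeFullAccess"]


def policy_arn(policy_name):
    """AWS managed policies use 'aws' in place of an account ID in the ARN."""
    return "arn:aws:iam::aws:policy/" + policy_name


def attach_policies(role_name, policy_names=MANAGED_POLICIES):
    """Attach each managed policy to the role; needs boto3 and IAM permissions."""
    import boto3  # imported lazily so policy_arn works without boto3

    iam = boto3.client("iam")
    for name in policy_names:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn(name))
```

The two FullAccess policies are broad, which is convenient for a first experiment like this one; narrower permissions would be the usual choice for anything longer-lived.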

・ 8 Edit Lambda function

This is the same change as in step 6: edit the code directly in the editor at the bottom of the function page and save it.

・ 9 How to check Transcription

In this state, if you place an mp3 file in the input bucket, the Transcribe result should appear as a JSON file in the output bucket. It takes a little while, but the output came faster than with my own program. You can follow progress in the CloudWatch Logs described above. Note that the file may not appear in the output bucket until you reload the page. If you make the JSON file public, you can download it easily. When I simply opened it, the characters were garbled, but opening it in Notepad showed the contents cleanly. Reading the downloaded JSON file with pandas was an even nicer way to check the transcribed text. I considered doing that inside the Lambda function above as well, but gave up for tonight because it looked difficult.
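As a rough illustration of checking the text, the transcript can be pulled out of the downloaded file with the standard json module (used here instead of pandas to keep the example dependency-free). The sample below is a shortened, made-up stand-in for a real output file, which also contains per-word items with timestamps; `transcript_text` and `transcript_from_file` are hypothetical helper names.

```python
import json

# Shortened example of the JSON that Transcribe writes; job name and text are invented.
sample_output = """
{
  "jobName": "20200102030405_Transcription",
  "results": {
    "transcripts": [{"transcript": "こんにちは"}],
    "items": []
  },
  "status": "COMPLETED"
}
"""


def transcript_text(raw_json):
    """Join the transcript strings found under results.transcripts."""
    data = json.loads(raw_json)
    return "".join(t["transcript"] for t in data["results"]["transcripts"])


def transcript_from_file(path):
    """Read a downloaded result file, decoding it explicitly as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return transcript_text(f.read())


print(transcript_text(sample_output))
```

Opening the file with an explicit UTF-8 encoding is probably also the cure for the garbled characters mentioned above, on the assumption that they came from the file being read with a different default encoding.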

Summary

・ Made my Lambda function debut

・ Since the conversion is fast, I would like to add Polly and apply this to a conversation app
・ I will try building the Polly part in the same way
