Easily turn Python that uses libraries into an AWS Lambda function

The problem

Python libraries make it convenient to analyze and organize data. I often use NumPy, Matplotlib, pandas, and Seaborn.

After experimenting with various analyses in Jupyter Notebook, once you settle on the processing you want, you will want to turn it into a script and run it automatically every day.

With AWS Lambda you can easily run a script on a regular schedule. There is no server to maintain, and it is cheap.

However, to use libraries with Python on AWS Lambda, you have to create a deployment package. Moreover, libraries that require compilation, such as NumPy, have to be built in an Amazon Linux environment, which is a hassle.

Easily launch a development environment on Cloud9

AWS Cloud9 sets up a server on EC2 and gives you a development environment that you access from the browser on your PC. Running Amazon Linux in Docker would also work, but setting it up in the cloud is easier. Moreover, the EC2 instance appears to be stopped automatically when you have not used it for a while, which is kind to your wallet.

Create a new environment from the AWS Cloud9 menu. Nothing needs special care, but at the time of writing the only platforms you can choose are Amazon Linux and Ubuntu Server 18.04 LTS. Lambda's own environment is moving to Amazon Linux 2, but that cannot be helped. This time I chose Amazon Linux.

Prepare a Lambda function with the serverless application wizard

Basically, set things up as described in the documentation. When Cloud9 launches, click the "AWS Resources" tab on the far right to display the Lambda menu. Pressing the "Create a new Lambda function" button opens a wizard called "Create serverless application". This time I named the application envtest. Unfortunately, Python 3.6 was the only Python version that could be selected as the runtime.

When the wizard finishes, you should have a folder called envtest.

Before writing the code, install the required libraries. A Python virtual environment is provided in envtest/venv/. From the Cloud9 IDE console, activate it by typing

source ./envtest/venv/bin/activate

and then install the required libraries one after another with pip. AWS Lambda limits the deployment package, including libraries, to 250 MB, so install only the minimum you need. This time I installed NumPy, Matplotlib, pandas, Seaborn, Pillow, and boto3, and it fit (just barely).
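To see how close the installed libraries come to the 250 MB limit, a rough check like the following could be run in the Cloud9 console. This is only a sketch: the helper names `dir_size_mb` and `fits_in_lambda` are made up for illustration, and the sum of file sizes is just an approximation of what Lambda counts as the unzipped package.

```python
import os

LAMBDA_UNZIPPED_LIMIT_MB = 250  # limit quoted above for the deployment package


def dir_size_mb(path):
    """Return the total size of all regular files under path, in MB (2**20 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks so nothing is counted twice
                total += os.path.getsize(full)
    return total / (1024 * 1024)


def fits_in_lambda(path, limit_mb=LAMBDA_UNZIPPED_LIMIT_MB):
    """True if the directory tree is within the size limit."""
    return dir_size_mb(path) <= limit_mb
```

For example, `dir_size_mb("envtest/venv/lib/python3.6/site-packages")` would report the size of the installed libraries, assuming the virtual environment uses that layout.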

Write a Lambda function

Write the Lambda function in envtest/envtest/lambda_function.py.

This time, to keep track of the trend in the number of people infected with coronavirus in Tokyo, the function reads the CSV of infections that Tokyo publishes every day, plots the daily values and the 7-day moving average, and uploads the result to S3. The output is a 320x240 pixel bitmap so that it can be shown on Adafruit's PyPortal (https://www.adafruit.com/product/4116).

As of 2020-09-05 it looks like this. (image: covid19_tokyo (6).png)

Click here for the latest

lambda_function.py



import base64
import io
from datetime import datetime as dt

import boto3
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
from PIL import Image

sns.set(style="whitegrid")

# CSV with the details of announced new coronavirus positive patients (Tokyo)
URL = 'https://stopcovid19.metro.tokyo.lg.jp/data/130001_tokyo_covid19_patients.csv'

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    df = pd.read_csv(URL)
    df['date'] = df['Published_date'].apply(lambda a: dt.strptime(a, '%Y-%m-%d'))
    df = pd.DataFrame(df['date'].value_counts(sort=False).sort_index())
    df['ma7'] = df.iloc[:,0].rolling(window=7).mean()
    
    #PyPortal is 320 x 240
    ppi = 227
    width = 320 / ppi
    height = 240 / ppi
    
    SMALL_SIZE = 4
    MEDIUM_SIZE = 10
    BIGGER_SIZE = 12
    
    plt.rc('font', size=SMALL_SIZE)          # controls default text sizes
    plt.rc('axes', titlesize=SMALL_SIZE)     # fontsize of the axes title
    plt.rc('axes', labelsize=MEDIUM_SIZE)    # fontsize of the x and y labels
    plt.rc('xtick', labelsize=SMALL_SIZE)    # fontsize of the tick labels
    plt.rc('ytick', labelsize=SMALL_SIZE)    # fontsize of the tick labels
    plt.rc('legend', fontsize=SMALL_SIZE)    # legend fontsize
    plt.rc('figure', titlesize=BIGGER_SIZE)  # fontsize of the figure title
    
    fig, ax = plt.subplots(figsize=(width,height), dpi=ppi)
    # Daily counts are the first column; its name depends on the pandas version
    ax.plot(df.iloc[:, 0], color='g', label="Daily")
    ax.plot(df['ma7'], 'r', label="ma7")
    #ax.text(0, 1, "212", fontsize=4)
    ax.set_title("Tokyo COVID-19")
    #fig.legend()
    fig.autofmt_xdate()
    plt.tight_layout()    

    pic_IObytes = io.BytesIO()
    plt.savefig(pic_IObytes,  format='png')
    pic_IObytes.seek(0)
    
    im = Image.open(pic_IObytes)
    pic_IObytes_bmp = io.BytesIO()
    im.save(pic_IObytes_bmp, format='bmp')
    pic_IObytes_bmp.seek(0)
    #pic_hash = base64.b64encode(pic_IObytes_bmp.read())

    s3_client.upload_fileobj(pic_IObytes_bmp, "pyportal", "covid19_tokyo.bmp", ExtraArgs={'ACL': 'public-read'})

    return ''

Notes on the code

    df = pd.read_csv(URL)

This single line downloads the CSV from the URL and turns it into a pandas DataFrame. Great.

    df['date'] = df['Published_date'].apply(lambda a: dt.strptime(a, '%Y-%m-%d'))
    df = pd.DataFrame(df['date'].value_counts(sort=False).sort_index())
    df['ma7'] = df.iloc[:,0].rolling(window=7).mean()

To make the data easier to handle later, create a column of datetime objects from the column that holds the date as a string, then rebuild the DataFrame so that it is indexed by date and holds the number of cases per day. Finally, compute the 7-day moving average and store it in a new column.
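As a small illustration of these three lines, here is the same kind of aggregation on made-up data. The rows in `SAMPLE_CSV` are invented and the helper name `daily_counts` is hypothetical; `pd.to_datetime` and `groupby(...).size()` are swapped in for `strptime` and `value_counts`, and should give the same per-day counts.

```python
import io

import pandas as pd

# Invented sample: one row per reported patient, as in the real CSV.
SAMPLE_CSV = """No,Published_date
1,2020-09-01
2,2020-09-01
3,2020-09-02
4,2020-09-03
5,2020-09-03
6,2020-09-03
"""


def daily_counts(csv_source, window=7):
    """Count rows per day and add a moving average over `window` rows."""
    df = pd.read_csv(csv_source)  # accepts a URL, a path, or a file-like object
    df["date"] = pd.to_datetime(df["Published_date"], format="%Y-%m-%d")
    out = df.groupby("date").size().to_frame("daily")  # index: date, column: count
    out["ma"] = out["daily"].rolling(window=window).mean()
    return out


# window=2 only because the sample is tiny; the article uses 7.
result = daily_counts(io.StringIO(SAMPLE_CSV), window=2)
print(result)
```

Note that `rolling(window=7)` counts rows rather than calendar days, so a day with no reported patients would be skipped instead of being counted as zero. With daily data that has no gaps the two are the same.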

    fig, ax = plt.subplots(figsize=(width,height), dpi=ppi)
    ax.plot(df.iloc[:, 0], color='g', label="Daily")
    ax.plot(df['ma7'], 'r', label="ma7")

Draw the graph with Matplotlib. The final image should be 320x240 pixels, so pick a dpi and set the width and height (in inches) to the pixel size divided by that dpi.
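The relation behind this is simply pixels = inches x dpi. A minimal sketch of the arithmetic, using a hypothetical helper `figsize_for_pixels` that is not part of the original function:

```python
def figsize_for_pixels(width_px, height_px, dpi):
    """Return (width, height) in inches so that inches * dpi gives the pixel size."""
    return width_px / dpi, height_px / dpi


# The values used in the Lambda function: a 320x240 image at 227 dpi.
width_in, height_in = figsize_for_pixels(320, 240, dpi=227)
print(width_in, height_in)
```

The resulting pair is what gets passed as `figsize` to `plt.subplots`, together with the same `dpi`. Because font sizes are given in points, the chosen dpi also changes how large the text appears relative to the image, which is presumably why the font sizes are tuned with `plt.rc` in the function.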

    pic_IObytes = io.BytesIO()
    plt.savefig(pic_IObytes,  format='png')
    pic_IObytes.seek(0)

Save the graph to memory in PNG format. The PyPortal can only read bitmaps, so I wanted to save it as BMP directly, but the Matplotlib backend I am using does not support saving in BMP format.

    im = Image.open(pic_IObytes)
    pic_IObytes_bmp = io.BytesIO()
    im.save(pic_IObytes_bmp, format='bmp')
    pic_IObytes_bmp.seek(0)

As a workaround, save it as PNG first, then open it with Pillow and save it again in BMP format.

    s3_client.upload_fileobj(pic_IObytes_bmp, "pyportal", "covid19_tokyo.bmp", ExtraArgs={'ACL': 'public-read'})

Upload the file to S3 and make it publicly readable.

Easy debugging of Lambda functions in Cloud9

Lambda functions are usually cumbersome to debug. For simple functions that use only the standard library you can still use the editor on the AWS Lambda management screen, but with many external libraries, as in this case, even that is not an option.

The Cloud9 IDE, however, makes debugging easy. You can even set breakpoints and use the debugger, which I am grateful for.

Deploy with the touch of a button

Debug with "Run local", and deploy once it works. Just press the "Deploy the selected Lambda function" button in the Cloud9 IDE's Lambda menu; it zips the environment and uploads it to AWS Lambda.

Because the function uploads to S3 at the end, attach the AmazonS3FullAccess policy to this function's role from the AWS Lambda management screen.
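AmazonS3FullAccess grants far more than this function uses. As a hedged alternative (not what this article does), an inline policy limited to writing objects into the one bucket would probably be enough; `s3:PutObjectAcl` is included because the upload sets the `public-read` ACL. The bucket name comes from the code above, and the helper name `put_only_policy` is hypothetical.

```python
import json


def put_only_policy(bucket):
    """Build an IAM policy document allowing uploads (with ACL) to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:PutObjectAcl"],
                "Resource": "arn:aws:s3:::{}/*".format(bucket),
            }
        ],
    }


# JSON text that could be pasted into the role as an inline policy.
policy_json = json.dumps(put_only_policy("pyportal"), indent=2)
print(policy_json)
```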

If all goes well, add EventBridge (CloudWatch Events) as a trigger with the schedule expression rate(1 day), and you are done.
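The same trigger could also be created from code instead of the console. The following is a sketch using boto3's `events` and `lambda` clients; the rule name, the statement ID, and the function ARN are placeholders, and it has not been run against the setup described here.

```python
def schedule_daily(events_client, lambda_client, function_arn,
                   rule_name="envtest-daily"):
    """Create a rate(1 day) rule and point it at the Lambda function.

    events_client and lambda_client are expected to be boto3.client('events')
    and boto3.client('lambda'); rule_name is a hypothetical name.
    """
    # 1. Create (or update) the schedule rule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )
    # 2. Allow EventBridge to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # 3. Register the function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

It would be called as `schedule_daily(boto3.client('events'), boto3.client('lambda'), function_arn)` with the ARN of the deployed function. Running it a second time with the same rule name is expected to fail at `add_permission`, because the statement ID already exists.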

Bonus: Launch Jupyter Notebook on Cloud9

The Cloud9 IDE is not bad, but there are times when I want to work things out in Jupyter Notebook. So launch Jupyter Notebook on the Cloud9 server, and you can develop in Jupyter Notebook from the browser on your PC or smartphone.

Increase the EBS capacity

Setting up an environment for experimenting with Jupyter Notebook will probably use up the EBS volume, so enlarge it following the documentation below. I set it to 20 GiB.

Resize an Amazon EBS volume used by an environment

Set up Python as you like with Miniconda

From the Cloud9 IDE console, install Miniconda and set up your preferred Python environment.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

After that, you can create an environment as you like with conda.

Launch Jupyter Notebook

jupyter notebook --ip=0.0.0.0

When launching, specify 0.0.0.0 as the IP so that the server can be reached from outside. This leaves security wide open, so at least set a password.

Clicking "Go To Instance" under the EC2 Instance item on the AWS Cloud9 management screen opens the management screen of the EC2 instance you are using. Copy the value of "Public DNS (IPv4)", append the port number ":8888", and open the result in a browser to access Jupyter Notebook, like this:

http://ec2-xx-xxx-xx-xx.ap-northeast-1.compute.amazonaws.com:8888/
