[Python] The first step to getting rid of slow queries! I tried notifying Chatwork of RDS for MySQL slow queries using Lambda and AWS CLI v2

Introduction

Hello. As an infrastructure engineer at Mikatus Inc., I am in charge of building, operating, and maintaining our infrastructure, which is centered on AWS. This time I will write about notifying Chatwork of MySQL slow queries.

Our system runs on Amazon RDS for MySQL, and we had been using a monitoring tool to watch for slow queries. To enable continuous improvement, I decided to notify Chatwork in real time whenever one occurs. Also, since the AWS CLI v2 preview was announced the other day, I took the opportunity to create the resources with AWS CLI v2.

About this article

This article walks through configuring AWS Lambda and Amazon CloudWatch Logs using the AWS CLI v2. It does not cover the details of the Amazon RDS log export feature or the Chatwork API setup.

Actual Chatwork notification content

Here is what ultimately gets posted to Chatwork. A bot posts the content of the slow query to a group chat that the relevant members participate in. slowquery.png The content output to CloudWatch Logs is posted essentially as-is; only the Time field, which the slow query log outputs in UTC, is converted to JST for the notification.
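The UTC-to-JST conversion mentioned above can be sketched in a couple of lines (the epoch value below is a made-up sample, not taken from a real log):

```python
import datetime

# A slow query log entry carries an epoch timestamp; 1577836800 is a
# hypothetical sample corresponding to 2020-01-01 00:00:00 UTC.
timestamp = 1577836800

# Interpret the epoch as UTC, then shift +9 hours for JST
utc = datetime.datetime.utcfromtimestamp(timestamp)
jst = utc + datetime.timedelta(hours=9)
print(jst)  # 2020-01-01 09:00:00
```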

Outline of processing

post-slowquery-to-chatwork.png The output from RDS to CloudWatch Logs uses the RDS log export feature (https://docs.aws.amazon.com/ja_jp/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MySQL.html#USER_LogAccess.MySQLDB.PublishtoCloudWatchLogs). CloudWatch Logs then forwards the log events to a Lambda function via a subscription filter (https://docs.aws.amazon.com/en_jp/AmazonCloudWatch/latest/logs/SubscriptionFilters.html#LambdaFunctionExample). The function, written in Python, parses the log message and notifies Chatwork. Which slow queries reach CloudWatch Logs is controlled by long_query_time in the RDS parameter group, but I kept the RDS setting relatively loose and put the effective threshold on the Lambda side. (This time, I set RDS long_query_time to 5 seconds and the Lambda threshold to 10 seconds.)
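A subscription filter delivers log events to Lambda as a Base64-encoded, gzip-compressed JSON payload. A minimal sketch of that round trip, using a made-up payload in the shape CloudWatch Logs sends:

```python
import base64
import gzip
import json

# A payload shaped like what CloudWatch Logs delivers (contents are invented)
payload = {
    "logGroup": "/aws/rds/instance/mydb/slowquery",
    "logStream": "mydb",
    "logEvents": [{"id": "1", "timestamp": 0, "message": "# Query_time: 12.3 ..."}],
}
data = base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))

# The Lambda handler receives {"awslogs": {"data": ...}} and unpacks it like this
event = {"awslogs": {"data": data}}
decoded = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
print(decoded["logStream"])  # mydb
```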

Settings

The configuration steps are described below.

Installing AWS CLI v2

Install the AWS CLI v2 on your Mac (see the official documentation for the installation procedure). Note that it is still a preview release, so do not use it in a production environment:

AWS CLI version 2 is provided as a preview for testing and evaluation. At this time, we recommend that you do not use it in production environments.

The following version has been installed.

$ aws2 --version
aws-cli/2.0.0dev2 Python/3.7.4 Darwin/17.7.0 botocore/2.0.0dev1

My impression: CLI v1 depends on the Python version installed on the machine, so it took some effort to set up, but CLI v2 does not require a separate Python installation, which made installing it much easier.

Create an IAM role

Set values

- Role name: post-slowquery-to-chatwork-LambdaRole

Role creation

$ IAM_ROLE_POLICY_DOC="iam-role-policy.json"

#Set up Lambda as a trusted entity
$ cat <<EOF > ${IAM_ROLE_POLICY_DOC}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

$ IAM_ROLE_NAME="post-slowquery-to-chatwork-LambdaRole"
$ IAM_POLICY_ARN="arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"

#Create an IAM role
$ aws2 iam create-role \
    --role-name ${IAM_ROLE_NAME} \
    --assume-role-policy-document file://./${IAM_ROLE_POLICY_DOC}

#Get the ARN of the IAM role you created
$ IAM_ROLE_ARN=`aws2 iam get-role \
    --role-name ${IAM_ROLE_NAME} \
    --query 'Role.Arn' \
    --output text`

#Attach AWSLambdaBasicExecutionRole to your IAM role
$ aws2 iam attach-role-policy \
    --role-name ${IAM_ROLE_NAME} \
    --policy-arn ${IAM_POLICY_ARN}
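Before running create-role, the trust policy JSON can be sanity-checked locally. A small Python sketch that parses the same document and confirms Lambda is the trusted entity:

```python
import json

# The trust policy written to iam-role-policy.json above
doc = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "lambda.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
""")

# Confirm the statement allows Lambda to assume the role
stmt = doc["Statement"][0]
assert stmt["Effect"] == "Allow"
assert stmt["Principal"]["Service"] == "lambda.amazonaws.com"
print("trust policy OK")
```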

Create a Lambda function

The runtime is Python 3.7.

Code

lambda_function.py


import json
import base64
import gzip
import re
import os
import sys
import datetime
import urllib.error
import urllib.parse
import urllib.request

LONG_QUERY_TIME = int(os.environ['LONG_QUERY_TIME'])

chatwork_endpoint = "https://api.chatwork.com/v2"
chatwork_apikey = os.environ['chatwork_apikey']
chatwork_roomid = os.environ['chatwork_roomid']
path = "/rooms/{0}/messages".format(chatwork_roomid)

def lambda_handler(event, context):
    print(json.dumps(event))
    #The payload is Base64-encoded and gzip-compressed, so decode and decompress it first
    log_events = json.loads(gzip.decompress(base64.b64decode(event['awslogs']['data'])))
    db_name = log_events['logStream']
    message = log_events['logEvents'][0]['message']
    #Extract the query time from the log
    query_time = re.search(r'Query_time: ([0-9]+\.[0-9]+)', message)

    #Notify when the query time exceeds the threshold from the environment variable (10 seconds this time)
    if query_time and LONG_QUERY_TIME < float(query_time.group(1)):
        timestamp = re.search(r'timestamp=([0-9]+);', message)
        #Convert timestamp to JST time
        date = datetime.datetime.fromtimestamp(int(timestamp.group(1))) + datetime.timedelta(hours=9)
        log_message = re.sub(r'# Time:.*\n', '# Time: %s(JST)\n' % str(date), message)
        post_to_chatwork({'body': '[info][title]%s[/title]%s[/info]' % (db_name, log_message)})

def post_to_chatwork(data=None):
    if data is None:
        return
    try:
        body = urllib.parse.urlencode(data).encode('utf-8')
        headers = {"X-ChatWorkToken": chatwork_apikey}
        req = urllib.request.Request(chatwork_endpoint + path, data=body, headers=headers)
        with urllib.request.urlopen(req) as res:
            print(res.read().decode("utf-8"))
    except urllib.error.HTTPError as e:
        print('Error code: %s' % e.code)
        sys.exit('post_to_chatwork Error')
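The threshold check in the handler can be exercised locally with a fabricated log message, without touching AWS or Chatwork (the log content below is invented for illustration):

```python
import re

LONG_QUERY_TIME = 10  # seconds, matching the Lambda environment variable

# A fabricated MySQL slow query log entry for local testing
message = (
    "# Time: 2020-01-01T00:00:00.000000Z\n"
    "# User@Host: app[app] @  [10.0.0.1]\n"
    "# Query_time: 12.345678  Lock_time: 0.000123 Rows_sent: 1  Rows_examined: 500000\n"
    "SET timestamp=1577836800;\n"
    "SELECT * FROM big_table;"
)

query_time = re.search(r'Query_time: ([0-9]+\.[0-9]+)', message)
should_notify = query_time is not None and LONG_QUERY_TIME < float(query_time.group(1))
print(should_notify)  # True: 12.35s exceeds the 10s threshold
```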

Environment variables

- LONG_QUERY_TIME: 10
- chatwork_apikey: Chatwork API key
- chatwork_roomid: Chatwork room ID

Lambda function creation

$ LAMBDA_FUNCTION_NAME="post-slowquery-to-chatwork"
$ LAMBDA_RUNTIME="python3.7"
$ LAMBDA_ZIP_FILE="${LAMBDA_FUNCTION_NAME}.zip"
$ LAMBDA_HANDLER="lambda_function.lambda_handler"
$ LAMBDA_ENV="{LONG_QUERY_TIME=10,chatwork_apikey=XXXXX,chatwork_roomid=XXXXX}"

#Zip the code
$ zip ${LAMBDA_ZIP_FILE} lambda_function.py

#Create a Lambda function
$ aws2 lambda create-function \
    --function-name ${LAMBDA_FUNCTION_NAME} \
    --runtime ${LAMBDA_RUNTIME} \
    --zip-file fileb://${LAMBDA_ZIP_FILE} \
    --handler ${LAMBDA_HANDLER} \
    --environment "Variables=${LAMBDA_ENV}" \
    --role ${IAM_ROLE_ARN}

Set up CloudWatch Logs subscription filters

$ LOG_GROUP_NAME="/aws/rds/instance/[rdsinstance]/slowquery"
$ LOG_FILTER_NAME="LambdaStream_post-slowquery-to-chatwork"
$ LAMBDA_ARN=`aws2 lambda get-function \
    --function-name ${LAMBDA_FUNCTION_NAME} \
    --query 'Configuration.FunctionArn' \
    --output text`
$ LOG_ACTION="lambda:InvokeFunction"
$ LOG_PRINCIPAL="logs.ap-northeast-1.amazonaws.com"
$ SOURCE_ACCOUNT=`aws2 sts get-caller-identity \
    --query 'Account' \
    --output text`
$ SOURCE_ARN="arn:aws:logs:ap-northeast-1:${SOURCE_ACCOUNT}:log-group:${LOG_GROUP_NAME}:*"

#Give CloudWatch Logs access to execute functions
$ aws2 lambda add-permission \
    --function-name ${LAMBDA_FUNCTION_NAME} \
    --statement-id ${LAMBDA_FUNCTION_NAME} \
    --action ${LOG_ACTION} \
    --principal ${LOG_PRINCIPAL} \
    --source-arn ${SOURCE_ARN} \
    --source-account ${SOURCE_ACCOUNT}

#Create a subscription filter with an empty filter pattern ("") so that all events are forwarded
$ aws2 logs put-subscription-filter \
    --log-group-name ${LOG_GROUP_NAME} \
    --filter-name ${LOG_FILTER_NAME} \
    --filter-pattern "" \
    --destination-arn ${LAMBDA_ARN}

This completes the settings.

In closing

This time, I used RDS, Lambda, and CloudWatch Logs to get notified of slow queries. Previously, whenever a slow query occurred, it was a bit of a pain to go to CloudWatch Logs to read the log, so notifying Chatwork should ease that pain. Eventually I would like to make full use of log analysis tools for more detailed visualization, but that will be for another occasion. As for AWS CLI v2, I didn't see many of its benefits within the scope of this work, but I plan to use it actively going forward.
