[PYTHON] How I fixed my script for getting Lambda logs from CloudWatch Logs

It's the 21st day of cloudpack Ara Convenient Calendar 2017!


I had been using a Python script to get Lambda logs from CloudWatch Logs. When I tried to run it again recently it no longer worked properly, so I fixed it up a little.


About half a year ago

Some time ago, I wanted to transfer Lambda logs from CloudWatch Logs to a server (EC2) at work so that I could investigate various things. I wrote a script with reference to the article "Get logs from CloudWatch Logs (I tried to specify the period)" and used it to investigate the system under development. It was very helpful, and I would like to take this opportunity to thank its author.

A problem appears when I use it again after a long time

I had not used the script for a while when a user asked me:

Could you quickly get the logs for this time of day?

Fetching them one by one from the console would be a hassle, so I decided to use the script I had made before and ran it for the first time in a long while... and it never finished. Even when I specified a one-minute window, it did not finish. Why?! I added some logging to see what it was doing...

It was trying to fetch from the streams for the whole period...

The system had been in operation for about three months and had accumulated a fair number of logs, so the script was sweeping through the log streams of the whole period, which took a long time.

The fix

I could not use the script as it was, so I decided to fix it. After researching various things, I finally looked at the SDK documentation and noticed that describe_log_streams, the method that gets the log streams, accepts a log stream name prefix (logStreamNamePrefix) as an argument (see "Boto 3 Documentation - describe_log_streams"). Lambda log stream names start with a date, so you can narrow down the streams by specifying the date here! So I added a date to the run-time arguments (prioritizing development speed), passed it on to describe_log_streams, and ran it... It finished in the blink of an eye (well, not quite that quickly).
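To illustrate the idea, here is a minimal sketch of deriving that date prefix from the start time instead of typing it by hand. It assumes the usual Lambda stream naming of the form YYYY/MM/DD/[version]id; the helper name lambda_stream_prefix is hypothetical and is not part of the script in this post.

```python
from datetime import datetime


def lambda_stream_prefix(start_time_utc):
    """Return the YYYY/MM/DD prefix for the day of start_time_utc.

    start_time_utc uses the same '%Y-%m-%dT%H:%M:%S' format (UTC)
    that the script accepts on the command line.
    """
    parsed = datetime.strptime(start_time_utc, '%Y-%m-%dT%H:%M:%S')
    return parsed.strftime('%Y/%m/%d')


# The result would be passed as logStreamNamePrefix to describe_log_streams.
prefix = lambda_stream_prefix('2017-12-21T09:30:00')
```

Note that a single prefix only covers streams whose names start with that day. A time window that crosses midnight, or events written to a stream that was created on the previous day, would probably need a second lookup with the neighbouring date.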

Source after the fix

getCwLogs.py


#! /usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import os
import sys
import codecs
import boto3
import time
from datetime import datetime
from datetime import timedelta

sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
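# Fall back to an empty name so that validate() can report missing arguments
log_group = sys.argv[1] if len(sys.argv) > 1 else ''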
date = datetime.now().strftime("%Y%m%d-%H%M%S")
#Output directory
log_directory_base ='./cwlogs'
slice = log_group[0:1]
if (slice == '/'):
    log_directory = log_directory_base + "%s/%s" % (log_group, date)
else:
    log_directory = log_directory_base + "/%s/%s" % (log_group, date)

def validate():
    # Script name plus four arguments
    if len(sys.argv) != 5:
        print '[ ERROR ] Wrong number of arguments.'
        print 'Please specify the arguments.'
        print 'First: LogGroupName, Second: StartTime, Third: EndTime, Fourth: LogStreamNamePrefix'
        sys.exit()
    try:
        epoc = datetime(1970, 1, 1)
        global starttime
        starttime  = int((datetime.strptime(sys.argv[2], '%Y-%m-%dT%H:%M:%S') - epoc).total_seconds()) * 1000
        global endtime
        endtime  = int((datetime.strptime(sys.argv[3], '%Y-%m-%dT%H:%M:%S') - epoc).total_seconds()) * 1000
        durationtime = int(endtime - starttime)
        print 'durationtime "%s"' % durationtime
        if 86400000 < durationtime:
            print '[ ERROR ] Too long duration.'
            print 'Max duration is 1 day.'
            sys.exit()
        global log_stream_name_prefix
        log_stream_name_prefix = sys.argv[4]
    except ValueError:
        print '[ ERROR ] Failed StartTime or EndTime format.'
        print 'Date Format is "%Y-%m-%dT%H:%M:%S".'
        sys.exit()

def find_streams(token, log_stream_name_prefix):
    def is_valid_stream(stream):
        return starttime < stream.get('lastIngestionTime')
    try:
        if token is None:
            data =  cwlogs.describe_log_streams(logGroupName = log_group, logStreamNamePrefix = log_stream_name_prefix)
        else:
            data =  cwlogs.describe_log_streams(logGroupName = log_group, logStreamNamePrefix = log_stream_name_prefix, nextToken = token)

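    # boto3 reports a missing log group as ResourceNotFoundException, not ValueError
    except cwlogs.exceptions.ResourceNotFoundException:
        print '[ ERROR ] LogGroupName is invalid.'
        print '"%s" is not found.' % log_group
        sys.exit()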

    streams = filter(is_valid_stream, data['logStreams'])
    if 'nextToken' not in data:
        return streams
    time.sleep(0.5)
    streams.extend(find_streams(data['nextToken'], log_stream_name_prefix))
    return streams

def find_events(token, last_token, stream):
    print 'logGroupName = "%s", logStreamName = "%s", startTime = "%s", endTime = "%s"' % (log_group, stream, starttime, endtime)
    if token is None:
        data = cwlogs.get_log_events(logGroupName = log_group, logStreamName = stream, startTime = starttime, endTime = endtime, startFromHead = True)
    else:
        data = cwlogs.get_log_events(logGroupName = log_group, logStreamName = stream, startTime = starttime, endTime = endtime, startFromHead = True, nextToken = token)

    events = data['events']
    if events:
        write_logs(events, stream)
        del events[:]
    if data['nextForwardToken'] != last_token:
        time.sleep(0.5)
        find_events(data['nextForwardToken'], token, stream)

def write_logs(events, stream):
    streams = stream.split(']')
    if len(streams) > 1:
        #Lambda log
        with codecs.open("%s/%s.log" % (log_directory, streams[1]), "a", "utf-8") as f:
            for e in events:
                f.write("%s\n" % e['message'])
    else:
        #Other logs
        with codecs.open("%s/%s.log" % (log_directory, streams[0]), "a", "utf-8") as f:
            for e in events:
                f.write("%s\n" % e['message'])

def main():
    validate()
    global cwlogs
    cwlogs = boto3.client('logs')
    streams = [stream['logStreamName'] for stream in find_streams(None, log_stream_name_prefix)]
    os.makedirs("%s/" % log_directory)
    print(log_directory)

    for stream in streams:
        find_events(None, None, stream)

if __name__ == '__main__':
    main()
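find_events above follows nextForwardToken recursively, so a very long stream means a very deep call stack. As an alternative, the same paging can be written as a loop. The following is only a sketch under a few assumptions: the name iter_log_events is hypothetical, the client is passed in as an argument instead of using the global cwlogs, and the 0.5 second sleep between calls is left out for brevity. It relies on get_log_events returning the same token that was passed in once the end of the stream is reached.

```python
def iter_log_events(client, log_group, stream, start_ms, end_ms):
    """Yield events from one log stream, following nextForwardToken in a loop."""
    kwargs = dict(
        logGroupName=log_group,
        logStreamName=stream,
        startTime=start_ms,
        endTime=end_ms,
        startFromHead=True,
    )
    token = None
    while True:
        if token is not None:
            kwargs['nextToken'] = token
        data = client.get_log_events(**kwargs)
        for event in data['events']:
            yield event
        next_token = data.get('nextForwardToken')
        # At the end of the stream the API hands back the token we sent.
        if next_token is None or next_token == token:
            break
        token = next_token
```

With a real client this could be used as `for e in iter_log_events(boto3.client('logs'), log_group, stream, starttime, endtime)`, writing each e['message'] to the output file as write_logs does.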


cmd


# For Lambda
python getCwLogs.py <log group name> <start date and time (UTC)> <end date and time (UTC)> <date at the start of the log stream name>
# For anything other than Lambda (e.g. API Gateway)
python getCwLogs.py <log group name> <start date and time (UTC)> <end date and time (UTC)> <log stream name>
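For reference, the start and end arguments are read as UTC in the format %Y-%m-%dT%H:%M:%S and converted to epoch milliseconds, which is the unit the CloudWatch Logs API uses for startTime and endTime. The sketch below repeats the conversion done in validate() as a standalone function; the name to_epoch_millis and the function name in the sample invocation are made up for illustration.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def to_epoch_millis(utc_string):
    """Convert 'YYYY-MM-DDTHH:MM:SS' (interpreted as UTC) to epoch milliseconds."""
    parsed = datetime.strptime(utc_string, '%Y-%m-%dT%H:%M:%S')
    return int((parsed - EPOCH).total_seconds()) * 1000


# Hypothetical invocation (the Lambda function name my-function is made up):
#   python getCwLogs.py /aws/lambda/my-function 2017-12-21T00:00:00 2017-12-21T01:00:00 2017/12/21
start_ms = to_epoch_millis('2017-12-21T00:00:00')  # 1513814400000
end_ms = to_epoch_millis('2017-12-21T01:00:00')    # one hour (3600000 ms) later
```

The script rejects windows longer than one day, that is, an end minus start of more than 86400000 milliseconds.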

Current usage

I use it when I want to get Lambda logs for a certain period and then search them with grep or aggregate them in Excel (or at least, that is the plan).


I wrote this up wondering whether it might be useful to other people as well. In the end, it was a fairly ordinary story.
