Lambda automatically saves logs to CloudWatch Logs, and CloudWatch Logs can automatically parse, format, and search JSON-formatted logs. So I definitely want to output my logs in JSON format.

However, every article that a Google search for "python lambda logging json" turned up seemed incomplete, so this article introduces the method that I think is best.

Disclaimer

Since the standard library's logging module is the standard for Python log output, this article assumes that logging is used.

For why you shouldn't use print, and for the correct usage of logging, the article "Please stop printing and import logging for log output" is easy to understand (although its tone is harsh).

This is (probably) best practice!

import logging
import json
import os


# At the beginning, write your own formatter definition and root logger settings
class JsonFormatter:
    def format(self, record):
        return json.dumps(vars(record))

# Root logger settings
logging.basicConfig()  # Set a handler that outputs to standard error
logging.getLogger().handlers[0].setFormatter(JsonFormatter())  # Replace the handler's output format with your own

# After that, get a logger as usual and write the function that does the processing
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get('LOG_LEVEL', 'INFO'))  # Aside: it is convenient to make the log level configurable through an environment variable


def lambda_handler(event: dict, context):
    # To add extra information to the log, pass a dict to `extra=`.
    # The dict can of course contain two or more elements
    logger.info("sample", extra={"foo": 12, "bar": "Hello World!"})

Logs stored in CloudWatch Logs: the output of logger.info is JSON. Note that lines such as START RequestId: ... are emitted by AWS Lambda itself and cannot be changed.

START RequestId: 3ba9c9dd-0758-482e-8aa4-f5496fa49f04 Version: $LATEST
{
    "name": "lambda_function",
    "msg": "sample",
    "args": [],
    "levelname": "INFO",
    "levelno": 20,
    "pathname": "/var/task/lambda_function.py",
    "filename": "lambda_function.py",
    "module": "lambda_function",
    "exc_info": null,
    "exc_text": null,
    "stack_info": null,
    "lineno": 23,
    "funcName": "lambda_handler",
    "created": 1577152740.1250498,
    "msecs": 125.04982948303223,
    "relativeCreated": 64.58139419555664,
    "thread": 140315591210816,
    "threadName": "MainThread",
    "processName": "MainProcess",
    "process": 7,
    "foo": 12,
    "bar": "Hello World!",
    "aws_request_id": "3ba9c9dd-0758-482e-8aa4-f5496fa49f04"
}
END RequestId: 3ba9c9dd-0758-482e-8aa4-f5496fa49f04
REPORT RequestId: 3ba9c9dd-0758-482e-8aa4-f5496fa49f04	Duration: 1.76 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 55 MB	Init Duration: 113.06 ms

A brief explanation

(For details, see the official documentation.)

logging.Logger has a hierarchical structure, so to change the format everywhere you only need to change the root logger (logging.getLogger()). logging.basicConfig() sets up a handler on the root logger, and calling .setFormatter() on that handler installs your own formatter.
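The propagation through the hierarchy can be seen in a small local sketch. The logger name myapp.db and the in-memory stream are assumptions made for illustration only; they are not part of the article's Lambda code.

```python
import io
import json
import logging


class JsonFormatter:
    def format(self, record):
        return json.dumps(vars(record))


# Attach a single handler to the root logger. An in-memory stream
# stands in for standard error so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# A child logger (hypothetical name) has no handler of its own,
# so its records propagate up to the root logger's handler.
child = logging.getLogger("myapp.db")
child.info("connected")

parsed = json.loads(stream.getvalue())
print(parsed["name"], parsed["msg"])
```

The child logger never touches a formatter, yet its record comes out as JSON because the only handler lives on the root logger.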

A logging.LogRecord is passed to .format(record). This object carries various information as attributes, so vars(record) retrieves them as a dict, which is then converted to JSON.
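One caveat worth knowing: vars(record) can contain values that json.dumps cannot serialize, for example the exc_info tuple set by logger.exception. The following is a hedged variant, not the article's code; the class name SafeJsonFormatter and the logger name are hypothetical, and passing default=str is just one possible way to handle such values.

```python
import io
import json
import logging


class SafeJsonFormatter(logging.Formatter):
    """Hypothetical variant of JsonFormatter that tolerates exceptions."""

    def format(self, record):
        data = dict(vars(record))  # copy so the record itself is untouched
        if record.exc_info:
            # Replace the (type, value, traceback) tuple with readable text.
            data["exc_info"] = self.formatException(record.exc_info)
        # default=str turns any remaining non-JSON value into its string form.
        return json.dumps(data, default=str)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(SafeJsonFormatter())
logger = logging.getLogger("safe_json_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("division failed")

parsed = json.loads(stream.getvalue())
print(parsed["levelname"], parsed["msg"])
```

The cost is a few more lines than the three-line formatter, so whether it is worth it depends on how often exceptions are logged.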

The values passed to logger.info with `extra=` are set as attributes of the LogRecord. If you use keys that clash with existing attributes, such as msg or funcName, the standard Logger.makeRecord refuses to overwrite them and raises a KeyError. If you really need such keys, you can define your own logger that overrides .makeRecord. However, I'm reluctant to do so because the code gets complicated.
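The behavior of `extra=` can be checked locally with a short sketch. The Capture handler and the logger name extra_demo are hypothetical helpers introduced only for this illustration.

```python
import logging


class Capture(logging.Handler):
    """Hypothetical handler that keeps records in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = Capture()
logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.addHandler(capture)
logger.setLevel(logging.INFO)
logger.propagate = False

# Keys that do not clash simply become attributes of the LogRecord.
logger.info("ok", extra={"foo": 12})
print(capture.records[0].foo)

# A key that clashes with a built-in attribute is rejected by makeRecord.
try:
    logger.info("clash", extra={"msg": "other"})
    collided = False
except KeyError as err:
    collided = True
    print("rejected:", err)
```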

It's easy, isn't it?

Should I use a library?

By the way, there are a number of libraries on GitHub that format logs as JSON.

However, I don't recommend using such a library. Introducing a library creates a dependency, which brings the hassle of version upgrades and a security risk (remember left-pad in 2016!).

As shown above, JsonFormatter can be written in just 3 lines, so I think it's best to simply copy and paste the code.

