Add a new issue to GitHub by email (Amazon SES version)

Introduction

Previously, I built a function that uses IFTTT's Email service [^iftttemail] as the email receiver and calls the GitHub API [^githubapi] via AWS Lambda ("Add a new issue to GitHub by email", items/ff516aa90eb87c5140e7). Being able to create a GitHub issue with a single email when I notice a bug or a possible improvement in my own service has turned out to be very convenient. Since I expect to keep using it, I rebuilt it so that everything, including the email receiver, runs on AWS.

Approach

Amazon SES (Simple Email Service) [^ses] serves as the email receiver. By pointing the mail delivery destination of a domain you manage at the SES receiving endpoint, incoming mail is relayed bucket-brigade style: Amazon SES → Amazon S3 → AWS Lambda. The Lambda function, which adds an issue to a GitHub repository by calling the GitHub API [^githubapi] according to the contents of the email, is implemented with AWS Chalice [^chalice], a Python framework.

The implementation procedure is roughly as follows.

  1. **Domain management**: Set the SES email receiving endpoint as the mail delivery destination (MX record) [^dns]
  2. **S3**: Set up an S3 bucket to store the emails received by SES
  3. **SES**: Create a receipt rule that stores incoming emails in the S3 bucket
  4. **GitHub**: Issue an access token (with the "repo" scope) for the GitHub API
  5. **Lambda**: Implement and deploy a function that reads incoming mail from the S3 bucket and adds an issue to the GitHub repository
  6. **S3**: Configure an event on the S3 bucket so that the deployed Lambda function runs whenever a received email is saved

For steps 1 to 3, the AWS Developer Guide page "[Receiving Email Using Amazon SES - Amazon Simple Email Service](https://docs.aws.amazon.com/ja_jp/ses/latest/DeveloperGuide/receiving-email.html)" and the support article "[Receive and save emails on Amazon S3 using Amazon SES](https://aws.amazon.com/jp/premiumsupport/knowledge-center/ses-receive-inbound-emails/)" cover the details. For step 4, the specific procedure is explained in "How to set GitHub 'Personal access tokens' - Qiita".
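
As a rough illustration of step 3, the receipt rule can also be created with boto3 instead of the console. This is only a sketch under assumptions: the helper names `build_receipt_rule` and `create_receipt_rule`, the rule name, the recipient address, and the `inbox/` prefix are all made up for this example, and it presumes that the rule set already exists and that the bucket policy lets SES write to the bucket.

```python
def build_receipt_rule(rule_name, recipients, bucket, prefix):
    """Build the Rule argument for SES create_receipt_rule (hypothetical helper)."""
    return {
        'Name': rule_name,
        'Enabled': True,
        'Recipients': list(recipients),
        'Actions': [
            # Store each received message in the S3 bucket under the prefix.
            {'S3Action': {'BucketName': bucket, 'ObjectKeyPrefix': prefix}},
        ],
        'ScanEnabled': True,
    }


def create_receipt_rule(rule_set_name, rule):
    """Register the rule in an existing rule set (needs AWS credentials)."""
    import boto3  # imported here so the builder above runs without boto3
    ses = boto3.client('ses')
    ses.create_receipt_rule(RuleSetName=rule_set_name, Rule=rule)


# Illustrative values only; nothing is sent to AWS here.
rule = build_receipt_rule(
    'mail2issue', ['issue@example.com'], 'mybucket-name', 'inbox/')
print(rule['Actions'][0]['S3Action'])
```

Keeping the rule definition in a plain function makes it easy to review before anything is actually registered with SES.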

This article therefore covers only the implementation of steps 5 and 6.

Implementation

What steps 5 and 6 amount to is this: when a new incoming email is saved to the S3 bucket, read it from the bucket and add an issue to a GitHub repository. With Chalice [^chalice], a Python framework for Lambda-based development, this is very easy to implement using a decorator called `on_s3_event`.

Chalice.on_s3_event()

S3 can send a notification to Lambda and other services whenever something changes in a bucket. Using this normally requires two things: configuring an event on the S3 side that sends the notification, and creating a Lambda function that receives it. With Chalice, both are set up almost automatically.
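
To make the notification concrete, here is a minimal sketch of pulling the bucket name and object key out of an S3 event notification by hand, which is roughly the information Chalice exposes as `event.bucket` and `event.key`. The sample payload is trimmed to the fields used here, and `extract_bucket_and_key` is a hypothetical helper name, not part of Chalice.

```python
from urllib.parse import unquote_plus


def extract_bucket_and_key(notification):
    """Return (bucket, key) pairs from an S3 event notification dict."""
    pairs = []
    for record in notification.get('Records', []):
        s3 = record['s3']
        # Object keys arrive URL-encoded in the notification.
        pairs.append((s3['bucket']['name'], unquote_plus(s3['object']['key'])))
    return pairs


# Trimmed sample payload (illustrative values only).
sample = {
    'Records': [{
        'eventName': 'ObjectCreated:Put',
        's3': {
            'bucket': {'name': 'mybucket-name'},
            'object': {'key': 'inbox/mail+001'},
        },
    }]
}
print(extract_bucket_and_key(sample))
```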

The basic code for a Lambda function that receives S3 events in Chalice looks like this [^on_s3_event].

app.py(sample)


from chalice import Chalice

app = Chalice(app_name='s3eventdemo')
app.debug = True

@app.on_s3_event(bucket='mybucket-name',
                 events=['s3:ObjectCreated:*'])
def handle_s3_event(event):
    app.log.debug("Received event for bucket: %s, key: %s",
                  event.bucket, event.key)

Once you define a function with the Chalice.on_s3_event() decorator and write its body, running chalice deploy deploys it to Lambda and automatically sets up the roles and events needed on both the S3 and Lambda sides.

Code

This time, the function decorated with Chalice.on_s3_event() reads the received email from the S3 bucket [^email] and adds an issue to the GitHub repository. The main Chalice code, `app.py`, is as follows.

app.py


from chalice import Chalice
import logging, os, json, re
import boto3
from botocore.exceptions import ClientError
import email
from email.header import decode_header
from email.utils import parsedate_to_datetime
import urllib.request


# setup chalice
app = Chalice(app_name='mail2issue')
app.debug = False

# setup logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logformat = (
    '[%(levelname)s] %(asctime)s.%(msecs)03dZ (%(aws_request_id)s) '
    '%(filename)s:%(funcName)s[%(lineno)d] %(message)s'
)
formatter = logging.Formatter(logformat, '%Y-%m-%dT%H:%M:%S')
for handler in logger.handlers:
    handler.setFormatter(formatter)


# on_s3_event
@app.on_s3_event(
    os.environ.get('BUCKET_NAME'),
    events = ['s3:ObjectCreated:*'],
    prefix = os.environ.get('BUCKET_KEY_PREFIX')
)
def receive_mail(event):
    logger.info('received key: {}'.format(event.key))

    # read S3 object (email message)
    obj = getS3Object(os.environ.get('BUCKET_NAME'), event.key)
    if obj is None:
        logger.warning('object not found!')
        return

    # read S3 object (config)
    config = getS3Object(os.environ.get('BUCKET_NAME'), 'mail2issue-config.json')
    if config is None:
        logger.warning('mail2issue-config.json not found!')
        return
    settings = json.loads(config)

    # parse the email
    msg = email.message_from_bytes(obj)
    msg_from = get_header(msg, 'From')
    msg_subject = get_header(msg, 'Subject')
    msg_content = get_content(msg)

    # extract email addresses from the From header
    pattern = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
    adds = re.findall(pattern, msg_from)
    # pick the settings entry that matches one of the addresses
    config = None
    for add in settings:
        if add in adds:
            config = settings[add]
            break
    if config is None:
        logger.info('there is no config for {}'.format(', '.join(adds)))
        return

    # get the list of repositories
    repos = getRepositories(config['GITHUB_ACCESS_TOKEN'])
    logger.info('repositories: {}'.format(repos))

    # determine the repository from the email subject
    repo = config['GITHUB_DEFAULT_REPOSITORY']
    title = msg_subject
    spaceIdx = msg_subject.find(' ')
    if spaceIdx > 0:
        repo_tmp = msg_subject[0:spaceIdx]
        if repo_tmp in repos:
            title = msg_subject[spaceIdx+1:]
            repo = repo_tmp
    title = title.strip()
    logger.info("repository: '{}'".format(repo))
    logger.info("title: '{}'".format(title))

    # POST the issue
    postIssue(
        config['GITHUB_ACCESS_TOKEN'],
        config['GITHUB_OWNER'],
        repo, title, msg_content
    )

    # delete the email
    deleteS3Object(os.environ.get('BUCKET_NAME'), event.key)


# get an object from S3
def getS3Object(bucket, key):
    ret = None
    s3obj = None
    try:
        s3 = boto3.client('s3')
        s3obj = s3.get_object(
            Bucket = bucket,
            Key = key
        )
    except ClientError as e:
        logger.warning('S3 ClientError: {}'.format(e))
    if s3obj is not None:
        ret = s3obj['Body'].read()
    return ret

# delete an S3 object
def deleteS3Object(bucket, key):
    try:
        s3 = boto3.client('s3')
        s3.delete_object(
            Bucket = bucket,
            Key = key
        )
    except ClientError as e:
        logger.warning('S3 ClientError: {}'.format(e))


# get an email header, decoding any encoded words
def get_header(msg, name):
    header = ''
    if msg[name]:
        for tup in decode_header(str(msg[name])):
            if type(tup[0]) is bytes:
                charset = tup[1]
                if charset:
                    header += tup[0].decode(tup[1])
                else:
                    header += tup[0].decode()
            elif type(tup[0]) is str:
                header += tup[0]
    return header

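# get the email body as text
def get_content(msg):
    # multipart mail: use the first text/plain part as the body
    if msg.is_multipart():
        parts = [p for p in msg.walk() if p.get_content_type() == 'text/plain']
        if not parts:
            return ""
        msg = parts[0]
    charset = msg.get_content_charset() or 'utf-8'
    payload = msg.get_payload(decode=True)
    if not payload:
        return ""
    try:
        return payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # unknown or wrong charset: fall back so the result is still a str
        # (returning raw bytes would break json.dumps in postIssue)
        return payload.decode('utf-8', errors='replace')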


# get the list of GitHub repositories
def getRepositories(token):
    req = urllib.request.Request(
        'https://api.github.com/user/repos',
        method = 'GET',
        headers = {
            'Authorization': 'token {}'.format(token)
        }
    )
    repos = []
    try:
        with urllib.request.urlopen(req) as res:
            for repo in json.loads(res.read().decode('utf-8')):
                repos.append(repo['name'])
    except Exception as e:
        logger.exception("urlopen error: %s", e)
    return set(repos)

# add an issue to a GitHub repository
def postIssue(token, owner, repository, title, content):
    req = urllib.request.Request(
        'https://api.github.com/repos/{}/{}/issues'.format(owner, repository),
        method = 'POST',
        headers = {
            'Content-Type': 'application/json',
            'Authorization': 'token {}'.format(token)
        },
        data = json.dumps({
            'title': title,
            'body': content
        }).encode('utf-8'),
    )
    try:
        with urllib.request.urlopen(req) as res:
            logger.info(res.read().decode("utf-8"))
    except Exception as e:
        logger.exception("urlopen error: %s", e)
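
The subject handling in receive_mail follows a simple convention: if the first word of the subject is the name of one of your repositories, the issue goes to that repository and the rest of the subject becomes the title; otherwise the default repository is used. The sketch below isolates that logic as a pure function so it can be tried without AWS or GitHub. The function name `split_repo_and_title` and the sample repository names are made up for this illustration.

```python
def split_repo_and_title(subject, repos, default_repo):
    """Return (repository, title) for an email subject.

    Mirrors the logic in receive_mail: the first space-separated word is
    treated as a repository name only if it appears in `repos`.
    """
    idx = subject.find(' ')
    if idx > 0 and subject[:idx] in repos:
        return subject[:idx], subject[idx + 1:].strip()
    return default_repo, subject.strip()


repos = {'myapp', 'blog'}  # hypothetical repository names
print(split_repo_and_title('myapp Fix login bug', repos, 'inbox'))
print(split_repo_and_title('Fix login bug', repos, 'inbox'))
```

The first call routes the issue to `myapp` with the title `Fix login bug`; the second keeps the whole subject as the title and falls back to the default repository.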

The following configuration file is read from S3 so that the GitHub API access token can be switched according to the sender's email address. The function looks for it in the same bucket under the key mail2issue-config.json.

mail2issue-config.json


{
    "<Sender email address>": {
        "GITHUB_OWNER": "<GitHub username>",
        "GITHUB_ACCESS_TOKEN": "<GitHub access token>",
        "GITHUB_DEFAULT_REPOSITORY": "<Repository name if not specified in the email title>"
    },
    ...
}
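
As a small illustration of how this file is used, the sketch below looks up the settings entry for a sender. Note that it swaps in `email.utils.getaddresses` from the standard library in place of the regular expression used in app.py, and that `find_sender_config` and the sample values are hypothetical names chosen for this example.

```python
from email.utils import getaddresses


def find_sender_config(settings, from_header):
    """Return the settings entry for the sender, or None if there is none."""
    # getaddresses splits a header such as 'Alice <alice@example.com>'
    # into (display name, address) pairs.
    for _name, address in getaddresses([from_header]):
        if address in settings:
            return settings[address]
    return None


# Illustrative settings in the same shape as mail2issue-config.json.
settings = {
    'alice@example.com': {
        'GITHUB_OWNER': 'alice',
        'GITHUB_ACCESS_TOKEN': 'dummy-token',
        'GITHUB_DEFAULT_REPOSITORY': 'inbox',
    }
}
print(find_sender_config(settings, 'Alice <alice@example.com>'))
print(find_sender_config(settings, 'bob@example.org'))
```

Mail from an address that has no entry yields None, which corresponds to the early return in receive_mail, so unknown senders cannot create issues.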

Conclusion

I happened to work with Amazon SES for another purpose, and learning that emails can be received on AWS is what prompted this rework. There are still many services that could be triggered by email, so I will keep looking for ways to apply this pattern.
