About CI/CD in a Chalice x CircleCI environment

Preface

https://chalice.readthedocs.io/en/latest/

AWS Chalice is a framework that lets you build a serverless environment as easily as Heroku by combining AWS Lambda and API Gateway. Currently Python is the only supported development language (whether others will be added is unknown), but if you accept that restriction, it provides an environment that is easier to develop in than other serverless frameworks. In my own environment I use Chalice for a fairly wide range of work: simple API creation, batch processing, cron jobs, and so on.

I won't explain what you can do with Chalice in this article.

Since @studio3104 has already written an article about testing Chalice apps (https://qiita.com/studio3104/items/8a6b7e5f696e8453d97a), I would like to write about what a CI/CD environment for Chalice development can look like, based on my own setup. The CI service used here is CircleCI.

There are several possible themes, but this article covers the following:

- Monorepo support (per-project diff detection)
- Unit tests / code coverage
- Lint / code smells
- Chalice config file validation
- Vulnerability diagnosis
- Auto deploy

Chalice follows a style where you create a project scaffold with chalice new-project sada for each function you want to build, and develop inside it. Creating a separate GitHub repository for every project quickly becomes fragmented, so I want a monorepo configuration. However, if you simply put multiple projects in one repository, modifying one project tends to trigger tests, verification, and deployment for all of them during CI/CD, which slows everything down.
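To make this concrete, such a monorepo might be laid out roughly as follows (the project names are hypothetical; app.py, requirements.txt, and .chalice/config.json are what chalice new-project generates):

```
repo/
├── tools/              # shared CI helper scripts
├── project-a/
│   ├── app.py
│   ├── requirements.txt
│   └── .chalice/
│       └── config.json
└── project-b/
    ├── app.py
    ├── requirements.txt
    └── .chalice/
        └── config.json
```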

So the basic strategy is to run CI/CD only for the projects that have changed in the GitHub repository.

In realizing this, I referred to the following blog post, which shows a script that compares against the HEAD on GitHub to determine whether there are any differences. https://blog.hatappi.me/entry/2018/10/08/232625

$ cat tools/is_changed 
#!/bin/bash

# On master, compare the merge commit with its parent commit;
# on any other branch, compare against origin/master.
if [ "${CIRCLE_BRANCH}" = "master" ]; then
  DIFF_TARGET="HEAD^ HEAD"
else
  DIFF_TARGET="origin/master"
fi

# List the changed files relative to the directory given as $1.
DIFF_FILES=(`git diff ${DIFF_TARGET} --name-only --relative=${1}`)

# Exit 0 (changed) if any files differ, 1 (unchanged) otherwise.
if [ ${#DIFF_FILES[@]} -eq 0 ]; then
  exit 1
else
  exit 0
fi

By preparing a script like this and passing it the Chalice project you want to check during CI/CD, you can detect differences per project, skip the projects with no changes, and run CI only for the ones that changed.

$ cat tools/echo_changed 
#!/bin/bash

# Report, for each project, whether it differs from the diff target.
for dir in ${PROJECTS}; do
  if ${WORKDIR}/is_changed "${dir}"; then
    echo "changed."
  else
    echo "nothing change."
  fi
done
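These snippets rely on two environment variables that are not shown above: PROJECTS (a space-separated list of project directories) and WORKDIR (the directory holding the helper scripts). A minimal sketch of how they might be set; the values here are assumptions for illustration:

```shell
#!/bin/bash
# Hypothetical wiring for the helper scripts above.
# WORKDIR: absolute path of the tools/ directory (where this file lives).
# PROJECTS: space-separated list of Chalice project directories.
export WORKDIR="$(cd "$(dirname "$0")" && pwd)"
export PROJECTS="project-a project-b project-c"

# Iterating the list the same way the scripts do:
for dir in ${PROJECTS}; do
  echo "would check: ${dir}"
done
```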

unit test / code coverage

As for how to write the tests themselves, @studio3104 has already described this in detail, so I will omit it here.

In my environment I use a combination of coverage and pytest: the unit tests are run with pytest, and coverage measures how much test coverage they achieve.

$ coverage run -m pytest
$ coverage report
Name                Stmts   Miss  Cover
---------------------------------------
app.py                 35     22    37%
tests/__init__.py       0      0   100%
tests/conftest.py       4      0   100%
tests/test_app.py       5      0   100%
---------------------------------------
TOTAL                  44     22    50%

Running coverage in an environment with unit tests produces a report like the one above. You can also save the HTML report as a CircleCI artifact and browse it from the CI results screen.

    - run:
        name: Run Tests
        command: |
          $HOME/.local/bin/coverage run -m pytest
          $HOME/.local/bin/coverage report
          $HOME/.local/bin/coverage html  # open htmlcov/index.html in a browser
    - store_artifacts:
        path: htmlcov

Please refer to the following official document for detailed settings.

https://circleci.com/docs/2.0/code-coverage/#python

Also, if you want to fail the build when coverage drops below a certain rate, you can run coverage report --fail-under=[num], which returns exit code 2 when coverage is below that number. This means you can stop CI whenever the threshold is not met.
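Since it is the exit status that matters, the gating can be expressed directly in shell. A minimal sketch using a stand-in function in place of coverage report --fail-under (which exits with status 2 below the threshold); the function and its arguments are purely illustrative:

```shell
#!/bin/bash
# Stand-in for `coverage report --fail-under=N`: exits 2 when the
# measured coverage is below the threshold, 0 otherwise.
check_coverage() {
  local measured=$1 threshold=$2
  if [ "$measured" -lt "$threshold" ]; then return 2; else return 0; fi
}

r=0
check_coverage 50 80 || r=$?
if [ "$r" -ne 0 ]; then
  echo "coverage below threshold (exit code ${r}) -> stop CI"
fi
```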

Written as a shell script, this part looks like the following. For every project in the monorepo that has changes, unit tests and coverage measurement are run; if a test fails, or if even one project's coverage is below the threshold, CI stops.

$ cat tools/coverage 
#!/bin/bash

ret=0
for dir in ${PROJECTS}; do
  if ${WORKDIR}/is_changed "${dir}"; then
    cd ${WORKDIR}/../${dir}/
    sudo pip install -r requirements.txt
    coverage run -m pytest
    t=$?
    # Exits with code 2 when coverage is below the threshold.
    coverage report --fail-under=${THRESHOLD}
    r=$?
    coverage html -d ../htmlcov/${dir}
    # Remember any failure, but keep processing the remaining projects.
    if [ $t != 0 ] || [ $r != 0 ]; then ret=1; fi
  else
    echo "nothing change."
  fi
done

exit $ret

lint / code smells

In addition to tests and coverage measurement, heuristically detecting **"bad writing"** with lint and code-smell tools is also common in CI/CD flows.

I won't go into detail about these either; in my environment I use a combination of pep8 (pycodestyle) and pyflakes. Both tools return exit code 1 when they find an issue, so handle that in CI as needed.

The following is sample output from pyflakes; in this case it points out that the local variable sada is assigned but never used.

$ pyflakes app.py 
app.py:32: local variable 'sada' is assigned to but never used

However, running either tool with its default settings produces quite detailed complaints, so in practice it is realistic to write a configuration file suited to your environment, ignoring certain rules or spelling out the rules you actually want.
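For example, a configuration file like the following could relax the rules. The tools/lint script passes --config ${WORKDIR}/../.config/pep8; the contents here are just an illustration, not my actual settings:

```ini
; Hypothetical contents of .config/pep8, the file passed via --config
; in tools/lint. Raise the line-length limit and skip selected checks.
[pep8]
max-line-length = 120
ignore = E402,W503
```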

Written as a shell script, this part looks like the following.

$ cat tools/lint 
#!/bin/bash

ret=0
for dir in ${PROJECTS}; do
  if ${WORKDIR}/is_changed "${dir}"; then
    cd ${WORKDIR}/../${dir}/
    sudo pip install -r requirements.txt
    pep8 ${WORKDIR}/../${dir}/*.py --config ${WORKDIR}/../.config/pep8
    p=$?
    pyflakes ${WORKDIR}/../${dir}/*.py
    f=$?
    # Remember any failure, but keep checking the remaining projects.
    if [ $p != 0 ] || [ $f != 0 ]; then ret=1; fi
  else
    echo "nothing change."
  fi
done

exit $ret

chalice config file validation

With the above, I think general CI is covered, but chalice also requires per-stage configuration files **(config.json, deployed/)**, and I want to verify the validity of those files as well.

The easiest way is to simply run the chalice package command during CI/CD and check whether chalice can package the project properly; that in itself verifies the validity of the configuration files.

$ cat tools/package 
#!/bin/bash

ret=0
for dir in ${PROJECTS}; do
  if ${WORKDIR}/is_changed "${dir}"; then
    cd ${WORKDIR}/../${dir}/
    sudo pip install -r requirements.txt
    # Packaging fails if the stage configuration is invalid.
    chalice package /tmp/${CIRCLE_SHA1}
    r=$?
    if [ $r != 0 ]; then ret=$r; fi
  else
    echo "nothing change."
  fi
done

exit $ret
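As a lighter complementary check, you can at least verify that each stage configuration file is syntactically valid JSON without invoking chalice at all. A sketch; the sample file written here is hypothetical:

```shell
#!/bin/bash
# Cheap pre-check: fail before the heavier `chalice package` step if
# .chalice/config.json is not even valid JSON.
cat > /tmp/config.json <<'EOF'
{"version": "2.0", "app_name": "demo", "stages": {"dev": {"api_gateway_stage": "api"}}}
EOF

if python3 -m json.tool /tmp/config.json > /dev/null; then
  echo "config.json: valid JSON"
else
  echo "config.json: broken JSON" >&2
  exit 1
fi
```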

Vulnerability diagnosis

Application security is a sensitive topic these days, and I want to keep questionable code out of the application as much as possible. So I would like to check at every CI run whether any of the libraries in use contain known vulnerabilities.

There are several solutions, but here I use **snyk** (https://snyk.io/), a service from a security startup.

snyk is a SaaS that performs vulnerability checks on your source code's dependencies. It supports various languages, and of course the Python that Chalice uses is among them. Essentially, it reads requirements.txt to determine which libraries you use and checks whether those library versions appear in its vulnerability database.

Because it integrates easily with GitHub, it is simple to incorporate into the CI flow. It is unlimited for public OSS, and private repositories can use it free for up to 200 tests. https://snyk.io/plans/

Rather than writing a job definition in circle.yml, snyk hooks into GitHub directly. I will leave the detailed setup to the official documentation. https://snyk.io/docs/github/

auto deploy

Finally, the CD part of CI/CD. Since chalice deploys with a single command (chalice deploy), it is easy to automatically deploy source that has passed CI without problems.

The following shell script runs chalice deploy to bring only the changed projects up to the latest state.

$ cat tools/deploy 
#!/bin/bash

for dir in ${PROJECTS}; do
  echo "${WORKDIR}/../${dir}/"
  if ${WORKDIR}/is_changed "${dir}"; then
    cd ${WORKDIR}/../${dir}/
    sudo pip install -r requirements.txt
    # Credentials and target stage come from CircleCI environment variables.
    AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} chalice deploy --stage ${STAGE}
  else
    echo "nothing change."
  fi
done

I think opinions differ on whether a CI tool should auto-deploy to production. (In my environment I do not auto-deploy to production; usually only to the verification environment.)
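One common guard is to let CircleCI run the deploy job only on master while tests run on every branch. A sketch of a CircleCI 2.0 workflow to that effect (the job names are hypothetical):

```yaml
workflows:
  version: 2
  test_and_deploy:
    jobs:
      - test
      - deploy:
          requires:
            - test
          filters:
            branches:
              only: master
```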

In place of a summary

AWS Chalice is a framework that, while limited to Python, lets you write serverless processing as easily as on Heroku, and it has the advantage that Lambda handlers can be written like a normal application. Thanks to that, even a serverless environment like Lambda can easily be folded into a CI/CD flow as described above.

So let's all use Chalice too. Happy Chalice Life and Serverless Life!!
