[PYTHON] One-shot deployment with CFn of a Lambda that takes manual snapshots of Elasticsearch Service

This article deploys, in one shot with CloudFormation, a mechanism that takes manual snapshots of Amazon Elasticsearch Service (AES). It is also set up to delete old snapshots automatically using Curator. Just taking a snapshot turns out to be surprisingly difficult.

The Python script is based on what is in this AWS documentation.

https://docs.aws.amazon.com/ja_jp/elasticsearch-service/latest/developerguide/curator.html

What we want to do

We will build the setup shown below. The script follows the AWS documentation almost verbatim and is commented, so it should be easy to understand once you read it.

image.png

Need for manual snapshots

In July 2019, automatic snapshots were enhanced as follows, and hourly snapshots are now stored in S3 for 14 days.

https://aws.amazon.com/jp/about-aws/whats-new/2019/07/amazon-elasticsearch-service-increases-data-protection-with-automated-hourly-snapshots-at-no-extra-charge/

This alone is very convenient: even when data is lost because several nodes fail at the same time, the domain can be restored to its state as of about one hour before the failure.

However, automatic snapshots are taken into a managed S3 bucket, so everything is deleted when the domain is deleted.

There are other restrictions too. For example, an automatic snapshot cannot be restored to another domain, which is what you need when you want to duplicate a domain.

I think it is fairly common to want to reuse Kibana definitions and index mappings in a different domain. For that purpose, it is a good idea to take manual snapshots alongside the automatic ones.

There is an official reference, but...

As you would expect from a mature service, there is official documentation on this topic.

https://docs.aws.amazon.com/ja_jp/elasticsearch-service/latest/developerguide/curator.html

However, the sample does not work if you simply copy it into Lambda. The libraries that the Python sample script imports have to be packaged separately so that Lambda can load them.

In addition, taking manual snapshots requires IAM and other settings. The manual below covers them, but if you are not used to it, it is hard to tell what to do.

https://docs.aws.amazon.com/ja_jp/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html

In this article, the goal is to create an environment in which this program (a slightly modified version) works.

Prerequisites

This article assumes the following environment.

Workstation side

At a minimum, a Linux or Mac machine with the following installed: (For reference, I work on WSL2.)

AWS side

- Amazon Elasticsearch Service is deployed with the VPC access type

How to run

Download from GitHub

Clone the following repository from GitHub.

$ git clone git@github.com:yomon8/aes-snapshot-sample.git
$ cd aes-snapshot-sample/

https://github.com/yomon8/aes-snapshot-sample

Configuration file adjustment

Enter the variables in the configuration file.

$ cp settings.sh{.sample,}
$ vim settings.sh

There are the following items, so set them according to your environment.

#Cloudformation Stack name
readonly STACK_NAME=es-snapshot-sample

#Bucket for uploading Cloudformation packages
readonly STACK_S3_BUCKET=bucket-name
readonly STACK_S3_PREFIX=es-snapshot-sample

#AES host name (AES is created on the assumption of VPC access)
readonly AES_HOST=vpc-xxxxxxxxxxxxxxxxxxxxxx.ap-northeast-1.es.amazonaws.com

#Specify Subnet that can access AES
readonly LAMBDA_SUBNET_ID=subnet-xxxxxxxxxxxxxxxxxxxxxx

#Specify SG that can access AES
readonly LAMBDA_SECURITY_GROUP_ID=sg-xxxxxxxxxxxxxxxxxxx

#The name of the AES snapshot storage location
readonly SNAPSHOT_REPOSITORY_NAME=your-repo

#Prefix with the name of the AES snapshot
readonly SNAPSHOT_PREFIX=your-snapshot

CloudFormation deployment

After that, run the deployment script and you are done. The first run takes longer because the software for the Lambda Layer has to be downloaded. In my environment it takes less than 10 minutes.

$ ./deploy.sh
~ Omitted ~

Register Snapshot repository

When you run deploy.sh, the following message is displayed at the end.

1.Execute the following command once at the beginning.
aws lambda invoke --function-name aes-snapshot-sample-AESRegistSnapshotRepositoryFun-1SHIY9NWSSH8B:live /dev/null

As the message says, running the command invokes the Lambda function that registers the repository.

$ aws lambda invoke --function-name aes-snapshot-sample-AESRegistSnapshotRepositoryFun-1SHIY9NWSSH8B:live /dev/null
{
    "StatusCode": 200,
    "ExecutedVersion": "1"
}

Let's check that the repository has been registered. A repository called my-repo appears in the list.

GET _cat/repositories?v
id           type
cs-automated   s3
my-repo        s3
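
As a rough illustration of what repository registration involves: according to the AWS documentation linked earlier, it is a PUT to _snapshot/<repository name> whose body names an S3 bucket, a region, and an IAM role. The sketch below only builds the URL and body. It is not the code of the Lambda in the sample repository, and the bucket name and role ARN are hypothetical placeholders. The real request also has to be signed with SigV4, for which the AWS sample uses requests together with requests-aws4auth.

```python
import json


def build_register_request(host, repository, bucket, region, role_arn):
    """Return (url, body) for registering an S3 snapshot repository.

    Only the request is assembled here; sending it (with SigV4 signing)
    is intentionally left out of this sketch.
    """
    url = f"https://{host}/_snapshot/{repository}"
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": region,
            "role_arn": role_arn,
        },
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_register_request(
        host="vpc-xxxxxxxxxxxxxxxxxxxxxx.ap-northeast-1.es.amazonaws.com",
        repository="my-repo",
        bucket="example-snapshot-bucket",  # hypothetical bucket name
        region="ap-northeast-1",
        role_arn="arn:aws:iam::123456789012:role/ExampleSnapshotRole",  # hypothetical
    )
    print("PUT", url)
    print(body)
```

Keeping the request assembly separate from the signing and sending makes it easy to check the body before anything touches the domain.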

Take a backup

At this point only the repository has been registered, so there are no snapshots yet. Calling the following API should return nothing.

GET _cat/snapshots/my-repo

deploy.sh also prints a second message that explains how to take a snapshot.

2.To take a snapshot manually, run the following command.   
In addition, this Lambda is also set to schedule execution.
aws lambda invoke --function-name aes-snapshot-sample-AESRotateSnapshotFunction-1GB4JKXMNZ8KQ:live /dev/null

Running the command shown there takes a backup.

$ aws lambda invoke --function-name aes-snapshot-sample-AESRotateSnapshotFunction-1GB4JKXMNZ8KQ:live /dev/null
{
    "StatusCode": 200,
    "ExecutedVersion": "1"
}

Listing the snapshots again now shows the new one.

GET _cat/snapshots/my-repo?v
id                            status start_epoch start_time end_epoch  end_time duration indices successful_shards failed_shards total_shards
my-snapshot-2019-11-13-12-35 SUCCESS 1573647456  12:17:36   1573647456 12:17:36    459ms       1                 1             0            1
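
To make the rotation idea concrete, here is a minimal sketch of naming snapshots by timestamp and picking the ones that are past a retention period. The name format is inferred from the output above, and the 14-day retention is a hypothetical value. The sample itself relies on Curator for deletion, so this plain-Python version only illustrates the logic and is not what the Lambda runs.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout inferred from names like "my-snapshot-2019-11-13-12-35".
NAME_FORMAT = "%Y-%m-%d-%H-%M"


def snapshot_name(prefix, now):
    """Build a snapshot name from a prefix and a timestamp."""
    return f"{prefix}-{now.strftime(NAME_FORMAT)}"


def expired_snapshots(names, prefix, now, retention_days):
    """Return the names whose embedded timestamp is older than the cutoff.

    `now` must be a timezone-aware UTC datetime.
    """
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for name in names:
        if not name.startswith(prefix + "-"):
            continue  # not one of our snapshots
        try:
            stamp = datetime.strptime(name[len(prefix) + 1:], NAME_FORMAT)
        except ValueError:
            continue  # suffix is not a timestamp in the expected format
        if stamp.replace(tzinfo=timezone.utc) < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    now = datetime(2019, 11, 13, 12, 35, tzinfo=timezone.utc)
    names = ["my-snapshot-2019-10-01-16-05", "my-snapshot-2019-11-12-16-05"]
    print(snapshot_name("my-snapshot", now))
    print(expired_snapshots(names, "my-snapshot", now, retention_days=14))
```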

You can confirm that the snapshot data has also been stored in S3.

image.png

In addition, the backup is scheduled with a CloudWatch Events rule, so it runs once a day even if you never execute the command above yourself.

image.png

You can adjust the schedule in the following part of template.yaml. Apart from the schedule, the script is used more or less as it came, so please adapt it to your own needs.

      Events:
        Rule:
          Type: Schedule
          Properties:
            Schedule: cron(5 16 * * ? *)
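
One thing to keep in mind when changing this value is that CloudWatch Events evaluates cron expressions in UTC. The following sketch converts a simple daily expression to local time. The function name is hypothetical and it deliberately handles only the "cron(M H * * ? *)" shape used above.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")


def daily_cron_to_local(expression, tz):
    """Convert a daily 'cron(M H * * ? *)' expression to (hour, minute) in tz."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("not a cron() schedule expression")
    minute, hour, *rest = expression[5:-1].split()
    if rest != ["*", "*", "?", "*"]:
        raise ValueError("only simple daily schedules are handled in this sketch")
    # The date is arbitrary; only the time of day matters here.
    utc_time = datetime(2000, 1, 1, int(hour), int(minute), tzinfo=timezone.utc)
    local = utc_time.astimezone(tz)
    return local.hour, local.minute


if __name__ == "__main__":
    hour, minute = daily_cron_to_local("cron(5 16 * * ? *)", JST)
    print(f"{hour:02d}:{minute:02d} JST")
```

With the value in template.yaml, the snapshot runs at 16:05 UTC, which is 01:05 in Japan time.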

What the script does

You can see what is happening by looking at the contents of deploy.sh.

The elasticsearch and Curator libraries are downloaded with Docker, then packaged and uploaded as a Lambda Layer.

#!/bin/bash

#Read configuration file
source ./settings.sh

#Download the required Python library as a Lambda Layer
docker run --rm -v $(pwd)/layer/python:/python python:3.7.5-alpine pip install -t /python requests-aws4auth elasticsearch elasticsearch-curator

#Deploy with Cloudformation
aws cloudformation package --template-file ./template.yaml --output-template-file ./package.yaml \
    --s3-bucket ${STACK_S3_BUCKET} \
    --s3-prefix ${STACK_S3_PREFIX} 
aws cloudformation deploy --template-file ./package.yaml --capabilities CAPABILITY_IAM \
    --stack-name ${STACK_NAME} \
    --parameter-overrides \
    AESHost=${AES_HOST} \
    LambdaSubnetId=${LAMBDA_SUBNET_ID} \
    LambdaSecurityGroupId=${LAMBDA_SECURITY_GROUP_ID} \
    SnapshotRepositoryName=${SNAPSHOT_REPOSITORY_NAME} \
    SnapshotPrefix=${SNAPSHOT_PREFIX} 

#Message display
echo ""
echo "1.Execute the following command once at the beginning."
echo aws lambda invoke --function-name $(aws cloudformation describe-stacks --stack-name ${STACK_NAME} --query 'Stacks[].Outputs[?OutputKey == `RegistSnapshotFunctionName`].OutputValue' --output text):live /dev/null
echo ""
echo "2.To take a snapshot manually, run the following command."
echo "In addition, this Lambda is also set to schedule execution."
echo aws lambda invoke --function-name $(aws cloudformation describe-stacks --stack-name ${STACK_NAME} --query 'Stacks[].Outputs[?OutputKey == `RotateSnapshotFunctionName`].OutputValue' --output text):live /dev/null
echo ""
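
The --query expressions at the end of the script pick one value out of the stack's Outputs. As a sketch of the same lookup in Python, the function below walks a response shaped like the output of describe-stacks. The sample response is trimmed down for illustration and reuses the function name shown earlier in this article. The helper name is hypothetical.

```python
def find_output(describe_stacks_response, output_key):
    """Return the OutputValue for output_key, or None if it is not present."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output.get("OutputValue")
    return None


if __name__ == "__main__":
    # Trimmed-down sample of a describe-stacks response (illustrative only).
    sample = {
        "Stacks": [
            {
                "StackName": "es-snapshot-sample",
                "Outputs": [
                    {
                        "OutputKey": "RotateSnapshotFunctionName",
                        "OutputValue": "aes-snapshot-sample-AESRotateSnapshotFunction-1GB4JKXMNZ8KQ",
                    }
                ],
            }
        ]
    }
    name = find_output(sample, "RotateSnapshotFunctionName")
    print(f"aws lambda invoke --function-name {name}:live /dev/null")
```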

Snapshot restoration

If you register the S3 bucket that holds the snapshots as a repository in the same way, you can restore from it easily by following the procedure linked below.

https://docs.aws.amazon.com/ja_jp/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html#es-managedomains-snapshot-restore

Reference

https://github.com/elastic/curator

[Tip] Restoring indices other than the Kibana index

You will often want to restore everything except the Kibana index. You can do that with the request below. By changing the indices parameter, you can also restore only specific indices.

POST _snapshot/<repository name>/<snapshot name>/_restore
{"indices": "*,-.kibana_1"}
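
If you build this body in a script, a small helper like the following can assemble the indices pattern. This is only a sketch, and the function name and arguments are hypothetical. It joins the include patterns and prefixes each excluded index with a minus sign, which is the form used in the request above.

```python
import json


def restore_body(exclude=(), include=("*",)):
    """Build the JSON body for a _restore request.

    include: index patterns to restore (defaults to everything).
    exclude: index names or patterns to leave out.
    """
    patterns = list(include) + [f"-{name}" for name in exclude]
    return json.dumps({"indices": ",".join(patterns)})


if __name__ == "__main__":
    # Everything except the Kibana index, as in the request above.
    print(restore_body(exclude=[".kibana_1"]))
```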

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

https://docs.aws.amazon.com/ja_jp/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html#es-managedomains-snapshot-restore
