[PYTHON] Taking a snapshot of Amazon Elasticsearch Service and restoring it

When I took a snapshot of Amazon Elasticsearch Service (hereinafter "es"), I got stuck on permissions, so I am summarizing what I did here. My environment is a Mac.

[Background]

- How do you take a snapshot of es manually?
- The permissions look a little complicated, so I wanted to try them once.
- I wanted to measure how long taking and restoring a snapshot takes.

With those questions in mind, I got hands-on.

[What I actually did]

- Referred to the official documentation
- Referred to "[Amazon Elasticsearch Service] How to restore from manual snapshot" (I relied heavily on this article)
- Granted permissions in Kibana

Following these steps got it working!

Manually take a snapshot to S3

To take a manual snapshot and restore it, the following steps are needed.

- Create an S3 bucket
- Organize IAM roles and policies (← this is where I struggled)
- Authenticate in Kibana
- Register a repository for manual snapshots
- Take a manual snapshot
- Check the repository the snapshot was taken into
- Check the snapshot status
- Restore the snapshot
  - Delete the index once
  - Restore the specified index

The scripts here are written in Python, but they can also be written in Java, Ruby, Node, or Go. See the official documentation for details.

I will go through the steps in order.

Creating an S3 bucket

Create a bucket to store the snapshots. This time I created arn:aws:s3:::test-es-snapshot in the S3 console. The bucket is needed in the following two places:

- In the Resource element of the IAM policy attached to the IAM role
- In the Python payload used to register the snapshot repository
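
The console step could also be scripted. The following is a minimal sketch, not what was done in this article, assuming boto3 is installed and the credentials are allowed to create buckets. The helper names bucket_create_args and create_snapshot_bucket are made up for illustration. The one subtlety is that us-east-1 must be left out of the location constraint.

```python
def bucket_create_args(bucket, region):
    """Build keyword arguments for the S3 create_bucket call.

    us-east-1 is the default location and must not be passed as a
    LocationConstraint; any other region has to be stated explicitly.
    """
    args = {"Bucket": bucket}
    if region != "us-east-1":
        args["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return args


def create_snapshot_bucket(bucket, region):
    """Create the bucket (needs boto3 and valid AWS credentials)."""
    import boto3  # imported lazily so the helper above works without it

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**bucket_create_args(bucket, region))
```

Calling create_snapshot_bucket("test-es-snapshot", "us-west-1") would create the bucket used in this article; the region is only an example.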

Organize IAM roles and policies

There are three things to set up:

1. Create an ES service role (arn:aws:iam::<account ID>:role/es_s3_role).

Edit the trust relationship as follows. The Principal specifies who is allowed to assume this role. Here the Service is es.amazonaws.com, which means the es service is allowed to assume the service role created above.

Trust relationship of the service role


{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "",
    "Effect": "Allow",
    "Principal": {
      "Service": "es.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
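
The same trust relationship can be produced from Python, which helps if the role is created by a script rather than in the console. This is a minimal sketch under that assumption; es_trust_policy and create_service_role are hypothetical helper names, and the boto3 call requires IAM permissions that this article does not cover.

```python
import json


def es_trust_policy():
    """Trust policy that lets the es service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "es.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def create_service_role(role_name):
    """Create the role with boto3 (needs boto3 and IAM permissions)."""
    import boto3  # imported lazily; only needed for the real call

    iam = boto3.client("iam")
    return iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(es_trust_policy()),
    )
```

create_service_role("es_s3_role") would create the role named in step 1.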

2. Attach a policy for reading and writing S3 data to the service role

Create a managed policy named create-es-backup-policy with the following contents and attach it to the service role created above.

create-es-backup-policy (attach to the service role created above)


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::test-es-snapshot"
      ]
    }, 
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::test-es-snapshot/*"
      ]
    }
  ]
}
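
Note the two different Resource forms in this policy: s3:ListBucket applies to the bucket itself, while the object actions apply to everything under it (the /* suffix). Mixing these up is an easy mistake, so here is a small sketch that derives both ARNs from the bucket name. The function name snapshot_bucket_policy is made up for illustration.

```python
import json


def snapshot_bucket_policy(bucket):
    """Build the S3 access policy for the snapshot bucket."""
    bucket_arn = "arn:aws:s3:::" + bucket
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [bucket_arn],
            },
            {
                # Reading, writing and deleting are object-level actions.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }


print(json.dumps(snapshot_bucket_policy("test-es-snapshot"), indent=2))
```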

3. Allow the IAM role on the Kibana side

In Kibana, select **Security**, **Role Mappings**, then **Add**. For **Role**, select **manage_snapshots**. Then specify the ARN of your IAM user or IAM role in the appropriate field: a user ARN goes in the **Users** section, and a role ARN goes in the **Backend roles** section. This grants that user or role permission to work with snapshots.

This time I entered the ARN of the service role created above under Backend roles on Kibana's Role Mappings page. (It should be possible to set this up through IAM policies alone, but I got stuck on an error, so I worked around it this way.)
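
For reference, the IAM-policy route mentioned in parentheses above roughly amounts to letting the identity that signs the requests pass the service role to es and send HTTP PUT requests to the domain. The sketch below only builds such a policy document. It is not the route used in this article, the account ID, region and domain name are placeholder assumptions, and on a domain with fine-grained access control the Kibana role mapping may still be required in addition.

```python
import json


def caller_policy(account_id, region, domain, role_name):
    """Policy for the IAM identity that runs the snapshot scripts."""
    role_arn = "arn:aws:iam::{}:role/{}".format(account_id, role_name)
    domain_arn = "arn:aws:es:{}:{}:domain/{}/*".format(region, account_id, domain)
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Allow handing the service role over to es when registering the repository.
            {"Effect": "Allow", "Action": "iam:PassRole", "Resource": role_arn},
            # Allow signed HTTP PUT requests (repository registration, snapshot creation).
            {"Effect": "Allow", "Action": "es:ESHttpPut", "Resource": domain_arn},
        ],
    }


# 123456789012 and my-domain are placeholders, not values from this article.
print(json.dumps(caller_policy("123456789012", "us-west-1", "my-domain", "es_s3_role"), indent=2))
```

The later scripts also send GET, POST and DELETE requests, which would need es:ESHttpGet, es:ESHttpPost and es:ESHttpDelete in the same way.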

This completes the IAM role setup.

Register a repository for manual snapshots

Run the following Python script. (The reference article authenticated with an access key, but access key authentication is no longer officially recommended, so the scripts here take their credentials from the boto3 session instead.)

register-repositry.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

# Register repository
path = '_snapshot/test-es-snapshot' # the Elasticsearch API endpoint
url = host + path
payload = {
  "type": "s3",
  "settings": {
    "bucket": "Bucket name", # e.g. test-es-snapshot
    "region": "Region name",
    "role_arn": "ARN of the service role created above" # the full ARN, not just the role name
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)
print(r.text)

Confirm that the response is as follows

200
{"acknowledged":true}
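
Two details of the payload are easy to get wrong: role_arn has to be the full ARN rather than the role name, and, as far as I know from the AWS documentation, a bucket in us-east-1 is specified with an endpoint setting instead of a region setting. The sketch below builds the payload with both points handled. The function name repository_payload is made up, and the us-east-1 handling is an assumption that was not tested in this article.

```python
def repository_payload(bucket, region, role_arn):
    """Build the body for PUT _snapshot/<repository name>."""
    if not (role_arn.startswith("arn:") and ":role/" in role_arn):
        raise ValueError("role_arn must be a full IAM role ARN, not a role name")
    settings = {"bucket": bucket, "role_arn": role_arn}
    if region == "us-east-1":
        # Assumption based on the AWS documentation: us-east-1 buckets
        # take an endpoint instead of a region.
        settings["endpoint"] = "s3.amazonaws.com"
    else:
        settings["region"] = region
    return {"type": "s3", "settings": settings}
```

In register-repositry.py this would replace the hand-written dictionary, for example payload = repository_payload("test-es-snapshot", region, role_arn).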

Take a manual snapshot

snapshot.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

path = '_snapshot/test-es-snapshot/my-snapshot-1' # the Elasticsearch API endpoint
url = host + path

r = requests.put(url, auth=awsauth)

print(r.text)

Execution example


-> % python snapshot.py
Enter MFA code for :
{"accepted":true}
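
The response {"accepted":true} only means that the snapshot has started; it runs in the background, so its progress has to be checked separately. Also, the snapshot name my-snapshot-1 is hard-coded in the path. When snapshots are taken repeatedly, a dated name may be easier to manage. A minimal sketch, with the helper name snapshot_path and the naming scheme being my own assumptions:

```python
from datetime import date


def snapshot_path(repository, prefix, day):
    """Build the API path for a snapshot named <prefix>-<YYYY-MM-DD>.

    The name is lower-cased because Elasticsearch rejects snapshot
    names that contain upper-case letters.
    """
    name = "{}-{}".format(prefix, day.isoformat()).lower()
    return "_snapshot/{}/{}".format(repository, name)


print(snapshot_path("test-es-snapshot", "my-snapshot", date.today()))
```

In snapshot.py the result would be assigned to path in place of the fixed string.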

Check the repository from which the snapshot was taken

check_repository.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

path = '_snapshot/?pretty'
url = host + path

r = requests.get(url, auth=awsauth)

print(r.text)

Execution example


-> % python check_repository.py
Enter MFA code for :
{
  "cs-automated-enc" : {
    "type" : "s3"
  },
  "test-es-snapshot" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "test-es-snapshot",
      "region" : "Region name",
      "role_arn" : "Service role name created"
    }
  }
}
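
The output lists the repository for automated snapshots (cs-automated-enc here) alongside the one registered by hand. If a script needs only the manual ones, the response can be filtered. This sketch assumes that automated repositories always start with "cs-", which is inferred from the output above rather than confirmed; manual_repositories is a made-up helper name.

```python
import json


def manual_repositories(response_text, automated_prefix="cs-"):
    """Return the names of repositories that were registered by hand."""
    repositories = json.loads(response_text)
    return sorted(
        name for name in repositories
        if not name.startswith(automated_prefix)
    )


sample = '''{
  "cs-automated-enc": {"type": "s3"},
  "test-es-snapshot": {"type": "s3", "settings": {"bucket": "test-es-snapshot"}}
}'''
print(manual_repositories(sample))
```

In check_repository.py, r.text would be passed in place of the sample string.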

Check the snapshot acquisition status

check_snapshot.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

path = '_snapshot/test-es-snapshot/_all?pretty'
url = host + path

r = requests.get(url, auth=awsauth)

print(r.text)

Execution example


-> % python check_snapshot.py
Enter MFA code for :
{
  "snapshots" : [ {
    "snapshot" : "my-snapshot-1",
    "uuid" : "*************",
    "version_id" : ******,
    "version" : "7.8.0",
    "indices" : [ "index name ①", "index name ②",・ ・ ・ ・ ・ ・],
    "include_global_state" : true,
    "state" : "SUCCESS",
    "start_time" : "2020-11-17T07:20:17.265Z",
    "start_time_in_millis" : 1605597617265,
    "end_time" : "2020-11-17T07:21:21.901Z",
    "end_time_in_millis" : 1605597681901,
    "duration_in_millis" : 64636,
    "failures" : [ ],
    "shards" : {
      "total" : 38,
      "failed" : 0,
      "successful" : 38
    }
  } ]
}
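
One of the goals was to measure how long a snapshot takes. The response already contains that: duration_in_millis is 64636 here, so this snapshot of 38 shards took about 65 seconds. A small sketch that pulls out the state, duration and failed shard count for each snapshot (summarize_snapshots is a made-up helper name, and the sample is trimmed to the fields it reads):

```python
import json


def summarize_snapshots(response_text):
    """Return (name, state, seconds, failed_shards) for each snapshot."""
    summary = []
    for snap in json.loads(response_text)["snapshots"]:
        summary.append((
            snap["snapshot"],
            snap["state"],
            # A snapshot that is still running may not report a duration yet.
            snap.get("duration_in_millis", 0) / 1000.0,
            snap["shards"]["failed"],
        ))
    return summary


sample = '''{"snapshots": [{"snapshot": "my-snapshot-1", "state": "SUCCESS",
  "duration_in_millis": 64636,
  "shards": {"total": 38, "failed": 0, "successful": 38}}]}'''
for name, state, seconds, failed in summarize_snapshots(sample):
    print("{}: {} in {:.1f} s, {} failed shards".format(name, state, seconds, failed))
```

In check_snapshot.py, r.text would be passed in place of the sample string.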

Restore snapshot

If an index with the same name already exists, the restore fails, so the index you want to restore has to be deleted first. The procedure for deleting the specified index and then restoring it is shown below.

Delete the index once

delete_index.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

# DELETE INDEX
path = 'Index name'
url = host + path

r = requests.delete(url, auth=awsauth)

print(r.text)

Execution example


-> % python delete_index.py
Enter MFA code for :
{"acknowledged":true}

In the es console, select the domain and open the Index tab to confirm that the index has been deleted. The dashboard also shows that the number of searchable documents has decreased.
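
Because the delete cannot be undone, it may be worth confirming before running delete_index.py that the snapshot really contains the index. The sketch below checks the response of check_snapshot.py for that; snapshot_contains is a made-up helper name and the index names in the sample are placeholders.

```python
import json


def snapshot_contains(response_text, snapshot, index):
    """True if a successful snapshot with that name lists the index."""
    for snap in json.loads(response_text).get("snapshots", []):
        if snap["snapshot"] == snapshot and snap["state"] == "SUCCESS":
            return index in snap["indices"]
    return False


sample = '''{"snapshots": [{"snapshot": "my-snapshot-1", "state": "SUCCESS",
  "indices": ["index-a", "index-b"]}]}'''
print(snapshot_contains(sample, "my-snapshot-1", "index-a"))
```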

Restore the specified index

restore_one.py


import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'Region name' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
host = 'es domain endpoint/' # include https:// and the trailing /

path = '_snapshot/test-es-snapshot/my-snapshot-1/_restore'
url = host + path
payload = {"indices": "Index name"}
headers = {"Content-Type": "application/json"}

r = requests.post(url, auth=awsauth, json=payload, headers=headers)

print(r.text)

Execution example


-> % python restore_one.py
Enter MFA code for :
{"accepted":true}
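
As with taking a snapshot, {"accepted":true} only means the restore has started. Deleting the index first is the route taken here, but the Elasticsearch restore API also accepts rename_pattern and rename_replacement, so an index can be restored under a different name next to the existing one. The sketch below builds such a payload; restore_payload and the restored_ prefix are my own assumptions, and this variant was not tried in this article.

```python
def restore_payload(indices, rename_prefix=None):
    """Build the body for POST _snapshot/<repo>/<snapshot>/_restore."""
    payload = {
        "indices": ",".join(indices),
        # Restore index data only and leave cluster-wide state alone.
        "include_global_state": False,
    }
    if rename_prefix:
        # "(.+)" captures the whole index name; "$1" refers to that capture.
        payload["rename_pattern"] = "(.+)"
        payload["rename_replacement"] = rename_prefix + "$1"
    return payload


print(restore_payload(["Index name"], rename_prefix="restored_"))
```

In restore_one.py the result would be passed as json=restore_payload([...]) instead of the fixed dictionary.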

Summary

The reference article was very useful for taking and restoring snapshots (the steps here are almost the same). What gave me the most trouble was organizing the IAM roles and policies: it was hard going until I had sorted out which policy was attached to which role.

I hope this helps someone.
