In this blog, we will walk through backing up our committed cluster data via Elasticsearch's snapshots into an AWS S3 bucket. In Elastic Cloud (Enterprise), Elastic provides a built-in backup service under its found-snapshots repository. Elasticsearch also supports custom repositories for both Cloud and on-prem setups, connecting to data stores like AWS S3, GCP, and Azure on all platform types, plus the local filesystem for on-prem. These built-in and custom snapshot repositories complement each other well: custom repositories suit longer-term storage and one-off backups, while found-snapshots covers ongoing, recent backups. Users often integrate both methods into their production clusters.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltc5cc59e94bf349b2/673e83f2d57505f728c119dd/1.png
Under Create bucket, fill in the Bucket name and leave all other options at their defaults. Then, click Next to create this bucket to hold our data. For our example, the bucket name will be s3-custom-repository-bucket-demo.
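For those who prefer the AWS CLI over the console, a minimal equivalent sketch (assuming the CLI is already configured with credentials allowed to create buckets):

```
# Create the S3 bucket that will hold our snapshot data
aws s3 mb s3://s3-custom-repository-bucket-demo
```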
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltda8d7b7a05e420ae/673e83fac1b38af2f300e34b/2.png
Under the first step of Create policy, Specify permissions, we will copy Elastic Cloud's recommended S3 permissions into the JSON Policy editor, retaining only the value AWS originally had for its “Version” JSON key. You may prefer further permission restrictions, as outlined in Elasticsearch's documentation. We will replace the guide's placeholder bucket-name under Resource with our bucket name s3-custom-repository-bucket-demo. Then, we will select Next to proceed to Step 2: Review and create.
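For reference, the policy we end up with resembles the sketch below, based on the permissions listed in Elasticsearch's repository-s3 documentation with our bucket name substituted in; confirm against Elastic Cloud's current guide before using it:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::s3-custom-repository-bucket-demo"]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::s3-custom-repository-bucket-demo/*"]
    }
  ]
}
```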
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt1e3eb9691353ba23/673e8400c630cf7bf6919cee/3.png
We will enter a Policy name and Description, then select Next. For our example, the policy name will be s3-custom-repository-demo-policy.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt89cb79b02cd2786f/673e8407ebebdcb4b30e3c5c/4.png
Here, we will attach the IAM policy to our user by selecting the Permissions options value Attach policies directly. Then, under Permissions policies, we will search for and enable our IAM policy. Once done, we will leave all other options at their defaults and click Next to move on to Step 3: Review and create, then scroll through and click Create user.
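The same attachment can be done from the AWS CLI; in this sketch, the user name and account ID are hypothetical placeholders:

```
# Attach our IAM policy to the IAM user
aws iam attach-user-policy \
  --user-name s3-custom-repository-demo-user \
  --policy-arn arn:aws:iam::123456789012:policy/s3-custom-repository-demo-policy
```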
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt8b894162454de90c/673e840eb5054cb08c323a7b/5.png
This directs us to the Create access key flow under Step 1: Access key best practices & alternatives.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltd808b1c1bed6ac16/673e8419bd749ab329f132ce/6.png
For the Use case, we will select Third-party service and then click Next. This takes us to Step 2 – optional: Set description tag which we’ll skip through by clicking Next again, bringing us to Step 3: Retrieve access keys.
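For reference, the AWS CLI equivalent (again with a hypothetical user name) returns the AccessKeyId and SecretAccessKey in its response:

```
# Mint a new access key pair for the IAM user
aws iam create-access-key --user-name s3-custom-repository-demo-user
```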
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt210f7733521341b5/673e84217ca3b533a741afa6/7.png
We will securely store our IAM user’s new access and secret keys.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltf37f97d1969e347e/673e842796fb4113dd103ea5/8.png
Under our deployment’s Security tab, we will navigate to the Elasticsearch keystore section and click Add settings. Because a cluster may hold multiple access and secret key pairs for separate S3 repository connections, Elasticsearch maps each pair to a repository via a client name. Our IAM user’s access key will be the value of s3.client.CLIENT_NAME.access_key and its secret key will be the value of s3.client.CLIENT_NAME.secret_key, where CLIENT_NAME is a placeholder for the repository’s client value. Because the client defaults to default, we will use the same for our example, so our access and secret values will be stored under the Setting names s3.client.default.access_key and s3.client.default.secret_key, respectively.
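On a self-managed cluster, the same pair would be added with the elasticsearch-keystore CLI instead of the Cloud UI; a minimal sketch:

```
# Add the S3 client credentials as secure settings (each command prompts for its value)
bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_key
```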
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltc1ce6b7894b62b18/673e842fd64c229a2fca290c/9.png
Once added, our keys will show under Security keys. For security, keystore values cannot be viewed or edited after they are added; they can only be removed and recreated.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt385ef245e53f61a5/673e8437ed994467b7a61ed7/10.png
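To apply the new keystore values without a full restart, we can call the reload secure settings API from Dev Tools; a minimal sketch:

```
POST _nodes/reload_secure_settings
```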
A successful response will emit _nodes.failed: 0. Our access and secret keystore pair is now added to Elasticsearch, so we can register our AWS S3 repository. We will navigate to Snapshot and Restore under Stack Management, click into the Repositories tab, and select Register a Repository.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt9964a2fd18e0fade/673e843fa441bc2cf5c88f74/11.png
We will give our repository a Name and select a Repository type of AWS S3. For our example, our repository name is aws_s3. Kindly note that while most Elasticsearch features, like allocation, load data from the repository based on its stored UUID once it is initially registered, ILM searchable snapshots do use the repository name as an identifier. That name will need to be lined up across Elasticsearch clusters when migrating searchable snapshot data.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt17bc7997802266af/673e8448a1f5a2255f04a59f/12.png
Under Register repository, add our Bucket name s3-custom-repository-bucket-demo, leave all other options at their defaults, and select Save. For our example, we will leave the Client field empty so that it falls back to default, matching our Elasticsearch keystore's CLIENT_NAME. Kindly note that only one read-write connection from one Elasticsearch cluster should act on a repository at a time; as needed, set the readonly flag to avoid accidental data overwriting or corruption. Saving will take us to the aws_s3 repository overview UI drawer.
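Equivalently, the repository can be registered from Dev Tools with the create snapshot repository API; a sketch using our example names:

```
PUT _snapshot/aws_s3
{
  "type": "s3",
  "settings": {
    "bucket": "s3-custom-repository-bucket-demo"
  }
}
```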
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltb1fe18f23ca014fe/673e844e554dc3664ec6a64c/13.png
Here, we can select Verify repository under Verification status to confirm that all nodes can connect to our AWS S3 bucket and pass initial verification checks. We can also run this same test from Dev Tools with the verify snapshot repository API.
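From Dev Tools, that same check is a one-liner:

```
POST _snapshot/aws_s3/_verify
```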
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt9b58bef09f4fdeac/673e8456a491d703dafd8f6d/14.png
Both of these outputs return the same list of nodes successfully connected to our AWS S3 bucket.
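With the repository verified, we can take a snapshot from Dev Tools with the create snapshot API; a sketch using our example names:

```
PUT _snapshot/aws_s3/bats?wait_for_completion=true
```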
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltaf62f2aa2ca2fa27/673e84a74ba87627a75aba72/15.png
Our example snapshot name is bats. The resulting snapshot reported state: SUCCESS. We can confirm the results by navigating back to our AWS S3 bucket s3-custom-repository-bucket-demo, which shows that Elasticsearch added files and subfolders into our root directory.
https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/bltc12abd6ff97c3de7/673e84b546ab883081ba9e5b/16.png
We did it! Check out this video for a walkthrough of the steps above.
As desired at this point, we can set up snapshot lifecycle management (SLM) to take periodic snapshots and manage snapshot retention. Alternatively, we could disconnect our AWS S3 repository and connect it to a different Elasticsearch cluster to migrate this newly snapshotted data.
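As a sketch, an SLM policy targeting our aws_s3 repository might look like the following; the schedule, naming pattern, and retention values here are illustrative assumptions:

```
PUT _slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "aws_s3",
  "config": {
    "indices": "*"
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
```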
The release and timing of any features or functionality described in this post remain at Elastic’s sole discretion. Any features or functionality not currently available may not be delivered on time or at all.