Backup Elasticsearch Indices in Component Pack

Author: Christoph Stoettner


During a migration from Cognos Metrics to Elasticsearch Metrics, I had some issues with the index. So I wanted to create a backup of the already migrated data and start over from scratch.

The official documentation has an article on the topic: Backing up and restoring data for Elasticsearch-based components, but I had to slightly adjust the commands to get a successful snapshot.

Set default namespace for kubectl

I don’t want to type -n connections over and over again with each kubectl command, so I set connections as the default namespace:

kubectl config set-context --current --namespace=connections

Register snapshot repository

Run this on your Kubernetes master or a machine configured for accessing the Kubernetes cluster with kubectl.

Open a shell in one of the es-client pods (works for ES 5 and 7):

kubectl exec -ti -n connections $(kubectl get pods -n connections |grep es-client |awk '{print $1}' |head -n 1) -- bash
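The pod lookup inside that command can also be wrapped in a small helper. This is only a sketch: first_pod is a name made up for this example, and it reads the kubectl get pods listing from standard input so it can be tried without a cluster. Unlike the grep above, it only matches pods whose name starts with the given prefix.

```shell
# first_pod PREFIX: print the name of the first pod in a `kubectl get pods`
# listing (read from stdin) whose name starts with PREFIX.
# Hypothetical helper for illustration, not part of Component Pack.
first_pod() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { print $1; exit }'
}

# Usage against a real cluster (not run here):
#   kubectl exec -ti "$(kubectl get pods | first_pod es-client)" -- bash
```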

Run the following commands in the container shell:

cd /opt/elasticsearch-${ES_VERSION}/probe/
./ PUT /_snapshot/connectionsmetrics \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs","settings": {"compress": true,"location": "/backup"}}'

Response from server:

{ "acknowledged": true }

Check the settings of the backup repository:

./ GET /_snapshot/_all?pretty

Response from server:

{
  "connectionsmetrics": {
    "type": "fs",
    "settings": { "compress": "true", "location": "/backup" }
  }
}

Keep the container shell open; we will need it again a little later.

Get image tag and registry

Run this on your Kubernetes master or a machine configured for accessing the Kubernetes cluster with kubectl.

The commands differ between versions 5 and 7, so I ran a short script to find out which version is deployed. Alternatively, just use es-data for version 5 and es-data-7 for version 7.

if [ "$(kubectl get pods | grep es-data | head -n 1 | awk -F'-' '$3 == "7" {print $3}')" = "7" ]
then version="-7"; else version=""; fi
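The version check can also be written as a function that works on a pod name, which makes it easy to try without a cluster. This is a sketch under assumptions: detect_es_version is a made-up name, the pod naming (es-data-7-N for version 7, es-data-N for version 5) is inferred from the stateful set names above, and the values 5 and 7 for elasticVersion are my guess at what the chart expects.

```shell
# detect_es_version POD_NAME: set `version` (suffix of the es-data stateful
# set) and `elasticVersion` from the name of an es-data pod.
# Assumption: version 7 pods are named es-data-7-N, version 5 pods es-data-N.
# Assumption: the chart expects elasticVersion to be 5 or 7.
detect_es_version() {
  case "$1" in
    es-data-7-*) version="-7"; elasticVersion=7 ;;
    es-data-*)   version="";   elasticVersion=5 ;;
    *) echo "not an es-data pod: $1" >&2; return 1 ;;
  esac
}

# Usage against a real cluster (not run here):
#   detect_es_version "$(kubectl get pods | grep es-data | head -n 1 | awk '{print $1}')"
```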

With the version known, read the image tag and the registry from the stateful set. The awk field numbers assume an image name of the form registry:port/path:tag.

estag=$(kubectl get statefulset es-data${version} -o=jsonpath='{$.spec.template.spec.containers[:1].image}' | awk -F: '{print $3}')
registry=$(kubectl get statefulset es-data${version} -o=jsonpath='{$.spec.template.spec.containers[:1].image}' | awk -F/ '{print $1}')

Print both values to check them:

echo $estag

echo $registry
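Because awk -F: '{print $3}' only returns the tag when the registry part contains a port, here is a sketch of a variant that splits the image name with shell parameter expansion instead and works with or without a port. The function names are made up for this example, and the image names in the usage comments are invented.

```shell
# image_tag IMAGE: print the tag, i.e. the text after the last colon of the
# last path segment. Returns 1 if the image name carries no tag.
image_tag() {
  last="${1##*/}"            # strip registry and path, keep "name:tag"
  case "$last" in
    *:*) printf '%s\n' "${last##*:}" ;;
    *)   return 1 ;;
  esac
}

# image_registry IMAGE: print the part before the first slash (host[:port]).
image_registry() {
  printf '%s\n' "${1%%/*}"
}

# Usage against a real cluster (not run here):
#   image=$(kubectl get statefulset es-data${version} \
#     -o=jsonpath='{$.spec.template.spec.containers[:1].image}')
#   estag=$(image_tag "$image"); registry=$(image_registry "$image")
```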

Helm chart for backup and restore

I tried to use the Helm chart delivered with Component Pack, but its template files hard-code the path for version 5.5, the image reference is missing the namespace, and I wanted to use one chart for both version 5 and version 7 of Elasticsearch.

So I rewrote the provided files and added the missing variables and if-else conditions. You can download the adjusted chart. I changed the name to esbackuprestore-0.1.1.tgz so you can keep it in the same folder as the original file (esbackuprestore-0.1.0.tgz).

Create snapshot

Delete esbackuprestore helm deployment

If you have already used the Helm chart, you need to delete the esbackuprestore release before you can run the install command again.

Check if the chart is already installed:

helm list | grep esbackuprestore

If it is already deployed, the command returns:

esbackuprestore                   	connections	1       	2022-07-29 20:17:06.456765299 +0000 UTC	failed  	esbackuprestore-0.1.1

Then delete it:

helm delete esbackuprestore -n connections

If you get an error here, don’t worry: I assume you have simply never created an Elasticsearch backup within Component Pack.
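To avoid the error altogether, the check and the delete can be combined. The following is a sketch: release_exists is a name made up for this example, and it reads the helm list output from standard input and compares the first column exactly, so a release with a similar name does not match.

```shell
# release_exists NAME: succeed if a `helm list` listing on stdin contains a
# release called exactly NAME in its first column.
# Hypothetical helper for illustration.
release_exists() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage against a real cluster (not run here):
#   if helm list -n connections | release_exists esbackuprestore; then
#     helm delete esbackuprestore -n connections
#   fi
```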

Create snapshot

cd  extractedFolder/microservices_connections/hybridcloud/helmbuilds/
helm install esbackuprestore esbackuprestore-0.1.1.tgz \
     --set image.tag=$estag,elasticSearchBackup=true,image.repository=$registry,elasticVersion=$elasticVersion

To create more snapshots, just run the delete and install commands again.

Restore snapshot

First we need to find out the snapshot name. To make this a little easier, download and install jq or gron. I prefer gron: the syntax is easier and you can grep within the results.

Get snapshot name with gron

# Version 5
kubectl exec -ti -n connections -c es-client \
  $(kubectl get pods -n connections | grep es-client | awk '{print $1}' | head -n 1) \
  -- /opt/elasticsearch-5.5.1/probe// GET /_snapshot/connectionsmetrics/_all \
  | grep snapshots | gron | grep "snapshot ="

# Version 7
kubectl exec -ti -n connections -c es-client \
  $(kubectl get pods -n connections |grep es-client |awk '{print $1}' |head -n 1) \
  -- /opt/elasticsearch-7.10.1/probe// GET /_snapshot/connectionsmetrics/_all \
  | grep snapshots | gron | grep "snapshot ="


json.snapshots[0].snapshot = "snapshot20220729182836";
json.snapshots[1].snapshot = "snapshot20220729183056";
json.snapshots[2].snapshot = "snapshot20220729183246";
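If the name is needed in a variable, it can be cut out of these gron lines. A minimal sketch, assuming the output format shown above; latest_snapshot is a made-up name, and the tr call is there because a session started with -t may add carriage returns to each line.

```shell
# latest_snapshot: read gron output on stdin and print the name of the last
# listed snapshot.
# Hypothetical helper for illustration.
latest_snapshot() {
  tr -d '\r' |
    sed -n 's/^json\.snapshots\[[0-9]*]\.snapshot = "\(.*\)";$/\1/p' |
    tail -n 1
}

# Usage (not run here): append it to one of the gron pipelines above, e.g.
#   snapshotname=$( ... | grep snapshots | gron | latest_snapshot)
```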

Get snapshot name with jq

# Version 5
kubectl exec -ti -n connections -c es-client \
  $(kubectl get pods -n connections |grep es-client |awk '{print $1}' |head -n 1) \
  -- /opt/elasticsearch-5.5.1/probe/ GET /_snapshot/connectionsmetrics/_all \
  | grep snapshots | jq '.snapshots[] | .snapshot'

# Version 7
kubectl exec -ti -n connections -c es-client \
  $(kubectl get pods -n connections |grep es-client |awk '{print $1}' |head -n 1) \
  -- /opt/elasticsearch-7.10.1/probe/ GET /_snapshot/connectionsmetrics/_all \
  | grep snapshots | jq '.snapshots[] | .snapshot'



Restore command

Adding | tail -n 1 to the commands above shows only the most recently created snapshot. Copy that name and pass it to helm. Before running the helm command, you need to delete the esbackuprestore release again.

With the restore command we need REPONAME (Default: connectionsmetrics) and SNAPSHOTNAME.

helm delete esbackuprestore

helm install esbackuprestore esbackuprestore-0.1.1.tgz \
  --set image.tag=$estag,elasticSearchRestore=true,image.repository=$registry,namespace=connections,REPONAME=connectionsmetrics,SNAPSHOTNAME=snapshot20220729183246

Delete snapshots

The documentation does not tell us how to delete snapshots, and I have no experience with how large they get. If you ever have to remove older snapshots, open a shell in one of the es-data pods and run:

/opt/elasticsearch-${ES_VERSION}/probe/ DELETE /_snapshot/connectionsmetrics/snapshotname-to-delete
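To decide which snapshots to remove, a small filter can list everything except the newest few. This is a sketch, not a tested cleanup job: snapshots_to_prune is a made-up name, and it assumes all names follow the snapshotYYYYMMDDHHMMSS pattern seen in the output above, so that sorting the names also sorts them by creation time.

```shell
# snapshots_to_prune KEEP: read snapshot names from stdin (one per line) and
# print all of them except the newest KEEP.
# Assumption: names look like snapshotYYYYMMDDHHMMSS, so a plain sort puts
# the oldest first.
snapshots_to_prune() {
  sort | awk -v keep="$1" '
    { names[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print names[i] }'
}

# Usage (not run here): feed each printed name into the DELETE request
# shown above, one request per snapshot.
```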

Do not use restore at the moment!!!

The problem is that the restore closes all indices when it starts, but only reopens them when the restore succeeds. A failed restore therefore leaves the indices closed.

So the whole process with these charts (mine, or the original one from HCL) can’t be recommended at the moment.

I think the backup is important, because the Elasticsearch documentation on Snapshot and restore tells us:

Taking a snapshot is the only reliable and supported way to back up a cluster. You cannot back up an Elasticsearch cluster by making copies of the data directories of its nodes. There are no supported methods to restore any data from a filesystem-level backup.

Elasticsearch is used to store Metrics and typeahead data: no content, but still valuable data for us and our users.

Now we need to find a reliable way to restore these snapshots. Once that is done, I will work on a way to back up MongoDB. MongoDB is used with Activities Plus and, as far as I know, with Orient Me, but the documentation does not show a way to back it up.
