Using the Elasticsearch Curator

Use the Elasticsearch Curator to manage the Elasticsearch indices in Elastic Stack.

Elasticsearch Curator is a tool for managing your indices by periodically removing older data. By default, Curator runs daily and removes logs that are older than 5 days; you can customize both the schedule and the age limit.

Note: Curator does not purge the data created by the Connections metrics and type-ahead search features or, if configured, the Orient Me Top Updates feature, because those features need the historical data to function properly.

How does Curator work?

Elasticsearch Curator runs as a Kubernetes Cron Job, which you can schedule for periodic execution. By default, the Cron Job triggers Curator to run once a day at 00:01:00 (you can configure the schedule). When it runs, Curator purges logs from Elasticsearch and then updates its indices. By default, the job history retains one successful job and one failed job, so no logs are removed until the next time that the job runs. If a job fails or is suspended, it is not restarted.

Curator uses filters to determine which logs to purge. The filters are joined by a logical AND operation, so all of the conditions must be satisfied for a particular log to be purged:
  • Logs must be associated with Component Pack for HCL Connections™ (the metrics, type-ahead search, and if configured, Orient Me Top Updates features are not affected).
  • Logs must be older than a specific age, which you can configure (defaults to 5 days).

You can customize the filters as explained in the sections that follow.
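
The AND logic of the two filters can be illustrated with a small shell sketch. The should_purge function and its arguments are hypothetical and are not part of Curator; the sketch also assumes that ages are counted in whole days and that "older than" means strictly greater than the retention value.

```shell
#!/bin/sh
# Hypothetical predicate (not part of Curator): succeeds only when
# BOTH filter conditions hold, mirroring the logical AND described above.
should_purge() {
  is_component_pack_log=$1   # "yes" or "no"
  age_days=$2                # age of the log in whole days (assumption)
  keep_days=${3:-5}          # retention in days; 5 is the documented default
  [ "$is_component_pack_log" = "yes" ] && [ "$age_days" -gt "$keep_days" ]
}

should_purge yes 6 && echo "purge"   # both conditions hold
should_purge yes 5 || echo "keep"    # not older than 5 days
should_purge no 30 || echo "keep"    # excluded data is kept regardless of age
```

A log that fails either condition is kept, which is why the metrics and type-ahead search data survive no matter how old they are.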

Note: After a helm del --purge operation, jobs and their associated pods remain, even though the Cron Job itself is removed. This is because in Kubernetes 1.11, the Cron Job has a propagation policy set to "orphan", so dependent objects such as the jobs that it created are not deleted.
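
If you want to clean up leftover jobs by hand, one possible approach is to generate the delete commands first and review them before running anything. The following is a minimal sketch, not a documented procedure: the orphaned_job_cmds helper is hypothetical, and the job name prefix (elasticsearch-curator) and the namespace (connections) are assumptions that you should check against your own cluster.

```shell
#!/bin/sh
# Hypothetical helper: reads job names on stdin (one per line) and prints
# a "kubectl delete job" command for each name that starts with the prefix.
# It only prints the commands; it does not delete anything.
orphaned_job_cmds() {
  prefix=$1      # assumed job name prefix, for example elasticsearch-curator
  namespace=$2   # assumed namespace, for example connections
  while read -r name; do
    case $name in
      "$prefix"*) printf 'kubectl delete job %s -n %s\n' "$name" "$namespace" ;;
    esac
  done
}

# Possible usage against a live cluster (review the output before running it):
#   kubectl get jobs -n connections --no-headers \
#     -o custom-columns=NAME:.metadata.name \
#     | orphaned_job_cmds elasticsearch-curator connections
```

Printing the commands instead of executing them keeps the step reversible until you have confirmed that only Curator jobs are listed.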

Configuring the Kubernetes Cron Job schedule to purge logs

Curator jobs run based on the following Cron Job setting:

elasticsearch-curator.logging.elasticsearch.cronjobSchedule=schedule

By default, the job runs just past midnight, every day. To change the schedule, execute a Helm install or upgrade command that includes the Cron Job schedule. For example, the following Helm install sets the job to run hourly:


helm install \
--name=elasticstack extractedFolder/microservices_connections/hybridcloud/helmbuilds/elasticstack-0.1.0-20181014-210326.tgz \
--set \
global.onPrem=true,\
global.image.repository=DockerRegistry/connections,\
elasticsearch-curator.logging.elasticsearch.cronjobSchedule="0 * * * *"

For information on the Cron Job schedule syntax, see the Kubernetes Cron Job documentation, for example at https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/.
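
A cron schedule consists of five space-separated fields (minute, hour, day of month, month, day of week). Before passing a schedule to Helm, a quick sanity check of the field count can catch typing mistakes. The has_five_fields helper below is hypothetical, and it checks only the number of fields, not whether each value is valid. The expression "1 0 * * *" is the cron form of "00:01 every day"; whether the chart stores its default in exactly that form is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the schedule has exactly five fields.
has_five_fields() {
  (
    # Disable globbing in this subshell so "*" is not expanded to file names.
    set -f
    # Split the schedule on whitespace into positional parameters.
    set -- $1
    [ "$#" -eq 5 ]
  )
}

for s in "1 0 * * *" "0 * * * *" "0 * * *"; do
  if has_five_fields "$s"; then
    echo "ok: $s"
  else
    echo "rejected: $s"
  fi
done
```

The first two schedules (daily at 00:01 and hourly) pass the check; the third is rejected because it has only four fields.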

Configuring the logs age filter

Elasticsearch indices are based on data from event logs. The more logs you retain, the more data there is to search, and the longer each search takes. In addition, you will need more space in the Elasticsearch persistent volume to contain all the data. To limit search times (and required storage space), you should purge older logs periodically.

Curator jobs purge logs based on the following age filter:

elasticsearch-curator.logging.elasticsearch.daysToKeepLogs=logs_age

By default, each time the job runs, the filter purges logs that are older than 5 days. To change the logs age, execute a Helm install or upgrade command that includes the logs age filter. For example, the following Helm install sets the filter to purge logs that are older than 30 days:


helm install \
--name=elasticstack extractedFolder/microservices_connections/hybridcloud/helmbuilds/elasticstack-0.1.0-20181014-210326.tgz \
--set \
global.onPrem=true,\
global.image.repository=DockerRegistry/connections,\
elasticsearch-curator.logging.elasticsearch.daysToKeepLogs="30"
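
Both settings can also be supplied together, and on an existing deployment they can be applied with a Helm upgrade instead of an install. The following sketch assembles the combined --set value and prints the resulting command for review rather than running it. The build_curator_set helper is hypothetical, CHART_TGZ is a placeholder for the path to the chart archive, and the other values from the install examples (such as global.onPrem) are left out for brevity and would likely need to be included as well.

```shell
#!/bin/sh
# Hypothetical helper: joins the two Curator settings into one --set value.
# A schedule that itself contains commas would need escaping for --set;
# that case is not handled here.
build_curator_set() {
  schedule=$1   # cron expression, for example "0 * * * *"
  days=$2       # retention in days, for example 30
  printf '%s,%s' \
    "elasticsearch-curator.logging.elasticsearch.cronjobSchedule=${schedule}" \
    "elasticsearch-curator.logging.elasticsearch.daysToKeepLogs=${days}"
}

# Print the command instead of executing it, so it can be checked first.
printf 'helm upgrade elasticstack CHART_TGZ --set "%s"\n' \
  "$(build_curator_set '0 * * * *' 30)"
```

With these example arguments, the printed command would run Curator hourly and keep 30 days of logs.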