Tuning knobs

Elasticsearch exposes many settings that you can change to improve performance and control resource allocation. The following are some of the most important tuning knobs to consider.

Heap Size

The heap size is one of the most critical tuning parameters for Elasticsearch. It determines the amount of memory allocated to Elasticsearch's JVM and affects various operations, including caching, indexing, and search.
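Note: It is recommended to allocate around 50% of available memory to the heap, up to a maximum of about 30 GB, so that the JVM can keep using compressed object pointers. The remaining memory is not wasted: Elasticsearch uses native memory for various caches and buffers in intra- and inter-pod communications, and the operating system uses it for the filesystem cache.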
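As a rough illustration of the sizing rule in the note, the following Python sketch derives a heap size from a host's total memory and prints matching JVM flags. The function names and the 30 GB cap parameter are illustrative assumptions based on the note, not an official Elastic tool.

```python
def recommended_heap_gb(total_ram_gb: float, cap_gb: float = 30.0) -> int:
    """Return a heap size in whole GB: about half of RAM, never above cap_gb."""
    half = total_ram_gb / 2
    return max(1, int(min(half, cap_gb)))


def jvm_heap_flags(heap_gb: int) -> list[str]:
    """Build -Xms/-Xmx flags; both are set to the same value so the heap is not resized."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for ram in (16, 64, 128):
        flags = " ".join(jvm_heap_flags(recommended_heap_gb(ram)))
        print(f"{ram} GB RAM -> {flags}")
```

The cap is a parameter because the exact threshold for compressed object pointers varies by JVM and platform, so treat 30 GB as a starting point rather than a fixed rule.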
Thread Pools

Elasticsearch uses various thread pools for different operations, such as indexing, searching, and merging. You can tune the thread pool settings to control the number of threads allocated for each type of operation and adjust the queue size for pending requests. This can help balance the allocation of resources based on your specific workload.
For more information, see Thread pools.
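To make the relationship between processors, pool size, and queue size concrete, here is a small Python sketch. It assumes the default formulas documented for the search and write pools in recent Elasticsearch versions; confirm them against the Thread pools reference for your release. The helper names are hypothetical.

```python
def default_pool_sizes(allocated_processors: int) -> dict:
    """Approximate documented defaults for two fixed thread pools.

    Assumption: formulas and queue sizes follow recent Elasticsearch docs
    and may differ in your version.
    """
    return {
        "search": {"size": (allocated_processors * 3) // 2 + 1, "queue_size": 1000},
        "write": {"size": allocated_processors, "queue_size": 10000},
    }


def to_yaml_lines(overrides: dict) -> list[str]:
    """Render overrides as flat elasticsearch.yml lines (thread_pool.<pool>.<key>)."""
    return [
        f"thread_pool.{pool}.{key}: {value}"
        for pool, settings in overrides.items()
        for key, value in settings.items()
    ]


if __name__ == "__main__":
    print(default_pool_sizes(8))
    # Hypothetical override: a shorter write queue so overload is rejected sooner.
    print("\n".join(to_yaml_lines({"write": {"queue_size": 2000}})))
```

A larger queue absorbs bursts but also holds more pending requests in memory, so a queue override is a trade-off rather than a pure gain.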
Circuit Breakers

Circuit breakers protect Elasticsearch against excessive memory usage. Each breaker estimates the memory an operation needs and rejects the operation if it would exceed a configured limit, which prevents out-of-memory errors. Disk space is not covered by circuit breakers; it is managed separately through disk-based shard allocation settings.
For more information, see Circuit breaker settings.
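The idea behind a memory circuit breaker can be sketched in a few lines of Python. This is a toy model for illustration only, not Elasticsearch's implementation; the class and method names are hypothetical.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would push estimated usage over the limit."""


class MemoryBreaker:
    """Toy model: tracks estimated bytes and rejects work that exceeds a limit."""

    def __init__(self, heap_bytes: int, limit_fraction: float) -> None:
        self.limit = int(heap_bytes * limit_fraction)
        self.used = 0

    def reserve(self, estimated_bytes: int) -> None:
        """Account for an operation up front, or fail before memory is spent."""
        if self.used + estimated_bytes > self.limit:
            raise CircuitBreakingError(
                f"would use {self.used + estimated_bytes} bytes, limit is {self.limit}"
            )
        self.used += estimated_bytes

    def release(self, estimated_bytes: int) -> None:
        """Return the reservation once the operation has finished."""
        self.used = max(0, self.used - estimated_bytes)


if __name__ == "__main__":
    breaker = MemoryBreaker(heap_bytes=1000, limit_fraction=0.5)
    breaker.reserve(300)
    try:
        breaker.reserve(300)
    except CircuitBreakingError as err:
        print("rejected:", err)
```

The key design point is that the check happens before the memory is allocated, so a rejected request costs an error response instead of a node crash.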
Cache Size

Elasticsearch uses various caches, such as the field data and query cache, to improve search performance. You can adjust the cache size settings to optimize memory usage based on your query patterns and data size.
Field Data Cache

The field data cache is an on-heap cache that holds field data and global ordinals, which support aggregations, sorting, and scripting on certain field types. Entries are built when a query first needs them and can be expensive to rebuild, so keeping them in memory avoids repeating that work for each query. Because the cache can grow large, size it carefully, especially if you aggregate or sort on text fields.
For more information, see Field data cache settings.
Field Data Loading Circuit Breaker

The field data circuit breaker estimates the heap memory needed to load a field into the field data cache and rejects the operation if loading it would exceed the limit. You can configure the indices.breaker.fielddata.limit setting to control this limit and prevent out-of-memory errors. The size of the cache itself is capped separately with indices.fielddata.cache.size.
For more information, see Field data circuit breaker.
Query Caching

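Elasticsearch caches the results of queries used in the filter context in the node query cache, which can speed up frequently executed filters. The cache is enabled by default and evicts entries on a least-recently-used basis; there is no expiration time to configure. You can adjust its size with indices.queries.cache.size or disable it for an index with index.queries.cache.enabled.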
For more information, see Node query cache settings.
Hardware Resources

Elasticsearch's performance is heavily influenced by the available hardware resources. Consider tuning the hardware configuration, such as CPU, memory, disk I/O, and network settings, to match your workload requirements.
File Descriptors and Process Limits

Elasticsearch is a resource-intensive application that needs a high open file descriptor limit and adequate process (thread) limits to function correctly. You can raise the maximum number of open file descriptors and adjust process limits for the user running Elasticsearch to accommodate the needs of your cluster.
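A quick way to inspect the current limit on a Unix host is Python's standard resource module. The sketch below assumes a minimum of 65535 open files, the figure commonly cited in Elastic's setup guidance; verify the value for your version. The function name nofile_ok is hypothetical.

```python
import resource  # Unix-only standard library module

RECOMMENDED_NOFILE = 65535  # assumption: check Elastic's setup docs for your version


def nofile_ok(soft_limit: int, minimum: int = RECOMMENDED_NOFILE) -> bool:
    """Return True when the soft open-file limit meets the minimum."""
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={nofile_ok(soft)}")
```

Run the script as the same user that runs Elasticsearch, since limits are applied per user and per process.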
Indexing Buffer Settings

Elasticsearch uses memory buffers to stage newly indexed documents before they are written to disk as segments. You can tune the indexing buffer size to optimize indexing performance. The indices.memory.index_buffer_size setting controls the total buffer size for a node, which is shared across all shards on that node; it accepts a percentage of the heap or an absolute byte size.
For more information, see Indexing Buffer settings and Tune for indexing speed.
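The arithmetic behind a percentage or byte-size value for indices.memory.index_buffer_size can be sketched as follows. The parser is a simplified assumption that handles only percentages and kb/mb/gb suffixes, and the function name is hypothetical. To my understanding of the official documentation, the resulting budget applies to the node as a whole and is shared by its active shards.

```python
def index_buffer_bytes(heap_bytes: int, setting: str = "10%") -> int:
    """Resolve an index buffer value such as '10%' or '512mb' to bytes."""
    units = {"kb": 1024, "mb": 1024**2, "gb": 1024**3}
    value = setting.strip().lower()
    if value.endswith("%"):
        # Percentages are relative to the JVM heap size.
        return int(heap_bytes * float(value[:-1]) / 100)
    for suffix, factor in units.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * factor)
    raise ValueError(f"unsupported size: {setting!r}")


if __name__ == "__main__":
    heap = 8 * 1024**3  # 8 GB heap, chosen only for illustration
    print(index_buffer_bytes(heap, "10%"))
    print(index_buffer_bytes(heap, "512mb"))
```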
Query-Time Filters

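Elasticsearch lets you run query clauses in the filter context, for example inside the filter clause of a bool query. Filter clauses only decide whether a document matches and do not compute relevance scores, and frequently used filters can be cached, so moving non-scoring conditions into filters can improve query performance.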
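As an example, the following Python sketch builds a search body that combines a scored full-text clause with non-scoring filters. The field names (message, status, @timestamp) are hypothetical; the bool, must, filter, match, term, and range constructs are standard Query DSL.

```python
import json


def filtered_search(text: str, status: str, since: str) -> dict:
    """Build a bool query: `must` is scored, `filter` only includes or excludes."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"message": text}}],
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"@timestamp": {"gte": since}}},
                ],
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON can be sent as the body of a search request.
    print(json.dumps(filtered_search("timeout", "error", "now-1d"), indent=2))
```

Only the match clause contributes to the relevance score; the term and range clauses narrow the candidate set.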
Refresh and Flush Intervals

Elasticsearch periodically refreshes each index to make newly indexed data searchable. You can adjust the refresh interval with the index.refresh_interval setting to balance indexing performance against how quickly new documents appear in search results.

Flushing is related but distinct. An Elasticsearch flush performs a Lucene commit, which persists in-memory changes to disk, and starts a new translog generation. There is no fixed flush interval to tune: flushes are performed automatically in the background to ensure the translog does not grow too large, which would make replaying its operations take considerable time during recovery. Settings such as index.translog.flush_threshold_size influence how often flushes happen and can therefore affect indexing throughput and resource usage.

For more information, see Index Modules.

For more information, see Translog.
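A common pattern is to relax the refresh interval during a bulk load and restore it afterwards. The Python sketch below only builds the request bodies for an update index settings call; the workflow and the chosen values are illustrative assumptions, and no client library is used.

```python
import json


def refresh_settings(interval: str) -> dict:
    """Body for an update index settings request that changes the refresh interval."""
    return {"index": {"refresh_interval": interval}}


# Hypothetical bulk-load workflow: refresh less often, load data, then restore.
during_bulk_load = refresh_settings("30s")
after_bulk_load = refresh_settings("1s")

if __name__ == "__main__":
    print(json.dumps(during_bulk_load))
    print(json.dumps(after_bulk_load))
```

A longer interval means fewer, larger segments are created during the load, at the cost of new documents taking longer to become searchable.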
Translog Durability
The translog is a transaction log that protects operations not yet committed to disk against loss when a node fails. Adjust the translog durability settings to balance data safety and indexing performance.
When a document is indexed, updated, or deleted, the operation is applied to the in-memory Lucene index and recorded in the translog before it is acknowledged to the client. Changes held only in memory would be lost in a crash, so after a node failure or restart Elasticsearch replays the translog to recover operations made since the last Lucene commit. Once a flush commits those changes to the index on disk, the corresponding translog entries are no longer needed for recovery.
1. Request Durability
  
  With request durability, every indexing or update request is synced to disk before a response is sent back to the client. This ensures that the changes are immediately durable but can impact performance due to the disk synchronization overhead.
2. Translog Durability Settings
  
  Elasticsearch provides configuration settings to control the durability of the translog. These settings include:
  
  index.translog.sync_interval
  
  Specifies how often the translog is fsynced to disk and committed, regardless of write operations. It defaults to 5s and matters mainly when durability is set to async.
  
  index.translog.durability
  
  Controls when the translog is fsynced and committed. It accepts two values: request (fsync and commit after every index, delete, update, or bulk request, before the client receives a response) and async (fsync and commit in the background at every sync interval).
By default, Elasticsearch uses the request durability mode, so acknowledged writes survive a hardware failure. Switching to async can raise indexing throughput, but operations performed since the last sync may be lost if a node fails.

For more information, see Translog.
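The two settings can be combined into an index settings body, as in this Python sketch. To my understanding of the Translog reference, request and async are the only documented durability values, so the helper rejects anything else. The helper name is hypothetical, and whether each setting can be changed on a live index depends on your Elasticsearch version.

```python
VALID_DURABILITY = {"request", "async"}


def translog_settings(durability: str, sync_interval: str = "5s") -> dict:
    """Build index settings for translog durability, validating the mode."""
    if durability not in VALID_DURABILITY:
        raise ValueError(f"durability must be one of {sorted(VALID_DURABILITY)}")
    settings = {"index.translog.durability": durability}
    if durability == "async":
        # The sync interval only matters when syncing happens in the background.
        settings["index.translog.sync_interval"] = sync_interval
    return settings


if __name__ == "__main__":
    print(translog_settings("request"))
    print(translog_settings("async", sync_interval="10s"))
```

With async and a 10s interval, a node failure can lose up to roughly ten seconds of acknowledged writes, which is the price of the higher indexing throughput.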
Aggregations

Elasticsearch provides powerful aggregation capabilities, but complex aggregations can consume significant memory and slow down searches. The search.max_buckets setting limits the number of buckets a single response may contain, and the parent circuit breaker limit, indices.breaker.total.limit, caps the total memory that requests can use.
For more information, see Aggregations.
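Nested aggregations multiply bucket counts, which is how a modest-looking request can hit search.max_buckets. The Python sketch below computes a rough upper bound for nested terms aggregations. The default limit shown is an assumption based on recent versions, and Elasticsearch's actual bucket accounting may differ in detail.

```python
ASSUMED_MAX_BUCKETS = 65536  # assumption: verify search.max_buckets for your version


def worst_case_buckets(sizes: list[int]) -> int:
    """Upper bound on buckets for nested terms aggregations, one size per level."""
    total, level = 0, 1
    for size in sizes:
        level *= size   # each parent bucket can hold `size` child buckets
        total += level  # buckets created at this nesting level
    return total


if __name__ == "__main__":
    for sizes in ([100, 50], [1000, 100]):
        count = worst_case_buckets(sizes)
        verdict = "within" if count <= ASSUMED_MAX_BUCKETS else "over"
        print(sizes, count, verdict, "the assumed limit")
```

Reducing the size of the outer aggregation has the largest effect because every inner level is multiplied by it.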
Shard Size

Each shard in Elasticsearch comes with some overhead, so having a large number of small shards can impact performance. It is essential to balance the number of shards and the size of each shard based on your data volume and hardware resources.
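A back-of-the-envelope calculation helps when choosing the number of primary shards. In this Python sketch the target shard size is an input you choose from your own testing and Elastic's sizing guidance; the 40 GB figure in the example is only an illustration.

```python
import math


def primary_shard_count(total_data_gb: float, target_shard_gb: float) -> int:
    """Number of primary shards needed to keep each shard near the target size."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    return max(1, math.ceil(total_data_gb / target_shard_gb))


if __name__ == "__main__":
    # 600 GB of data with a hypothetical 40 GB target per shard.
    print(primary_shard_count(600, 40))
    # A small index still needs at least one primary shard.
    print(primary_shard_count(5, 40))
```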
Shard Allocation

Shards are the basic units of data distribution in Elasticsearch. By default, Elasticsearch tries to distribute shards evenly across nodes. However, you can control shard allocation settings to ensure balanced resource usage and optimize cluster performance.
Shard Routing

Elasticsearch routes each document to a shard by hashing its routing value, which is the document ID by default. Where the shards themselves are placed is a separate decision: you can influence it by customizing the shard allocation process with shard allocation filters and allocation awareness settings. This can help balance data distribution and improve cluster performance.
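The hashing step and an allocation filter can be sketched in Python as follows. CRC32 is a stand-in chosen because it ships with Python; Elasticsearch uses its own hash function internally, so the shard numbers here will not match a real cluster. The node attribute name box_type is hypothetical.

```python
import zlib


def route_to_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number using hash modulo shard count."""
    # Stand-in hash for illustration; not the function Elasticsearch uses.
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % num_primary_shards


def require_attribute(attribute: str, value: str) -> dict:
    """Index settings that restrict shards to nodes whose attribute equals value."""
    return {f"index.routing.allocation.require.{attribute}": value}


if __name__ == "__main__":
    for doc_id in ("user-1", "user-2", "user-3"):
        print(doc_id, "-> shard", route_to_shard(doc_id, 5))
    print(require_attribute("box_type", "hot"))
```

Because the shard number depends on the modulo, the same routing value always lands on the same shard as long as the number of primary shards does not change.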
Network Settings

Adjusting network settings, such as TCP configurations, can impact the performance and responsiveness of Elasticsearch. To ensure efficient network communication, you can optimize settings like TCP keep-alive, socket buffers, and connection timeouts.
Data Serialization and Compression

Elasticsearch accepts request and response bodies in more compact binary formats (like SMILE or CBOR) in addition to JSON, and it can compress both stored data and network traffic. For example, the index.codec setting selects the compression used for stored fields, and http.compression controls compression of HTTP responses. These settings can improve storage efficiency and reduce network overhead.
See Save space and money with improved storage efficiency in Elasticsearch 7.10 for more information.
For more information, see Index Modules.
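To get a feel for how much repetitive JSON can shrink, the following Python sketch compresses a synthetic payload with gzip and builds a settings body that selects the best_compression codec. The sample documents are made up, and the ratio you see with real data will differ.

```python
import gzip
import json

# Synthetic, repetitive documents similar in shape to log events.
docs = [
    {"host": f"web-{i % 3}", "status": "ok", "message": "request served"}
    for i in range(200)
]
raw = json.dumps(docs).encode("utf-8")
compressed = gzip.compress(raw)

# Settings body for index creation; index.codec is typically set when the index is created.
codec_settings = {"settings": {"index": {"codec": "best_compression"}}}

if __name__ == "__main__":
    print(f"raw={len(raw)} bytes, gzip={len(compressed)} bytes")
    print(json.dumps(codec_settings))
```

Compression trades CPU time for smaller payloads and files, so measure both sides before enabling it cluster-wide.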

These tuning knobs let you optimize Elasticsearch for your specific workload, hardware resources, and performance requirements. Carefully monitor the impact of any change and run performance tests to confirm the results. Always refer to the official Elasticsearch documentation and Elastic's recommendations for tuning and optimization.