Memory Considerations for Cloud Native

Managing memory is one of the most important factors in maintaining stable and efficient applications in Kubernetes.

Unlike CPU, memory is a non-compressible resource—an application cannot be throttled when it needs more memory. If it exceeds its defined limit, Kubernetes terminates the container.

This section explains how to define, inspect, and configure memory for your pods, with a focus on JVM-based applications.

Defining and Inspecting Memory Settings

In Kubernetes, memory allocation for a container is controlled by two fields in the pod specification: requests and limits.

  • resources.requests.memory
    • Definition: The minimum amount of memory the pod is guaranteed.

    • Usage: The Kubernetes scheduler uses this value to decide where to place the pod. It will only schedule the pod on a node with enough available memory.

  • resources.limits.memory
    • Definition: The maximum amount of memory the container is allowed to use.

    • Usage: If the container’s memory usage (Resident Set Size) exceeds this limit, it is terminated by the kernel with an OOMKill (Out of Memory Kill) event. You will see this as Exit Code 137 (see the inspection commands below).

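To verify how a running pod is configured (and whether it has been OOMKilled), you can inspect it with kubectl. The commands below are generic examples; replace <your-pod-name> with the actual pod name and add -n <namespace> if required.

$ kubectl get pod <your-pod-name> -o jsonpath='{.spec.containers[0].resources}'
$ kubectl get pod <your-pod-name> -o jsonpath='{.status.qosClass}'
$ kubectl describe pod <your-pod-name> | grep -A 5 "Last State"

The first command prints the container's requests and limits, the second prints the pod's QoS class (Guaranteed, Burstable, or BestEffort), and the third shows the last termination details, for example Reason: OOMKilled with Exit Code: 137.
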
Best Practice:

For critical, memory-sensitive applications such as JVM-based workloads, set requests.memory equal to limits.memory.

When requests equal limits for both CPU and memory, the pod is assigned the Guaranteed Quality of Service (QoS) class, ensuring it remains stable and is only terminated if it exceeds its own defined limit.

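A minimal resources block that results in the Guaranteed QoS class looks like the following (illustrative values; CPU and memory requests must both equal their limits for Guaranteed QoS):
# Illustrative Guaranteed QoS resources block
resources:
  requests:
    cpu: 1000m
    memory: 8Gi
  limits:
    cpu: 1000m
    memory: 8Gi
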
Relationship to Node Capacity and Replicas
  • Node Capacity: The scheduler will not place a pod on a node if the sum of the requests of all pods on that node (plus the new pod's request) exceeds the node's allocatable memory.
  • Max Desired Replicas (HPA): The requests value is fundamental to the Horizontal Pod Autoscaler (HPA), which can optionally be enabled for the executor pod. The HPA scales on the percentage of the request that is actually in use, so setting an accurate request is essential for proper autoscaling (see the example below).

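For illustration, an autoscaling/v2 HorizontalPodAutoscaler that scales on memory utilization relative to the request could look like this. The workload name and thresholds below are examples, not chart defaults.
# Illustrative HPA scaling on memory utilization (percentage of requests.memory)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lnk-executor-hpa          # example name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lnk-executor            # assumed target workload
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75  # scale out above 75% of requests.memory
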
Configuring Memory in the HCL Link values.yaml

In the HCL Link Helm chart, you configure these values for each component by modifying its resources block in your values.yaml file. You then apply these changes during Helm installation or upgrade (for example helm upgrade ... -f my-values.yaml).

The default settings for most components (client, server, rest, executor) already follow the Guaranteed QoS best practice, with requests set equal to limits.

Client (client)

This pod serves the user interface. Its memory is set in the client.resources block.
# values.yaml
client:
  ...
  resources:
    requests:
      memory: 4Gi
    limits:
      memory: 4Gi

Server (server)

This pod runs the core server logic. Its memory is set in the server.resources block.
# values.yaml
server:
  ...
  resources:
    requests:
      memory: 8Gi
    limits:
      memory: 8Gi

Rest (rest)

This pod runs the REST API, which is a JVM application. You must configure both the container's total memory limit (in resources) and the JVM's heap size (in jvmOptions).
# values.yaml
rest:
  ...
  # 1. Set the container's total memory limit
  resources:
    requests:
      memory: 8Gi
    limits:
      memory: 8Gi

  # 2. Set the JVM's heap size *within* that limit
  #    This example sets the heap to 40% of the 8Gi limit.
  #    See the next section for why.
  jvmOptions:
    - "-XX:InitialRAMPercentage=40.0"
    - "-XX:MaxRAMPercentage=40.0"
    - "-XX:NativeMemoryTracking=summary"

Executor (executor)

This pod runs the data transformation workloads. Its container limit is set in executor.resources. This pod is a special case (JVM + C++ native code) and is discussed in detail in the final section. The JVM settings for this pod are controlled by rest.workerJvmOptions.
# values.yaml
executor:
  ...
  resources:
    requests:
      memory: 8Gi
    limits:
      memory: 8Gi

Kafka Link (kafkaLink)

This component is configured by default as a Burstable QoS pod, meaning its requests are much lower than its limits.
# values.yaml
kafkaLink:
  ...
  resources:
    requests:
      cpu: 250m
      memory: 700Mi
    limits:
      cpu: 2000m
      memory: 4Gi

This configuration means the pod is guaranteed only 700Mi and can "burst" up to 4Gi if memory is available on the node. However, it will be one of the first pods terminated if the node experiences memory pressure.

Recommendation: For production stability, set the requests and limits to be equal.
# Recommended production setting for kafkaLink
kafkaLink:
  ...
  resources:
    requests:
      memory: 4Gi
    limits:
      memory: 4Gi

Resource Sizing Based on Data Volume and Parallel Instances

Recommended Configuration for Single-Replica Executor Flows

Use the following configuration when you run a flow with a single executor replica and process 250k requests with a batch size of 500.
Pod            | Minimum CPU (Request) | Minimum Memory (Request) | Recommended CPU (Limit) | Recommended Memory (Limit) | JVM Memory
lnk-executor   | 1000m (1 Core)        | 8Gi                      | 2000m (2 Cores)         | 8Gi                        | 2GB
lnk-rest       | 500m (0.5 Core)       | 8Gi                      | 1000m (1 Core)          | 8Gi                        | 2GB
lnk-server     | 250m (0.25 Core)      | 4Gi                      | 500m (0.5 Core)         | 4Gi                        | No setting
lnk-client     | 100m (0.1 Core)       | 500Mi                    | 250m (0.25 Core)        | 1Gi                        | No setting
link-mongodb   | 500m (0.5 Core)       | 2Gi                      | 1000m (1 Core)          | 4Gi                        | NA
link-redis     | 250m (0.25 Core)      | 1Gi                      | 500m (0.5 Core)         | 2Gi                        | NA
lnk-kafka-link | 250m (0.25 Core)      | 700Mi                    | 500m (0.5 Core)         | 1.5Gi                      | NA

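For example, the lnk-executor row above could be expressed in values.yaml roughly as follows. This is a sketch only; the 2GB JVM heap corresponds to 25% of the 8Gi container limit.
# Sketch: executor sizing from the single-replica table above
rest:
  ...
  # ~2GB heap = 25% of the 8Gi executor limit
  workerJvmOptions: "-XX:InitialRAMPercentage=25.0 -XX:MaxRAMPercentage=25.0 -XX:NativeMemoryTracking=summary"

executor:
  ...
  resources:
    requests:
      cpu: 1000m
      memory: 8Gi
    limits:
      cpu: 2000m
      memory: 8Gi
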
Recommended Configuration for Request Counts Between 250k and 1000k (Batch Size = 500)

Use the following configuration when the flow request count increases to the 250k–1000k range while keeping the batch size at 500.
Pod            | Minimum CPU (Request) | Minimum Memory (Request) | Recommended CPU (Limit) | Recommended Memory (Limit) | JVM Memory
lnk-executor   | 7000m (7 Cores)       | 12Gi-16Gi                | 8000m (8 Cores)         | 12Gi-16Gi                  | 3GB
lnk-rest       | 2000m (2 Cores)       | 8Gi                      | 2000m (2 Cores)         | 8Gi                        | 3GB
lnk-server     | 2000m (2 Cores)       | 8Gi                      | 2000m (2 Cores)         | 8Gi                        | No setting
lnk-client     | 1000m (1 Core)        | 1Gi                      | 1000m (1 Core)          | 1Gi                        | No setting
link-mongodb   | 1000m (1 Core)        | 2Gi                      | 1000m (1 Core)          | 4Gi                        | NA
link-redis     | 1000m (1 Core)        | 1Gi                      | 1000m (1 Core)          | 2Gi                        | NA
lnk-kafka-link | 250m (0.25 Core)      | 1.5Gi                    | 500m (0.5 Core)         | 1.5Gi                      | NA

JVM Memory: Heap, Non-Heap, and Container Limits

The Kubernetes memory limit (resources.limits.memory) applies to the entire container process—not just the JVM heap (-Xmx).

A JVM’s total memory footprint consists of:

  1. JVM Heap: Holds Java objects.
  2. Non-Heap Memory: Includes metaspace, thread stacks, code cache, and GC overhead.
  3. Native Memory: Used by C/C++ libraries through JNI.

If the combined memory usage of these three components exceeds the Kubernetes memory limit, the pod will be terminated with OOMKilled (Exit Code 137).

Best Practice: Set JVM Heap as a Percentage

Avoid hardcoding heap size (for example, -Xmx6g). Instead, use percentage-based flags so the JVM allocates heap memory based on the container’s total memory limit.

  • -XX:MaxRAMPercentage=40.0: Sets the maximum heap size (-Xmx) to 40% of the container’s memory limit, leaving the remaining ~60% for non-heap and native memory usage.
  • -XX:InitialRAMPercentage=40.0: Sets the initial heap size (-Xms) to the same percentage, improving startup performance.
  • -XX:NativeMemoryTracking=summary: Enables memory diagnostics to monitor JVM non-heap usage (required for the scripts below).

In the values.yaml file, configure these flags as:

  • rest.jvmOptions: For the Rest pod (as an array of strings).

  • rest.workerJvmOptions: For the Executor pod (as a single string). See the combined sketch below.

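Taken together, a sketch of both settings (reusing the 40% example from above) looks like this. For the executor, the next section explains why this percentage usually needs to be lower, because native C++ memory must also fit within the limit.
# Sketch: JVM flags for the Rest and Executor pods
rest:
  ...
  # Rest pod JVM options: array of strings
  jvmOptions:
    - "-XX:InitialRAMPercentage=40.0"
    - "-XX:MaxRAMPercentage=40.0"
    - "-XX:NativeMemoryTracking=summary"

  # Executor pod JVM options: single space-separated string
  workerJvmOptions: "-XX:InitialRAMPercentage=40.0 -XX:MaxRAMPercentage=40.0 -XX:NativeMemoryTracking=summary"
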
Executor Pod: JVM + Native C++ Code

This section applies to the Executor pod, which runs both JVM and native C++ components.

The JVM flag -XX:MaxRAMPercentage sizes only the JVM heap (Max_JVM_Heap), leaving headroom for JVM non-heap usage (Max_JVM_Non_Heap). It does not account for memory consumed by native C++ processes; you must factor that in manually.

Sizing Strategy:
K8s_Limit = Max_JVM_Heap + Max_JVM_Non_Heap + Max_C++_Memory + Buffer

To size this correctly:

  1. Measure Max_JVM_Non_Heap and Max_C++_Memory under load.

  2. Adjust Max_JVM_Heap (using rest.workerJvmOptions) so the total fits within the defined executor.resources.memory limit.

Scripts to Check Memory Usage (JVM vs. Native)

You can check memory usage by running these scripts inside the running executor pod. Start by connecting to the pod and finding the Java process ID (usually 1).

$ kubectl exec -it <your-executor-pod-name> -- /bin/bash
root@my-pod:/# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
java           1  5.5 10.1 5456252 833308 ?      Ssl  10:00   0:35 /usr/bin/java ...

Script 1: Check JVM Internal Memory (Heap vs. Non-Heap)

Use jcmd to get the JVM’s internal memory details.

Make sure -XX:NativeMemoryTracking=summary is set in rest.workerJvmOptions.
#!/bin/bash
# check_jvm_nmt.sh

# PID of the Java process
PID=1
echo "--- JVM Native Memory Tracking (NMT) ---"
jcmd $PID VM.native_memory summary
Example output:
Native Memory Tracking:

Total: reserved=5075788KB, committed=1310123KB  <-- (A) Total JVM Committed Memory
-                 Java Heap (reserved=4194304KB, committed=1048576KB)
-                     Class (reserved=135444KB, committed=24460KB)
-                    Thread (reserved=22420KB, committed=22420KB)
...

The key value is (A) Total Committed: 1310123KB (~1.25Gi) — this shows total memory tracked by the JVM.

Script 2: Check Total Process Memory (JVM + C++)

Use pmap to view the process’s total memory footprint (JVM + native C++).

You may need to install procps-ng in your container to use this command.
#!/bin/bash
# check_total_process_memory.sh

# PID of the Java process
PID=1
echo "--- Total Process Memory (pmap) ---"
pmap -x $PID | tail -n 1
Example Output:
total kB         5229348  1845180  1800180
                          ^
                          |
                        (B) Total RSS (Actual RAM)

The key value is (B) Total RSS: 1845180KB (~1.76Gi) — this shows the actual total memory used by the container.

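The two measurements can also be combined into a single helper that estimates the native (non-JVM) portion directly. The script below is illustrative rather than part of the product; it assumes the jcmd and pmap output formats shown above and that the Java PID is 1.
#!/bin/bash
# estimate_native_memory.sh (illustrative helper, not shipped with the product)
# Assumes Native Memory Tracking is enabled and the output formats shown above.

PID=${1:-1}

# (A) Total memory committed by the JVM, in KB, taken from the NMT line:
#     "Total: reserved=...KB, committed=...KB"
JVM_KB=$(jcmd "$PID" VM.native_memory summary | grep -oE 'committed=[0-9]+KB' | head -n 1 | tr -dc '0-9')

# (B) Total process RSS, in KB, from the last line of "pmap -x"
#     (column positions can differ between procps versions)
RSS_KB=$(pmap -x "$PID" | tail -n 1 | awk '{print $4}')

NATIVE_KB=$((RSS_KB - JVM_KB))
echo "JVM committed (A): ${JVM_KB} KB"
echo "Process RSS   (B): ${RSS_KB} KB"
echo "Native (B - A)   : ${NATIVE_KB} KB (~$((NATIVE_KB / 1024)) MB)"
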
Setting the Appropriate JVM Heap Size

You can calculate the optimal JVM heap size by identifying how much memory your native C++ components use.
  1. Calculate Native C++ Memory Usage
    Native C++ Memory ≈ (B) Total RSS - (A) Total JVM Committed
    1845180KB - 1310123KB = 535057KB (~522MB)

    This means your native C++ code uses about 522MB.

  2. Plan the Pod Memory Budget

    Assume your analysis shows:

    • C++ Memory: ~522MB

    • JVM Non-Heap: ~217MB

      You choose a 2Gi (2048MB) container limit for safety.

    Non-Heap Budget:
    C++_Memory: 522MB
    JVM_Non_Heap: 217MB
    Safety_Buffer (~12% of the limit): 250MB
    Total Non-Heap: 522 + 217 + 250 = 989MB
    Available for JVM Heap:
    2048MB (K8s Limit) - 989MB (Non-Heap) = 1059MB
  3. Convert Heap Size to Percentage
    Percentage = (Desired_Heap / K8s_Limit) * 100
    Percentage = (1059MB / 2048MB) * 100 = 51.7%
    Conclusion: You should round down to 50.0 and update your values.yaml as follows:
    # In your values.yaml
    
    rest:
      ...
      # This controls the JVM heap for the executor pods
      # It is a single string
      workerJvmOptions: "-XX:InitialRAMPercentage=50.0 -XX:MaxRAMPercentage=50.0 -XX:NativeMemoryTracking=summary"
    
    executor:
      ...
      # This sets the total container limit for the executor
      resources:
        requests:
          memory: "2Gi"  # <-- Request set equal to the new limit
        limits:
          memory: "2Gi"  # <-- Updated limit from our analysis
    

    This configuration ensures a balanced allocation between JVM heap, non-heap, and native memory — preventing OutOfMemory (OOM) errors while maximizing performance.
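
To repeat this budgeting for other measurements, the same arithmetic can be captured in a small script. This is an illustrative helper; the values are taken from the example above and should be replaced with your own measurements.
#!/bin/bash
# heap_percentage.sh (illustrative helper using the values from the example above)

K8S_LIMIT_MB=2048   # container memory limit
CPP_MB=522          # measured native C++ memory
NON_HEAP_MB=217     # measured JVM non-heap memory
BUFFER_MB=250       # safety buffer

HEAP_MB=$((K8S_LIMIT_MB - CPP_MB - NON_HEAP_MB - BUFFER_MB))
PCT=$(awk -v h="$HEAP_MB" -v l="$K8S_LIMIT_MB" 'BEGIN { printf "%.1f", h * 100 / l }')

echo "Heap budget        : ${HEAP_MB} MB"
echo "Heap as % of limit : ${PCT}%  (round down before use, for example to 50.0)"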