Storage
Defining Storage: PVs and PVCs
Persistent storage in Kubernetes is based on two key concepts:
- PersistentVolume (PV): The actual storage resource in the cluster (for example, an NFS share or a cloud disk). It is provisioned by an administrator or a dynamic provisioner and has a fixed capacity and access mode.
- PersistentVolumeClaim (PVC): A request for storage made by an application. It defines the required size (for example, 20 Gi) and access mode (for example, ReadWriteOnce).
Kubernetes binds a PVC to a suitable PV. The application’s pods then mount the PVC to access storage without knowing the details of the underlying hardware.
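As a concrete illustration of a claim, the sketch below prints a minimal PVC manifest that requests a size and an access mode. The helper name render_pvc and the claim name used with it are hypothetical and are not part of this application; the manifest fields are the standard PersistentVolumeClaim fields.

```shell
#!/usr/bin/env bash
# Sketch: print a minimal PersistentVolumeClaim manifest to stdout.
# render_pvc and the names passed to it are illustrative assumptions.
render_pvc() {
  local name="$1" size="$2" mode="${3:-ReadWriteOnce}" class="${4:-}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ${mode}
  resources:
    requests:
      storage: ${size}
EOF
  # Leaving storageClassName out lets the cluster's default StorageClass apply.
  if [ -n "$class" ]; then
    printf '  storageClassName: %s\n' "$class"
  fi
}
```

For example, render_pvc example-data 20Gi prints a 20Gi ReadWriteOnce claim, and piping that output to kubectl apply -f - would submit it to the cluster.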
Required PersistentVolumeClaims (PVCs)
The PVCs created depend on which components are enabled in your deployment.
Key Application PVCs
The following table lists the key application PVCs.

| PVC Name | Purpose | Default Size | Access Mode | Notes |
| --- | --- | --- | --- | --- |
| server-files | Stores essential files for the server component. | 20 Gi | ReadWriteOnce | None |
| rest-data (shared) | Main data volume for the REST pod. | 20 Gi | ReadWriteOnce | Shared with the server pod. Both must run on the same node. |

Dependency PVCs (If Deployed)
These are created by the included dependencies (Redis, MongoDB) if they are deployed as part of this application.
The following table provides the details of the dependency PVCs.

| PVC Name | Purpose | Default Size | Enabled When |
| --- | --- | --- | --- |
| redis-data | Stores Redis cache or database data if persistence is enabled. | Defined in Redis configuration. | Redis persistence enabled. |
| mongodb-data | Stores MongoDB data if persistence is enabled. | Defined in MongoDB configuration. | MongoDB persistence enabled. |

Optional PVCs
These are only created if you enable their features.
The following table provides the details of the optional PVCs.

| PVC Name | Purpose | Default Size | Access Mode | Enabled When |
| --- | --- | --- | --- | --- |
| kafkaLink-data | Stores Kafka-Link component data. | 2 Gi | ReadWriteOnce | Redis persistence enabled. |
| customConnectors-data | Stores custom connector data. | 10 Gi | ReadWriteOnce (or ReadWriteMany if configured) | MongoDB persistence enabled. |

StorageClasses
A StorageClass lets administrators define different types of storage they provide, such as fast-ssd (SSD-based), slow-hdd (HDD-based), or shared-nfs (NFS-based).
Check the available StorageClasses in your cluster using the following kubectl command:

    kubectl get storageclass
    # or, for short:
    kubectl get sc

For example, on AWS you will see output like the following. The StorageClass marked (default) is the important one, because it is used when a PVC does not name a StorageClass.

    NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    gp2 (default)   kubernetes.io/aws-ebs   Delete          Immediate           true                   120d
    nfs-client      cluster.local/nfs       Retain          Immediate           false                  80d

How StorageClasses Are Used
- This application can be configured to use your cluster's default StorageClass (like gp2 in the example above).
- Alternatively, you can configure it to use a specific StorageClass by name (for example, nfs-client).
- Access Mode Note: By default, the application requests ReadWriteOnce (RWO), which works with most cloud block storage (gp2, azure-disk). To share a volume across multiple nodes, you need a provider that supports ReadWriteMany (RWX), such as NFS, and you must set the accessMode to ReadWriteMany.
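To pick out the default StorageClass in a script, the table printed by kubectl get sc can be filtered for the row marked (default). The following is a minimal sketch; the helper name default_storage_class is hypothetical, and it assumes the default table layout shown in this section.

```shell
#!/usr/bin/env bash
# Sketch: read `kubectl get sc` table output on stdin and print the name of
# the StorageClass marked "(default)". Prints nothing if no class is marked.
# default_storage_class is a hypothetical helper name.
default_storage_class() {
  # Skip the header row; in the marked row, "(default)" is the second field.
  awk 'NR > 1 && $2 == "(default)" { print $1; exit }'
}
```

It would be used as kubectl get sc | default_storage_class. Parsing human-readable output is fragile, so treat this as a convenience for interactive use rather than a stable interface.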
Reclaim Policy Explained
The reclaim policy is set on the StorageClass (reclaimPolicy) and on each PersistentVolume (persistentVolumeReclaimPolicy); dynamically provisioned PVs inherit the policy of their StorageClass. It defines what Kubernetes does with a PV and its underlying data after the associated PersistentVolumeClaim (PVC) is deleted.
Retain (Recommended for Production):
- When you delete the PVC, the PV remains intact.
- The data stored on the underlying disk is preserved.
- An administrator must manually back up the data and delete the PV to release the resource.
- Use this option for all production environments where data persistence is critical.
Delete (Recommended for Development/Test):
- When you delete the PVC, the PV and its underlying storage (for example, an AWS EBS volume) are automatically deleted.
- All stored data is permanently lost.
- This is the default setting for many cloud storage classes, such as gp2 on AWS.
How to Check Your Policy:

    kubectl get sc <your-storage-class-name> -o yaml

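The check above can be narrowed to the single field of interest, and an existing PV can be switched to Retain with a patch. The sketch below wraps both in helpers; the function names are hypothetical, while the kubectl subcommands and field names are the standard ones.

```shell
#!/usr/bin/env bash
# Sketch: inspect and change reclaim policies with kubectl.
# sc_reclaim_policy and retain_pv are hypothetical helper names.

# Print the reclaim policy that a StorageClass gives to newly provisioned PVs.
sc_reclaim_policy() {
  kubectl get sc "$1" -o jsonpath='{.reclaimPolicy}'
}

# Switch an existing PV to Retain so its data survives deletion of the PVC.
retain_pv() {
  kubectl patch pv "$1" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}
```

For example, sc_reclaim_policy gp2 would print Delete for the gp2 class shown earlier. Patching a PV affects only that volume; volumes provisioned later still get the policy of their StorageClass.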
How to Plan Volume Sizes
The application components have default sizes (for example, 20Gi). Plan for your specific needs as follows:
- Calculate the Baseline
  A standard deployment (Server, Rest, Redis, MongoDB) requires at least the combined size of the PVCs of all enabled components. A typical baseline starts around 60Gi, but verify the values for your specific configuration.
- Estimate Growth
  Plan for future data growth by considering the following:
  - server-files: Estimate the number of files, maps, or configurations added per week.
  - rest-data: Estimate how much transactional data is generated by the Rest service.
  - mongodb-data: This is the primary database. Estimate daily record creation and average record size. This volume usually grows the fastest.
- Monitor Usage
  Continuous monitoring provides the most accurate data for scaling decisions.
  - Use monitoring tools such as Prometheus and Grafana to track PVC disk usage.
  - Alternatively, check disk usage manually by running df -h inside a pod that mounts the volume.
- Enable Volume Expansion (if supported)
  If your default StorageClass supports volume expansion (allowVolumeExpansion: true), you can resize PVCs as needed. For example, if the rest-data PVC is running out of space:

      kubectl edit pvc rest-data-pvc

  Update the size value (spec.resources.requests.storage), for example from 20Gi to 50Gi.
  Kubernetes then resizes the underlying disk. Depending on the storage driver, this happens online without downtime, or the pod must be restarted before the file system grows.
  Action: Verify that your default StorageClass supports volume expansion.
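The monitoring and expansion steps above can also be run non-interactively. The following sketch wraps the relevant kubectl calls; the function names are hypothetical, and it assumes the container image provides df and that the commands run against the current namespace.

```shell
#!/usr/bin/env bash
# Sketch: check and grow a PVC with kubectl. Function names are hypothetical.

# Print "true" when the StorageClass allows volume expansion.
sc_allows_expansion() {
  kubectl get sc "$1" -o jsonpath='{.allowVolumeExpansion}'
}

# Show disk usage of a mount point from inside a pod that mounts the PVC.
pod_disk_usage() {
  kubectl exec "$1" -- df -h "$2"
}

# Request a new size for a PVC by patching spec.resources.requests.storage.
resize_pvc() {
  kubectl patch pvc "$1" -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$2\"}}}}"
}
```

For example, resize_pvc rest-data-pvc 50Gi requests the same change as the kubectl edit step above. A PVC can only be grown, not shrunk, so choose the new size deliberately.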
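The baseline and growth estimates from the planning steps can be reduced to simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the Redis and MongoDB sizes in the example are placeholders, because their real defaults come from the Redis and MongoDB configuration.

```shell
#!/usr/bin/env bash
# Sketch: sizing arithmetic for PVC planning. Helper names and the sample
# numbers used with them are illustrative assumptions.

# Sum sizes given in Gi (for example "20Gi" or "20") and print the total in Gi.
sum_gi() {
  local total=0 size
  for size in "$@"; do
    total=$(( total + ${size%Gi} ))
  done
  echo "$total"
}

# Print the whole days until a volume is full, given its capacity and current
# usage in Gi and an estimated growth rate in Mi per day (must be above 0).
days_until_full() {
  local capacity_gi="$1" used_gi="$2" growth_mi_per_day="$3"
  echo $(( (capacity_gi - used_gi) * 1024 / growth_mi_per_day ))
}
```

With server-files and rest-data at 20Gi each and two placeholder 10Gi dependency volumes, sum_gi 20Gi 20Gi 10Gi 10Gi prints 60. A 20Gi volume holding 8Gi that grows by 512Mi per day gives days_until_full 20 8 512, which prints 24.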
Persistent Volumes (PV) and Persistent Volume Claims (PVC) in Link
Link uses multiple persistent volumes (PVs) to store file artifacts and other data. During installation, the necessary persistent volume claims (PVCs) are automatically created to provision and reference the required PVs.
The following table lists the PVCs used in the Link installation.
Note: The values shown in the PVC column are suffixes of the names given to the PVCs when the Link installation creates them automatically. The full generated PVC names vary from one installation to another and are based on the release name provided in the helm install command.

| PVC | Component | Container Mount Point | Primary Use |
| --- | --- | --- | --- |
| rest-data (see note 1) | rest, executor | /data | Files uploaded via Runtime API (Data Storage), Flow Engine files |
| server-files | server | /opt/data/hipfiles | Compiled schemas and maps, files uploaded through UI |
| server-data (see note 2) | server | /data | Files uploaded via Deployment API (Data Storage) |
| mongodb (see note 3) | mongo | /data | Mongo repository data |
| redis-master-0 (see note 4) | redis | /data | Redis persisted data |
| custom-connectors (see note 5) | rest, executor, server | /opt/custom/connectors | Files for the custom connectors |

1. Shared PVC: The rest-data PVC is always shared between the Rest and Executor pod replicas.
2. Conditional PVC: The server-data PVC exists only if the server component does not share the rest-data PVC (server.persistence.data.shareWithRest=false).
   - By default, the server shares the rest-data PVC with the Rest component.
   - In that case, files stored in the rest-data PVC are accessible at the /data mount point in both the server and rest pods.
3. The mongodb PVC is created only if MongoDB is deployed with Link (mongo.deploy=true).
   - If Link uses an external MongoDB instance, those PVs are owned and managed externally.
4. The redis-master-0 PVC is created only if Redis is deployed with Link (redis.deploy=true).
   - If Link uses an external Redis instance, those PVs are also managed externally.
   - When installed with the default chart parameters, Redis creates this PVC through a StatefulSet. The PV remains in the cluster after uninstalling Link and must be deleted manually for cleanup.
5. When custom connectors are enabled (customConnectors.enabled=true), the custom-connectors PVC is shared among the server, rest, and executor pods and is mounted at /opt/custom/connectors.
If a subPath is configured for the /data mount in the Rest pod, it applies to both the Rest and Executor pods.
If a subPath is configured for the /data mount in the Server pod, it applies only if the directory is not shared with the Rest pod. If sharing is enabled, the subPath from the Rest pod takes precedence.
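The chart values named in the notes above are set at install time. The following sketch shows one way to pass them with --set; the function name install_link is hypothetical, and the release name and chart reference are placeholders because this section does not give the actual chart location.

```shell
#!/usr/bin/env bash
# Sketch: pass the persistence-related chart values from the notes to helm.
# install_link is a hypothetical helper; release and chart are placeholders
# supplied by the caller.
install_link() {
  local release="$1" chart="$2"
  helm install "$release" "$chart" \
    --set mongo.deploy=true \
    --set redis.deploy=true \
    --set customConnectors.enabled=true \
    --set server.persistence.data.shareWithRest=false
}
```

With these values, the notes imply that the mongodb, redis-master-0, custom-connectors, and a separate server-data PVC are all created. Leaving shareWithRest at its default makes the server reuse rest-data instead.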
By default, Link dynamically provisions PVCs with a Delete reclaim policy. This causes the PVCs and their PVs to be automatically deleted when Link is uninstalled, except for the Redis master PVC, which must be deleted manually.
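After an uninstall, the leftover Redis claim can be located by its suffix, since full PVC names depend on the release name. This is a minimal sketch with a hypothetical helper name; it assumes Link was installed into the namespace you pass in.

```shell
#!/usr/bin/env bash
# Sketch: list PVCs left behind after uninstalling Link.
# leftover_redis_pvcs is a hypothetical helper; it matches on the
# redis-master-0 suffix because full PVC names vary per release.
leftover_redis_pvcs() {
  local namespace="$1"
  kubectl get pvc -n "$namespace" -o name | grep 'redis-master-0$'
}
```

Each name it prints has the form persistentvolumeclaim/<name> and can be removed with kubectl delete -n <namespace> followed by that name. Review the list before deleting, because the data cannot be recovered under a Delete reclaim policy.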
To retain your storage after uninstalling Link:
- Pre-provision and configure the required PVCs and PVs before installation.
- Reference these PVCs during installation in your Helm command (see Preparing PV & PVC for details).
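As an illustration of the pre-provisioning step, the sketch below prints a statically provisioned PV with a Retain policy and a PVC bound to it by name. The helper name, the object names, and the NFS server and path are placeholders rather than values used by Link, and NFS is only one possible backend. How the claim is then referenced in the Helm command is covered in Preparing PV & PVC and is not shown here.

```shell
#!/usr/bin/env bash
# Sketch: print a static PV with a Retain policy plus a PVC that binds to it.
# render_retained_volume, the object names, and the NFS details are
# placeholders, not values taken from the Link chart.
render_retained_volume() {
  local name="$1" size="$2" nfs_server="$3" nfs_path="$4"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${name}-pv
spec:
  capacity:
    storage: ${size}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: ${nfs_server}
    path: ${nfs_path}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: ${name}-pv
  resources:
    requests:
      storage: ${size}
EOF
}
```

The empty storageClassName on both objects keeps the claim from being dynamically provisioned, and volumeName pins it to the pre-created PV. The output can be piped to kubectl apply -f - before running the installation.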