CDP System Requirements

The deployment has two parts:

  • Microservices-based components, deployed on a containerization platform. Red Hat OpenShift is the container orchestration platform used here.
  • Virtual machine (VM) based components.

Before installing the required components, ensure that Red Hat OpenShift is installed and properly configured. Refer to the Red Hat OpenShift documentation for installation guidelines.

The following tools must also be present in the deployment environment to facilitate the installation:

  1. OpenShift CLI ("oc" command)
  2. Kubernetes command line tool ("kubectl" command)
  3. Helm CLI tool ("helm" command)
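
As a quick pre-installation check, the presence of these three commands can be verified from a script. The following is a minimal sketch, assuming Python 3 is available on the machine that runs the installation. The function name find_missing_tools is hypothetical and only used for illustration. The check confirms that each command is on the PATH; it does not verify versions or cluster connectivity.

```python
import shutil

# Command names are taken from the tool list above.
REQUIRED_TOOLS = ("oc", "kubectl", "helm")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that are not found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required CLI tools are on PATH.")
```

An empty result means all three commands were found. Any name that is reported as missing should be installed before continuing.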

Hardware Requirements

The minimum hardware requirements for the CDP deployment are listed below.

Standalone VMs

Server/Node            | No. of Servers/Nodes | CPU                        | Memory    | Storage         | Network
Bastion host           | 1                    | 2 vCPUs for a 7h 12m burst | 8.0 GiB   | EBS only        | Low to Moderate
Aerospike DB           | 2                    | 4 vCPUs                    | 16.0 GiB  | 150 GB NVMe SSD | Up to 10 Gigabit
DMP server             | 1                    | 2 vCPUs                    | 8.0 GiB   | EBS only        | Up to 12.5 Gigabit
Druid server           | 2                    | 16 vCPUs                   | 128.0 GiB | EBS only        | Up to 10 Gigabit
Mongo DB Server        | 4                    | 4 vCPUs                    | 16.0 GiB  | 150 GB NVMe SSD | Up to 10 Gigabit
Mongo DB Server        | 1                    | 2 vCPUs for a 4h 48m burst | 2.0 GiB   | EBS only        | Up to 5 Gigabit
Nifi DB                | 3                    | 2 vCPUs                    | 16.0 GiB  | 75 GB NVMe SSD  | Up to 10 Gigabit
SFTP Server            | 1                    | 2 vCPUs                    | 8.0 GiB   | EBS only        | Up to 12.5 Gigabit
postgres-1             | 2                    | 2 vCPUs                    | 8.0 GiB   | EBS only        | Up to 12.5 Gigabit
RabbitMq               | 1                    | 2 vCPUs                    | 8.0 GiB   | EBS only        | Up to 10 Gigabit
TC Redis               | 1                    | 2 vCPUs                    | 16.0 GiB  | EBS only        | Up to 10 Gigabit
Redis & RMQ for Celery | 1                    | 2 vCPUs                    | 16.0 GiB  | EBS only        | Up to 10 Gigabit
Scheduler              | 1                    | 2 vCPUs                    | 16.0 GiB  | EBS only        | Up to 10 Gigabit

EKS Cluster

Server/Node | CPU      | Memory   | Storage  | Network
3 Nodes     | 32 vCPUs | 64.0 GiB | EBS only | 12.5 Gigabit
18 Nodes    | 4 vCPUs  | 16.0 GiB | EBS only | Up to 12.5 Gigabit
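
For capacity planning, it can help to total the EKS cluster figures. The sketch below assumes that the CPU and Memory columns are per-node values, which the table does not state explicitly. The name cluster_totals is hypothetical.

```python
# (node count, vCPUs per node, memory in GiB per node), copied from the EKS Cluster table.
# Assumption: the CPU and Memory columns are per-node values.
EKS_NODE_GROUPS = [
    (3, 32, 64.0),
    (18, 4, 16.0),
]


def cluster_totals(groups):
    """Sum node count, vCPUs, and memory (GiB) across all node groups."""
    nodes = sum(count for count, _, _ in groups)
    vcpus = sum(count * cpu for count, cpu, _ in groups)
    memory_gib = sum(count * mem for count, _, mem in groups)
    return nodes, vcpus, memory_gib


nodes, vcpus, memory_gib = cluster_totals(EKS_NODE_GROUPS)
print(f"{nodes} nodes, {vcpus} vCPUs, {memory_gib} GiB")
# prints "21 nodes, 168 vCPUs, 480.0 GiB"
```

Under that per-node assumption, the cluster needs 3 x 32 + 18 x 4 = 168 vCPUs and 3 x 64 + 18 x 16 = 480 GiB of memory in total.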

EMR Cluster

Server/Node  | CPU     | Memory   | Storage  | Network
Primary Node | 4 vCPUs | 16.0 GiB | EBS only | Up to 10 Gigabit
Core Node    | 4 vCPUs | 16.0 GiB | EBS only | Up to 10 Gigabit

Software Requirements

The following components must be deployed and configured before the CDP Core modules are deployed. CDP relies on these components to perform its end-to-end operations.

Application/Service           | Deployment Type            | Version
HashiCorp Vault               | Helm chart                 | Vault v1.17.2
Red Hat Quay                  | Operator                   |
RHBK                          | Operator, OpenShift RBAC   | NA
NFS                           | Storage Class on OpenShift |
AMQ streams                   | OpenShift Operator         |
SMTP server                   |                            |
Trino                         | Helm chart                 | v0.7.0
Apache Spark (Spark ETL jobs) | Helm chart                 | 3.5.1
HAProxy (route/ingress)       | Route on OpenShift         |
MinIO                         | Helm chart                 | v5.0.15
Apache Airflow                | Helm chart                 | 2.9.2
Stackable Operator for Apache Spark (certified) / Spark Helm Operator (Community) | Helm chart | 24.3.0
PostgreSQL                    | PgSQL on VM                |
AMQ broker                    | OpenShift Operator         |
3Scale Api-Gateway            |                            |