Performance Sizing
Data Volume Overview
| Audiences | SRC_CH (Contacts) | SRC_CH (Responses) |
|---|---|---|
| 100,000 | 11 Million | 6 Million |
Airflow (for DAG Orchestration)
| CPU | RAM | Disk |
|---|---|---|
| 8 cores | 32 GB | 500 GB |
Canonical DB (Oracle)
| CPU | RAM | Disk | Oracle Node |
|---|---|---|---|
| 16 cores | 64 GB | 500 GB | 1 |
Canonical DB (SQL Server)
| CPU | RAM | Disk | SQL Server Node |
|---|---|---|---|
| 16 cores | 72 GB | 750 GB | 1 |
Snowflake sizing is workload-driven and primarily governed by Virtual Warehouse configuration. Initial recommendation is to start with Medium warehouse for ETL and Small for interactive workloads, and scale based on concurrency and query performance.
Note: This is minimum sizing. Actual production environments
require higher specifications depending on concurrency, ETL frequency, query
complexity, number of DAGs, task concurrency, and data volume.
Tuning Recommendations
For optimal performance, database tuning should be carried out by an experienced Database Administrator (DBA). The following recommendations provide guidance for improving scalability, performance, and reliability.
Core Database Optimization
- Cluster & High Availability
- Add database nodes and enable Oracle RAC to improve load balancing, reduce contention, and increase throughput.
- Storage & Compute Scaling
- Scale memory to 128 GB–256 GB RAM.
- Partitioning Strategy
- Implement table partitioning to reduce I/O and improve query performance on large datasets.
- Optimizer & Statistics
- Refresh optimizer statistics regularly to ensure efficient execution plans and consistent performance.
ETL & Data Processing Optimization
- Use bulk loading mechanisms to improve ingestion performance.
- Implement incremental data loads instead of full refreshes.
- Schedule heavy ETL workloads during off-peak hours.
- Optimize CSV generation by increasing chunk size.
- Use parallel threads aligned with redo log capacity for faster execution.
Backup, Recovery & Monitoring
- Perform daily RMAN backups and manage archive logs effectively.
- Regularly test recovery procedures to ensure reliability.
- Monitor system performance using AWR/ASH reports.
- Set alerts for critical thresholds (CPU, memory, I/O, blocking).
Scalability & Future Readiness
- Adopt a multi-node architecture for horizontal scalability.
- Separate compute and storage layers where applicable.
- Ensure schema design supports growth in Customer History (CH) and Response History (RH) datasets.
New Feature Optimization (Flowchart 360)
- Optimize queries handling execution-level granular data.
- Create indexes on:
flowchart_idcampaign_codeexecution timestamps
- Pre-aggregate frequently used metrics (clicks, opens, responses) to reduce runtime computation.
ML Model Performance (STO & NBC)
- Optimize feature extraction queries from Customer 360.
- Use materialized views or precomputed feature tables.
- Schedule feature generation during off-peak hours.
- Ensure efficient write-back of ML outputs into Customer 360 for downstream use.
Multi-Database Tuning Considerations
- Oracle
- Continue leveraging:
- Oracle RAC for clustering
- Table partitioning
- AWR/ASH for performance monitoring
- Snowflake
- Size Virtual Warehouses (X-Small → X-Large) based on workload.
- SQL Server
- Enable:
- In-Memory OLTP
- Columnstore Indexes for analytical workloads