Performance Sizing Awareness
This section helps operations teams recognize performance deviations and know when to escalate. DBA teams own tuning decisions.
The Service Implementation Team is not responsible for performance tuning decisions, but it must be able to recognize when observed job durations deviate significantly from expected baselines and escalate appropriately. This section provides the Service Implementation Team with reference context drawn from the HCL MaxAI CDM Redbook (Performance Sizing) and the Release Notes performance summary.
Baseline Performance Expectations (Service Implementation Team Reference)
| Performance Aspect | Expected Behavior | Service Implementation Team Action if Deviated |
|---|---|---|
| T+1 Ingestion SLA | Expected to complete within the T+1 batch window at banking-scale volumes | Escalate to DBA if SLA is consistently breached. On SQL Server backends, also notify L3/Service Team for JDBC tuning assessment. |
| Airflow Parallelization | Airflow parallelization is used to reduce load window durations. Multi-task DAGs are expected to run in parallel where configured. | If DAG tasks that normally run in parallel begin queuing sequentially, escalate to the Service Team. This may indicate an Airflow worker resource constraint. |
| Partition-Optimized Structures | Canonical tables use partition-optimized structures for analytical workloads. Query and load times against 360 views should be consistent and predictable. | Sustained degradation in 360 view query or materialization time should be escalated to DBA. |
| Snowflake Auto-Suspend | The Snowflake Canonical virtual warehouse auto-suspends during idle periods to minimize compute cost and auto-resumes on demand. | First-task latency at DAG start on Snowflake is expected. If the warehouse fails to resume, escalate to DBA (possible credit or permission issue). |
When Service Implementation Team Must Escalate Performance Issues
The Service Implementation Team must escalate to DBA when any of the following are observed, without attempting to tune or modify pipeline configurations independently:
- Job duration exceeds 200% of the documented historical baseline for that pipeline.
- T+1 SLA is breached for two or more consecutive batch runs on any backend.
- Storage capacity alerts are raised on the Canonical database host (Oracle) or cloud billing alerts are triggered (Snowflake).
- SQL Server CDC commit latency warnings appear repeatedly in JDBC logs.
For full data volume benchmarks and sizing reference tables, refer to the HCL MaxAI CDM Redbook, Performance Sizing. The Service Implementation Team does not need to interpret sizing tables directly but should be able to provide the Redbook reference to DBA and ETL Developer during performance escalations.
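The first two escalation thresholds above (job duration over 200% of baseline, and two or more consecutive T+1 SLA breaches) can be expressed as a simple check. The following is a minimal Python sketch, not part of the product: the `RunRecord` structure, the function names, and the idea of supplying run history as a list are assumptions made for illustration. Baseline values would come from the documented historical baseline for each pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunRecord:
    """One batch run of a pipeline (hypothetical structure for illustration)."""
    duration_minutes: float
    met_t1_sla: bool


def exceeds_baseline(duration_minutes: float, baseline_minutes: float) -> bool:
    """True when the run took more than 200% of the documented baseline."""
    return duration_minutes > 2.0 * baseline_minutes


def consecutive_sla_breaches(runs: List[RunRecord]) -> int:
    """Count uninterrupted SLA breaches at the end of the history (latest run last)."""
    count = 0
    for run in reversed(runs):
        if run.met_t1_sla:
            break
        count += 1
    return count


def should_escalate_to_dba(runs: List[RunRecord], baseline_minutes: float) -> bool:
    """Apply the duration and consecutive-breach thresholds to the run history."""
    if not runs:
        return False
    latest = runs[-1]
    return (
        exceeds_baseline(latest.duration_minutes, baseline_minutes)
        or consecutive_sla_breaches(runs) >= 2
    )
```

A check like this only flags the condition. The escalation itself still goes to DBA, and the sketch deliberately makes no attempt to tune or modify any pipeline configuration.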
ML Model and Flowchart 360 Performance — Service Implementation Team Escalation Reference
The following additional performance indicators are introduced for the ML Model Integration and Flowchart 360 components. When an indicator is breached, the Service Implementation Team must escalate to the team named for that indicator (DBA, ETL Development, or MaxAI).
Flowchart 360 Processing:
- Database indexes should exist on columns frequently used by Flowchart 360 processing, including flowchart_id, campaign_code, and execution timestamp fields.
- If Flowchart 360 model execution consistently exceeds the T+1 processing window, escalate to the DBA team. This typically indicates missing indexes, inefficient execution plans, or large unpartitioned table scans in upstream source tables.
- Pre-aggregated Flowchart 360 metrics, including clicks, opens, and responses per execution, should be populated after each scheduled run.
- If these metrics are unexpectedly null or zero, verify that the upstream BDV Flowchart source data is available and populated before escalating the issue.
- Historical Flowchart 360 data is generated only for flowcharts that contain engagement-related metrics. Flowcharts without engagement data are excluded from historical Flowchart 360 processing.
- Flowchart 360 processing is implemented using metadata-driven dbt models. Any changes to model logic, incremental processing behavior, or aggregation rules should be reviewed by the ETL Development team before deployment.
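The "verify upstream before escalating" rule for null or zero metrics can be sketched as a small triage function. This is an illustrative Python sketch under stated assumptions: the metric keys, the `upstream_row_count` input, and the returned status strings are hypothetical, and treating the metrics as empty only when all three are null or zero is one possible reading of "unexpectedly null or zero". How the upstream BDV Flowchart row count is obtained is not defined here.

```python
from typing import Mapping, Optional

# Hypothetical metric keys, based on the metrics named in this section.
METRIC_NAMES = ("clicks", "opens", "responses")


def metrics_look_empty(metrics: Mapping[str, Optional[int]]) -> bool:
    """True when every pre-aggregated metric is missing, null, or zero."""
    return all(not metrics.get(name) for name in METRIC_NAMES)


def triage_flowchart_metrics(
    metrics: Mapping[str, Optional[int]], upstream_row_count: int
) -> str:
    """Return a status for one scheduled run of Flowchart 360 processing."""
    if not metrics_look_empty(metrics):
        return "ok"
    if upstream_row_count == 0:
        # The upstream source is not populated, so the empty metrics are explained.
        return "upstream_not_populated"
    # Upstream data exists but the metrics are empty: escalate.
    return "escalate"
```

For example, `triage_flowchart_metrics({"clicks": 0, "opens": 0, "responses": 0}, 500)` returns `"escalate"`, because upstream data is present but no metrics were produced.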
ML Model Integration (STO / NBC write-back):
- Feature extraction queries run against Customer 360. If ML ingestion DAG durations increase significantly, escalate to the ETL Development team. This may indicate missing materialized views, stale aggregates, or unoptimized feature extraction queries.
- ML prediction write-back into Customer 360 (NBC / STO attributes) should complete within the T+1 batch window. If the Customer 360 enrichment timestamp is stale, check the ML ingestion DAG status in Airflow and escalate to the MaxAI team.
- Feature generation is watermark-driven and incremental. Do not trigger a full refresh of Customer 360 feature views without ETL Developer confirmation; this is outside Service Implementation Team scope.
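The stale-timestamp check for ML write-back can be illustrated with a short Python sketch. The function names and the 24-hour `max_age` default are assumptions standing in for the actual T+1 batch window, which should be taken from the documented schedule. How the enrichment timestamp is read from Customer 360 is not covered here.

```python
from datetime import datetime, timedelta, timezone


def enrichment_is_stale(
    last_enriched_at: datetime,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed stand-in for the T+1 window
) -> bool:
    """True when the last enrichment is older than the allowed window."""
    return now - last_enriched_at > max_age


def ml_writeback_action(last_enriched_at: datetime, now: datetime) -> str:
    """Map the staleness check to the next step described in this section."""
    if enrichment_is_stale(last_enriched_at, now):
        return "check ML ingestion DAG status in Airflow; escalate to MaxAI team"
    return "no action"


if __name__ == "__main__":
    # Example with fixed, timezone-aware timestamps.
    now = datetime(2024, 1, 3, 6, 0, tzinfo=timezone.utc)
    last = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
    print(ml_writeback_action(last, now))
```

Both timestamps should be timezone-aware (or both naive) so that the subtraction is valid.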
For full performance benchmarks, tuning recommendations, and sizing tables by backend, refer to the Performance Sizing chapter in the HCL MaxAI CDM Redbook. The Service Implementation Team should share this reference with DBA and ETL Developer during any performance escalation.