Service Implementation Team Runbooks and Ownership
This section is for operational teams responsible for executing documented runbooks for recurring tasks and issue resolution.
Runbooks describe the step-by-step operational procedures that the Service Implementation Team must follow for recurring tasks and common issues.
Ownership
| Role | Responsibility |
|---|---|
| Authoring and Updates | Product Engineering / Implementation team, with inputs from Tech BA, ETL Developer, and Metadata Architect. |
| Execution | Client Service Implementation Team (Operations). |
| Approval | Implementation Manager / Project Manager, in alignment with client operations lead. |
Runbooks should explicitly identify which steps are Service Implementation Team-only, and which steps require escalation before execution.
Minimum Runbooks Required
At minimum, the following runbooks must exist and be accessible to the Service Implementation Team:
- Runbook 1. Daily/Batch Execution Runbook
  - How to verify that all scheduled jobs for the Canonical stack have run:
    - Marketing Data Mart → LDZ ingestion
    - LDZ → RDV transformations
    - RDV / Mart → Customer 360 / Campaign 360 / Flowchart 360 / cdm_publish_db
    - ML Model write-back: cdm_ingest_db → Customer 360 enrichment pipeline
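
The verification described in Runbook 1 could be scripted along the following lines. This is a minimal sketch, not the team's actual tooling: the stage labels, the `JobRun` record, and the status values are all hypothetical, and it assumes job run history has already been pulled from the scheduler into a list.

```python
from dataclasses import dataclass

# Hypothetical stage labels, listed in the same order as the runbook.
EXPECTED_STAGES = [
    "marketing_data_mart_to_ldz",
    "ldz_to_rdv",
    "rdv_mart_to_360_publish",
    "ml_writeback_enrichment",
]


@dataclass
class JobRun:
    stage: str       # one of EXPECTED_STAGES
    batch_date: str  # e.g. "2024-01-31"
    status: str      # assumed values: "success", "failed", "running"


def unverified_stages(runs, batch_date):
    """Return expected stages with no successful run for batch_date, in pipeline order."""
    succeeded = {
        r.stage for r in runs
        if r.batch_date == batch_date and r.status == "success"
    }
    return [stage for stage in EXPECTED_STAGES if stage not in succeeded]
```

An empty result means every expected stage has at least one successful run for the batch date; any stage returned needs follow-up under the Failure Handling Runbook.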
- Runbook 2. Failure Handling Runbook (Per Layer)
  - LDZ load failure (missing file, bad format, connectivity).
  - RDV load failure (constraint violation, key mismatch, DV pattern violation).
  - Interface Layer failure (schema mismatch, mapping metadata issue, transformation error).
  - 360-layer failure (aggregations, joins, pre-aggregated tables, materialized views) covering Customer 360, Campaign 360, and Flowchart 360.
  - ML Model ingestion / enrichment failure (cdm_ingest_db write-back or Customer 360 enrichment step).
- Runbook 3. Rerun and Backfill Runbook
  - How to safely re-run a failed job for a specific date/batch.
  - How to handle partial loads.
  - Preconditions and post-checks for reruns.
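
The "preconditions, rerun, post-checks" pattern in Runbook 3 can be illustrated with a small wrapper. This is a sketch under assumptions: the function name, the check names, and the shape of the result are hypothetical, and the real preconditions and post-checks are whatever the rerun runbook documents.

```python
def rerun_batch(batch_date, preconditions, run_job, post_checks):
    """Re-run one batch only if every precondition passes, then run post-checks.

    preconditions and post_checks are lists of (name, check) pairs, where
    check(batch_date) returns True when the condition holds.
    run_job(batch_date) performs the actual rerun.
    """
    failed_pre = [name for name, check in preconditions if not check(batch_date)]
    if failed_pre:
        # Do not touch the data if any precondition fails; escalate instead.
        return {"ran": False, "failed_preconditions": failed_pre, "failed_post_checks": []}

    run_job(batch_date)

    failed_post = [name for name, check in post_checks if not check(batch_date)]
    return {"ran": True, "failed_preconditions": [], "failed_post_checks": failed_post}
```

Keeping the checks as named callables means the outcome records exactly which condition blocked the rerun or failed afterwards, which is the information an escalation would need.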
- Runbook 4. Release and Configuration Change Runbook
  - How to validate the environment after a code/config deployment.
  - Which basic smoke tests the Service Implementation Team must execute.
- Runbook 5. Basic Data Validation Runbook (Non-Business)
  - Row-count comparisons between layers where defined (e.g., LDZ vs RDV).
  - Key technical checks (duplicate keys, nulls in non-null technical columns) when explicitly documented.
  - Flowchart 360 validation: flowchart_id present, execution_count > 0, pre-aggregated metric columns populated.
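
The checks in Runbook 5 map naturally onto simple SQL queries. The sketch below uses Python's built-in `sqlite3` module only so that it is self-contained; the helper names are hypothetical, table and metric column names are supplied by the caller, and the SQL may need adapting for Oracle, SQL Server, or Snowflake. Only `flowchart_id` and `execution_count` are taken from the runbook text.

```python
import sqlite3


def row_count(conn, table):
    """Count rows in a table. Table names must come from the runbook, not user input."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def duplicate_keys(conn, table, key_col):
    """Return key values that appear more than once in the table."""
    sql = (
        f"SELECT {key_col} FROM {table} "
        f"GROUP BY {key_col} HAVING COUNT(*) > 1 ORDER BY {key_col}"
    )
    return [row[0] for row in conn.execute(sql)]


def invalid_flowchart_rows(conn, table, metric_cols):
    """Count rows failing the Flowchart 360 checks: missing flowchart_id,
    execution_count not greater than 0, or any listed metric column NULL."""
    conditions = [
        "flowchart_id IS NULL",
        "execution_count IS NULL",
        "execution_count <= 0",
    ] + [f"{col} IS NULL" for col in metric_cols]
    sql = f"SELECT COUNT(*) FROM {table} WHERE " + " OR ".join(conditions)
    return conn.execute(sql).fetchone()[0]
```

A row-count comparison is then just `row_count(conn, ldz_table)` against `row_count(conn, rdv_table)`, with the expected relationship between the two (equal, or within a documented tolerance) taken from the runbook for that layer pair.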
- Runbook 6. Oracle/SQL Server/Snowflake Setup DAG Execution Runbook (Applicable for Cloud Native Setup)

Required for all Oracle/SQL Server/Snowflake schema environment setups and rebuilds.

Three new Airflow DAGs must be run in sequence during initial Oracle/SQL Server/Snowflake CDM setup or any environment rebuild. The Service Implementation Team must have a documented procedure for each step.

| DAG Name | Purpose | When to Run |
|---|---|---|
| airflow_variable_sync | Syncs all required Airflow variables for the Oracle environment. | First, before any other DAG on a new setup. |
| ddl_execution_dag_multidb | Automatically creates all Oracle/SQL Server/Snowflake schemas, tables, and views. | Once during initial Oracle/SQL Server/Snowflake CDM setup or rebuild. |
| etl_date_control_update_dag | Updates the ETL date control table with correct business dates. | Before every load run: Day 0 and each BAU run. |

Important: These DAGs replace manual DDL script execution. The Service Implementation Team must not execute DDL scripts manually when these DAGs are available.
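
The run order for the three setup DAGs can be captured in a small helper so that a procedure or script never triggers them out of sequence. This is an illustrative sketch: the DAG names come from this runbook, but the scenario labels (`initial_setup`, `rebuild`, `bau`) and the function itself are hypothetical.

```python
# DAG names as listed in this runbook, in the required order for setup or rebuild.
SETUP_SEQUENCE = [
    "airflow_variable_sync",
    "ddl_execution_dag_multidb",
    "etl_date_control_update_dag",
]


def dags_to_run(scenario):
    """Return the DAG names to run, in order, for a given scenario.

    Scenario labels are assumptions made for this sketch:
    "initial_setup" and "rebuild" run all three DAGs in sequence;
    "bau" runs only the date control update that precedes every load.
    """
    if scenario in ("initial_setup", "rebuild"):
        return list(SETUP_SEQUENCE)
    if scenario == "bau":
        return ["etl_date_control_update_dag"]
    raise ValueError(f"Unknown scenario: {scenario!r}")
```

Assuming an Airflow 2.x command line is available, each returned name could be passed to `airflow dags trigger`, waiting for the run to finish successfully before starting the next one.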