Introduction

CDM Philosophy for Retail Banking

At its core, retail banking is not a collection of disconnected systems, channels, or products. It is a structured relationship between a customer and the bank. The Canonical Data Model (CDM) is designed around this fundamental business reality.

Every retail banking interaction starts with a Party — an individual or organisation that wants to establish a relationship with the bank. The bank validates this relationship through onboarding, KYC, compliance, and risk checks. Once the relationship is established, the bank creates an Account, which acts as the formal agreement and operational container through which services are delivered to the customer. An Account by itself has no meaning unless it is associated with a Product — such as a savings account, current account, fixed deposit, credit card, or loan product — which defines the rules, capabilities, pricing, limits, and behaviour of the service being offered.

Once a valid relationship exists between Party, Account, and Product, the customer begins interacting with the bank. These interactions are captured as Transactions and Events. Transactions represent the operational activity happening on the account — deposits, withdrawals, transfers, payments, interest accruals, charges, repayments, and other financial or non-financial actions. Together, these concepts form the foundational lifecycle of retail banking operations.

The CDM standardises these concepts because they are universally present across banking processes, channels, and systems. Different applications may use different terminologies or structures, but the underlying business concepts remain the same. By defining canonical entities such as Party, Account, Product, and Transaction, the CDM creates a common language that enables interoperability, consistency, governance, analytics, and scalable system integration across the enterprise.

A Party represents who the bank is serving.
A Product represents what service the bank offers.
An Account represents the operational relationship between the bank and the customer.
A Transaction represents how the customer uses that service over time.

These concepts are the foundational building blocks upon which all higher-level banking capabilities — campaigns, offers, servicing, risk, analytics, compliance, and customer engagement — are built.
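
As an illustration only, the four core concepts and the links between them can be sketched as simple data structures. The class and attribute names below are hypothetical and far smaller than real CDM entities; the sketch only shows that an Account ties a Party to a Product, and that Transactions hang off the Account.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical, simplified shapes. Real CDM entities carry many more attributes.

@dataclass(frozen=True)
class Party:
    party_id: str
    party_type: str  # e.g. "INDIVIDUAL" or "ORGANISATION"

@dataclass(frozen=True)
class Product:
    product_id: str
    product_category: str  # e.g. "SAVINGS", "CREDIT_CARD", "LOAN"

@dataclass(frozen=True)
class Account:
    account_id: str
    party_id: str    # who the bank is serving
    product_id: str  # what service the bank offers

@dataclass(frozen=True)
class Transaction:
    transaction_id: str
    account_id: str  # the account the activity happened on
    txn_date: date
    amount: Decimal  # signed: positive for credits, negative for debits

def account_balance(account_id, transactions):
    """Sum the signed amounts of all transactions on one account."""
    return sum(
        (t.amount for t in transactions if t.account_id == account_id),
        Decimal("0"),
    )
```

Keeping Party and Product as separate entities that an Account merely references reflects the principle above: the same Party can hold several Accounts, and the same Product can back many Accounts.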

Important (CDM Design Principle): The CDM does not mirror any single source system. It models the business reality that exists independently of any application, platform, or technology stack. When mapping a source system into CDM, always ask which business concept a field represents, not which table it comes from.

What is the Master Interface Document (MID)?

The MID is an Excel workbook that acts as the formal bridge between a client's source system and the CDM's Landing Zone (LDZ) structures. Each tab in the MID corresponds to one CDM entity (e.g., PARTY_DEMOGRAPHIC, ACCOUNT_DTL, PRODUCT_DTL). For each entity, the document captures:

The target CDM field name, data type, and business description
The corresponding source table, source column, and source data type
Transformation logic, default values, format conversions, and DQ rules
PII classification, mandatory flags, and open questions

Once complete, the MID becomes the primary input to the metadata-driven ETL pipeline — code generation, data quality controls, and lineage are all derived from it.
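
To make the shape of a MID row concrete, the sketch below models one mapping line and a basic completeness check. The attribute names, the example source table, and the check itself are assumptions for illustration; the actual MID workbook columns and the pipeline's validation rules may differ.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MidRow:
    """One line of a MID tab (hypothetical attribute names)."""
    target_entity: str                 # e.g. "PARTY_DEMOGRAPHIC"
    target_field: str
    target_type: str
    source_table: Optional[str]
    source_column: Optional[str]
    transformation: Optional[str] = None
    default_value: Optional[str] = None
    is_pii: bool = False
    is_mandatory: bool = False
    open_question: Optional[str] = None

def mid_row_issues(row):
    """Return a list of reasons why this row is not ready for sign-off."""
    issues = []
    has_source = bool(row.source_table and row.source_column)
    if row.is_mandatory and not has_source and row.default_value is None:
        issues.append("mandatory field has neither a source mapping nor a default")
    if row.open_question:
        issues.append("open question unresolved: " + row.open_question)
    return issues
```

A row with an empty issue list is a candidate for sign-off; any non-empty list points at what still has to be agreed with the source system owner.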

What this guide covers

This guide walks through the full mapping lifecycle:

Collecting pre-requisite information about the source system
Understanding which CDM subject area to target
Applying mapping principles to avoid common mistakes
Completing the eight-step mapping method
Working through realistic examples from a retail banking source system
Validating and debugging the completed mapping
Handing off a signed-off MID ready for engineering

CDM Implementation Scenarios Overview

The approach to CDM onboarding differs significantly depending on whether the client is starting fresh or has an existing marketing platform already in production. The table below summarises the key differences across eight delivery dimensions. Detailed step-by-step approaches for each scenario follow.

Table 1. Implementation Scenarios: New Client vs Existing Client
DIMENSION	NEW CLIENT	EXISTING CLIENT
Starting Point	No existing CDM instance. Greenfield deployment from scratch across all layers.	Live CDM or legacy UDS/Campaign platform in production. Migration and coexistence required.
Discovery Scope	Source system inventory, business key identification, subject area selection, MID pre-reqs from scratch.	Deep dive into existing UDS, Contact History and Response History schemas. History volume scoping and migration complexity scoring added on top of standard discovery.
Data Mapping	Standard MID exercise: source tables mapped to CDM entities for the first time. One mapping cycle.	Bespoke migration mapping: legacy table structures, non-canonical codes and historical data shapes all need custom transformation rules in addition to standard MID.
Day 0 / Go-Live	Single go-live event. CDM is live with current data. No historical backfill required by default.	Day 0 = historical migration complete. Full history loaded into CDM before any new campaigns run on it. 360 Views validated before Day 1+ activation.
Campaign Continuity	No existing campaigns to preserve. New campaigns built on CDM constructs from Day 1.	Parallel run: existing campaigns continue on legacy platform uninterrupted. New campaigns built on CDM in parallel. No forced cutover until client is ready.
MaxAI Activation	MaxAI agents activated at Go-Live once 360 Views are validated. No prior history dependency.	MaxAI activated during Parallel Run phase using full migrated history. Campaign AI Replication Agent replicates best-performing legacy campaigns onto CDM segments.
360 Views	Customer, Campaign and Journey 360 built from new source mappings. Populated from first ETL run.	Customer 360 and Campaign 360 enriched with full historical depth from Contact History and Response History migration. Richer segmentation signal from Day 1+.
Post Go-Live	BAU CDM operations. Incremental source additions follow standard MID pattern. Faster subsequent onboarding.	Legacy platform deprecated for new work after cutover. CDM becomes sole data layer. MaxAI scaled to full capacity once parallel run is stable.

Part A — Greenfield Implementation: New Client

Scenario A applies when the client has no existing CDM deployment — a clean-slate, fresh installation.

Onboarding follows a linear five-step progression. The primary deliverable at each step gates entry into the next.

Step 1: Discovery

Collect source system inventory, identify all in-scope tables and business keys, confirm data volumes, and complete the pre-requisites checklist. Agree CDM subject areas in scope. Gate 1 sign-off required before proceeding.

Step 2: Design

Complete the Master Interface Document (MID) for all in-scope entities. Define transformation rules, code conversions, and PII classifications. Obtain three-way sign-off: business, architecture, and engineering. Gate 2 sign-off required before build begins.

Step 3: Engineering

Deploy LDZ, RDV, and BDV layers. Build ETL pipelines driven by the metadata model. Construct Customer 360, Campaign 360, and Journey 360 Views. Populate the Feature Store for MaxAI readiness.

Step 4: Testing

Execute UAT against acceptance criteria. Run reconciliation counts and validate 360 View outputs. Obtain UAT sign-off from the business stakeholder. Confirm all performance benchmarks are met.
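
A reconciliation count check can be as simple as comparing per-entity row counts between source and target. The sketch below assumes the counts have already been collected into dictionaries keyed by entity name; the function name and input shape are illustrative, not part of the CDM tooling.

```python
def reconcile_counts(source_counts, target_counts):
    """Return {entity: (source_count, target_count)} for every mismatch.

    An entity missing on either side is treated as a count of zero.
    """
    mismatches = {}
    for entity in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(entity, 0)
        tgt = target_counts.get(entity, 0)
        if src != tgt:
            mismatches[entity] = (src, tgt)
    return mismatches
```

An empty result means the counts reconcile. Any entry in the result names the entity to investigate before UAT sign-off.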

Step 5: Go-Live

Deploy to production. Activate MaxAI agents once 360 Views are validated. Hand over BAU runbook and monitoring setup. Begin hypercare period. Subsequent source additions reuse the existing MID pattern.

Reference: Common Reference Sections applicable to Part A (Greenfield): Sections 2–17: Interface Mapping Guide, Subject Area Guide, Mapping Principles, Worked Examples, DQ & PII Guidance, Debugging Steps, FAQ, Final Checklist.

Part B — Brownfield Implementation: Existing Client

Scenario B applies when the client has an existing live marketing platform (UDS / Campaign / legacy).

B.1 Step-by-Step Approach

For an existing client running a live campaign platform (UDS, Contact History, Response History), onboarding requires a migration-first approach before CDM can be used as the primary data source. The twelve-month programme is structured around a Day 0 milestone and a managed parallel run period.

Step 1: Assessment

Conduct a full deep-dive of the existing platform schema and data volumes. Score migration complexity per entity (simple / complex / bespoke). Scope the history depth required (e.g. 3–5 years of campaign history). Produce the migration complexity scorecard and confirm the CDM subject areas in scope.

Step 2: Migration Design

Build the bespoke migration mapping in the MID: source tables to CDM entities, legacy code conversions, historical data transformation rules. Design DDL and ETL blueprints for all layers. Obtain Gate 1 sign-off before any build activity begins.

Step 3: Day 0 Migration

Execute the full historical data load into the LDZ, RDV, and BDV layers. Load Contact History into the Campaign 360 contact entities and Response History into the Campaign 360 response entities, together with the Campaign system tables. Build and validate the Customer 360, Campaign 360, and Journey 360 Views. Obtain Day 0 acceptance sign-off confirming the migration is complete and validated.

Step 4: Parallel Run

Operate both platforms simultaneously. Existing campaigns continue uninterrupted on the legacy platform — zero disruption. New campaigns are built exclusively on CDM 360 Views. Activate the MaxAI Campaign AI Replication Agent to replicate best-performing legacy campaign logic onto CDM segments. Monitor delta sync from legacy to CDM for ongoing transaction and response data.

Step 5: MaxAI Activation (within Parallel Run)

With full migration history available, activate the Analytical and Campaign AI Replication Agents. Validate AI-driven segment quality against legacy campaign benchmarks. Confirm MaxAI is stable for four or more weeks before cutover is considered.

Step 6: Full Cutover

Migrate all remaining legacy campaigns to CDM constructs. Confirm no active jobs are still reading from the legacy primary schema. Archive the legacy platform as read-only for audit and compliance purposes. CDM becomes the single source of truth for all campaign operations. Hand over BAU runbook and complete MaxAI full-scale activation.

Important: Regardless of onboarding type, the Master Interface Document (MID) is the mandatory specification that drives all downstream engineering. No ETL build begins without a signed-off MID. The difference for an existing client is that the MID must also capture legacy code conversions, historical data shapes, and migration-specific transformation rules that a new client does not need.


Alternative Approach within Part B for clients who want Campaign AI capabilities without full CDM migration

B.2 Campaign AI Lightweight Approach - CH/RH Migration Only

Some existing clients do not want to undertake a full CDM migration. They want to continue operating their existing platform and campaigns as-is, but unlock Campaign AI capabilities through MaxAI. In this scenario, a full CDM deployment is not required. Only the historical Contact History (CH) and Response History (RH) data needs to be migrated into CDM to build Campaign 360 and Flowchart 360. These two views are the sole data inputs required for the Campaign AI Replication Agent and MaxAI campaign analytics.


NOT Required in this approach	Required in this approach
✗ Full LDZ / RDV / BDV pipeline deployment	✓ CH (Contact History) full historical extract
✗ Party, Account, Product entity migration	✓ RH (Response History) full historical extract
✗ Full transaction history load	✓ Campaign 360 pipeline activation
✗ Customer 360 view	✓ Flowchart 360 pipeline activation
✗ Source-to-CDM MID for all entities	✓ MaxAI Campaign AI Replication Agent config
✗ UDS deprecation or platform cutover	✓ CH/RH → Campaign 360 MID mapping only

What this unlocks: Once Campaign 360 and Flowchart 360 are live, the Campaign AI Replication Agent can analyse historical campaign performance across all contacts and responses, identify best-performing campaigns, and replicate that logic onto new CDM-native campaign constructs. MaxAI analytics dashboards activate on the migrated history, giving the client AI-driven campaign intelligence without any disruption to their existing platform or live campaigns.

Step-by-Step Approach

Step 1: CH/RH Discovery & Scoping

Assess the full Contact History and Response History data: volume, history depth required, data quality, and extraction method. Identify the campaign attributes needed for AI scoring: contact date, channel, campaign code, offer code, response type, response date, and revenue. Agree the history depth and scope with the business stakeholder before extraction begins. No full source system inventory is needed; scope is limited to CH and RH only.

Step 2: Campaign 360 Mapping

Complete a targeted MID covering CH and RH entities only: source columns to Campaign 360 contact and response entities. Define transformation rules for campaign codes, channel codes, response type codes, and date formats. This is a significantly reduced mapping effort compared to a full CDM MID. Obtain business and architecture sign-off before build begins.
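
As a sketch of what such transformation rules might look like once implemented, the example below converts a legacy channel code and normalises a date. The code values, the source date format, and the function names are all hypothetical; the real values come from the signed-off MID.

```python
from datetime import datetime

# Hypothetical legacy-to-canonical channel codes. Real values come from the MID.
CHANNEL_CODE_MAP = {"EM": "EMAIL", "SM": "SMS", "DM": "DIRECT_MAIL"}

def convert_channel(legacy_code):
    """Map a legacy channel code to its canonical value, failing loudly if unmapped."""
    code = legacy_code.strip().upper()
    if code not in CHANNEL_CODE_MAP:
        raise ValueError(f"unmapped channel code: {legacy_code!r}")
    return CHANNEL_CODE_MAP[code]

def normalise_date(raw, source_format="%d/%m/%Y"):
    """Parse a source date string and return it in ISO 8601 (YYYY-MM-DD) form."""
    return datetime.strptime(raw.strip(), source_format).date().isoformat()
```

Raising on an unmapped code, rather than passing it through, surfaces gaps in the code conversion table during testing instead of in the loaded history.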

Step 3: CH/RH Historical Load

Extract the full historical Contact History and Response History from the source platform. Load into the Campaign 360 contact and response entities. Run row count reconciliation and validate that campaign-to-contact and contact-to-response join integrity is maintained. Activate Flowchart 360 pipeline to build the flowchart performance view from the loaded data. Obtain Day 0 acceptance sign-off confirming history load is complete and validated.
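
One way to check join integrity after the load is to look for orphaned keys: child rows whose key has no matching parent row. The sketch below assumes rows are available as dictionaries; the function name and the key names used in the usage note are illustrative.

```python
def orphan_keys(child_rows, parent_rows, key):
    """Return the sorted, distinct key values present in child_rows but not in parent_rows."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows if row[key] not in parent_keys})
```

For example, calling it with contact rows as the child, campaign rows as the parent, and a campaign code as the key checks campaign-to-contact integrity; calling it with response rows and contact rows on a contact identifier checks contact-to-response integrity. An empty list in both cases means no orphans were found.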

Step 4: Campaign AI Activation

Activate the Campaign AI Replication Agent. The agent analyses Campaign 360 and Flowchart 360 to identify top-performing historical campaign patterns: best segments, best channels, best offer combinations, and optimal send timing. Configure MaxAI analytics dashboards on the loaded campaign history. Validate AI recommendations against known business benchmarks before full activation. The client's existing platform continues operating normally throughout, with no cutover, no disruption, and no dependency on full CDM.

Choosing Between B.1 (Full Migration) and B.2 (Campaign AI Only)


DIMENSION	B.1 — FULL MIGRATION	B.2 — CAMPAIGN AI ONLY
CDM layers deployed	LDZ + RDV + BDV + all 360 Views	Campaign 360 + Flowchart 360 only
Data migrated	Full UDS + CH + RH + all entities	CH + RH history only
Entities mapped (MID)	All in-scope CDM entities	Campaign contact + response only
Platform impact	Full cutover — UDS retired for new work	None — existing platform untouched
MaxAI capability	Full: all agents + all 360 Views	Campaign AI Replication Agent + campaign analytics
Best for	Client wants unified CDM across all products	Client wants fast AI on campaigns with minimal change

Important: The B.2 approach does not provide a full unified customer view, full transaction history, or the complete CDM data backbone. If the client later wants to expand to full CDM capabilities (Customer 360, Offer Agent, full segmentation), they will need to undertake a full B.1 migration at that point. The CH/RH migration done in B.2 is fully reusable and does not need to be repeated.

Reference: Common Reference Sections applicable to Part B (Brownfield / Existing Client): Interface Mapping Guide, Subject Area Guide, Mapping Principles, Worked Examples, DQ & PII Guidance, Debugging Steps, FAQ, Final Checklist. These sections appear at the end of this document.

Part C — CDM Upgrade: Previous Release to New Release

Scenario C applies when an existing CDM deployment is being upgraded to a new CDM release.

When upgrading CDM from one release to the next, the nature of the change determines the impact on existing pipelines, hash key computation, and historical data. This section covers the change category matrix and the full re-historization resolution guide that must be followed when a change affects the MD5/HASHKEY computation in the middle layer.

Data Model Upgrade — Change Category Matrix

Every change to the CDM data model is classified against five dimensions. The impact level (HIGH / MEDIUM / LOW) determines the required action. The most critical scenarios are those that alter the MD5 hash signature, as these affect all historical records in the middle layer.

Table 2. Data Model Upgrade — Change Category Matrix
Change Category	Layers Impacted	CDC / Hash Impact	History Impact	Action Required
New Pipeline Added	Raw → Mid → Target	No impact	None — net new	Deploy DDL across all 3 layers; configure source connector and mapping.
New Table Added	Raw → Mid → Target	New MD5 key defined	None — net new	Create table in all layers; define HASHKEY columns; execute initial full load.
New Column Added	Mid + Target	MD5 signature changes — history records affected	HIGH — full re-historization required	ALTER TABLE; recalculate HASHKEY; reload full history.
Column Modified (type / length)	Raw → Mid → Target	May alter MD5 output	MEDIUM — validate existing hash integrity	ALTER TABLE; test MD5 delta; partial or full reload if hash drift detected.
Column Deprecated	Raw → Mid → Target	MD5 footprint shrinks	MEDIUM — validate existing hash integrity	Deprecate in mapping; test MD5 delta; partial or full reload if hash drift detected.
Primary / Surrogate Key Change	All Layers	HASHKEY logic must be rebuilt	HIGH — all history records invalidated	Rebuild HASHKEY; truncate and full reload history; re-link all downstream entities.
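
For teams that want to apply the matrix programmatically, for example in a release checklist script, it can be encoded as a lookup table. The sketch below copies the layer and history impact values from Table 2; the structure and function name are illustrative assumptions, not part of the CDM tooling.

```python
from typing import NamedTuple

class ChangeImpact(NamedTuple):
    layers: str
    history_impact: str  # "NONE", "MEDIUM", or "HIGH", as listed in Table 2

# Values transcribed from Table 2. Keys match the Change Category column.
CHANGE_MATRIX = {
    "New Pipeline Added": ChangeImpact("Raw, Mid, Target", "NONE"),
    "New Table Added": ChangeImpact("Raw, Mid, Target", "NONE"),
    "New Column Added": ChangeImpact("Mid, Target", "HIGH"),
    "Column Modified (type / length)": ChangeImpact("Raw, Mid, Target", "MEDIUM"),
    "Column Deprecated": ChangeImpact("Raw, Mid, Target", "MEDIUM"),
    "Primary / Surrogate Key Change": ChangeImpact("All Layers", "HIGH"),
}

def needs_full_reload(change_category):
    """True when Table 2 rates the history impact of this change as HIGH."""
    return CHANGE_MATRIX[change_category].history_impact == "HIGH"
```

A MEDIUM rating does not map to a fixed answer: per the table, it calls for an MD5 delta test first, with a partial or full reload only if hash drift is detected.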

Resolution Guide — Full Re-historization

Important: The middle layer computes an MD5 hash across all tracked columns to detect changes (CDC). When a new column is added, ALL existing history records carry an outdated hash — causing the pipeline to misinterpret unchanged records as modified on first run, corrupting change history.
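
The effect can be reproduced with a small sketch. The concatenation scheme below (a "||" delimiter, NULLs rendered as empty strings) and the column names are assumptions for illustration; the actual middle-layer hash recipe may differ, but the behaviour it demonstrates is the same.

```python
import hashlib

def hashdiff(record, tracked_columns, delimiter="||"):
    """MD5 over the tracked columns, in order. NULLs become empty strings (assumption)."""
    parts = ["" if record.get(col) is None else str(record[col]) for col in tracked_columns]
    return hashlib.md5(delimiter.join(parts).encode("utf-8")).hexdigest()

# A history record written before the upgrade, hashed over the old column list.
stored = {"party_id": "P1", "city": "Leeds"}
stored_hash = hashdiff(stored, ["party_id", "city"])

# The same, unchanged record arriving after a hypothetical "segment" column is added.
incoming = {"party_id": "P1", "city": "Leeds", "segment": None}
incoming_hash = hashdiff(incoming, ["party_id", "city", "segment"])

# The business values are identical, yet the hashes differ, so CDC would see a change.
false_change_detected = stored_hash != incoming_hash
```

Because the hash input now includes an extra column position, every stored hash is stale even where no business value has changed. This is why the history must be rebuilt with the new column list rather than patched.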

Recommended Resolution: Full Re-historization (5 Steps):

1. Schema Change (All Layers): ALTER TABLE in Raw, Middle, and Target layers. Add new column as nullable (safe default). Update pipeline column mapping and configuration.
2. Stop Pipeline (Incremental Loads): Halt all incremental CDC jobs. Record last successful watermark / run ID. Ensure no partial loads are in-flight.
3. Full Source Reload (from Source): Extract complete dataset from source system. Include all historical periods (old + new version). Validate row counts before proceeding.
4. Re-historization: Truncate history / target table. Pass full dataset through pipeline end-to-end. MD5 HASHKEY / HASHDIFF recalculated with new column included.
5. Validate & Resume: Reconcile record counts vs source. Spot-check HASHKEY values for new column. Re-enable incremental CDC pipeline.

Important: Always apply this procedure when a new column is included in the MD5/HASHKEY computation. If the new column is metadata-only and explicitly excluded from hash computation, a forward-only load with NULL backfill may be acceptable — confirm with the Data Architect before proceeding.
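
The decision rule above can be expressed as a small helper. This is a sketch only: the function name and return labels are hypothetical, and the forward-only outcome still needs Data Architect confirmation as stated above.

```python
def load_strategy(new_column, hash_columns):
    """Pick the load approach for a newly added column.

    hash_columns is the set of columns included in the MD5/HASHKEY computation.
    """
    if new_column in hash_columns:
        return "FULL_REHISTORIZATION"
    # Metadata-only column excluded from the hash: forward-only load with NULL
    # backfill may be acceptable, subject to Data Architect confirmation.
    return "FORWARD_ONLY_PENDING_ARCHITECT_APPROVAL"
```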

Reference: Common Reference Sections applicable to Part C (CDM Upgrade): Sections 2–17: Interface Mapping Guide, Subject Area Guide, Mapping Principles, Worked Examples, DQ & PII Guidance, Debugging Steps, FAQ, Final Checklist.