Debugging and Validation Steps

After the first ETL run, use the following structured debugging checklist to validate that data has loaded correctly from source to LDZ. Work through each check in sequence.

Row Count Reconciliation

The most basic validation. For every source-to-target entity pair, compare row counts. Any discrepancy must be explained.

Check SQL Pattern Expected outcome
Source row count SELECT COUNT(*) FROM HB_CUST_MASTER Record as baseline. E.g., 4,200,000
Target row count SELECT COUNT(*) FROM ldz_party_demographic WHERE rec_src = 'HB_CRM' Should equal source count for 1:1 mappings
1:N target count SELECT COUNT(*) FROM ldz_party_identity WHERE rec_src = 'HB_CRM' Should be >= source count (multiple IDs per customer)
Rejected rows Check ETL audit log for rec_rej_cnt Rejected rows must be < 0.1% of total unless documented

Null Check on Mandatory Fields

Entity Mandatory columns to check Debug query pattern
ldz_party_demographic partyid, rec_src SELECT COUNT(*) FROM ldz_party_demographic WHERE partyid IS NULL OR rec_src IS NULL → Expected: 0
ldz_party_identity partyid, id_typ, id_val, rec_src SELECT COUNT(*) FROM ldz_party_identity WHERE id_val IS NULL → Expected: 0
ldz_account_dtl acct_num, acct_typ, rec_src SELECT COUNT(*) FROM ldz_account_dtl WHERE acct_num IS NULL → Expected: 0
ldz_account_party acct_num, acct_typ, partyid, role_typ, rec_src SELECT COUNT(*) FROM ldz_account_party WHERE partyid IS NULL → Expected: 0

Domain / Code Value Checks

Validate that all code conversions have been applied correctly. Any unexpected code value indicates either a new source code introduced after the mapping was completed, or a conversion logic error.

Entity / Column Validation query Expected outcome
party_demographic.gender SELECT DISTINCT gender FROM ldz_party_demographic Only: MALE, FEMALE, UNKNOWN
party_demographic.mrtl_sta SELECT DISTINCT mrtl_sta FROM ldz_party_demographic Only: SINGLE, MARRIED, DIVORCED, WIDOWED, NULL
account_dtl.acct_typ SELECT DISTINCT acct_typ FROM ldz_account_dtl Only: SAVINGS, CURRENT, FIXED_DEPOSIT, RECURRING
account_dtl.acct_sta SELECT DISTINCT acct_sta FROM ldz_account_dtl Only: ACTIVE, DORMANT, CLOSED
account_party.role_typ SELECT DISTINCT role_typ FROM ldz_account_party Only: PRIMARY, JOINT, GUARANTOR, NOMINEE

Referential Integrity Checks

Every FK relationship defined in the mapping must hold in the loaded data. Orphan records indicate load sequencing issues or missing data.

Relationship Debug query Expected outcome
account_party → party_demographic SELECT COUNT(*) FROM ldz_account_party ap LEFT JOIN ldz_party_demographic pd ON ap.partyid = pd.partyid AND ap.rec_src = pd.rec_src WHERE pd.partyid IS NULL 0 orphan account_party rows
account_party → account_dtl SELECT COUNT(*) FROM ldz_account_party ap LEFT JOIN ldz_account_dtl ad ON ap.acct_num = ad.acct_num AND ap.acct_typ = ad.acct_typ WHERE ad.acct_num IS NULL 0 orphan account_party rows
party_identity → party_demographic SELECT COUNT(*) FROM ldz_party_identity pi LEFT JOIN ldz_party_demographic pd ON pi.partyid = pd.partyid WHERE pd.partyid IS NULL 0 orphan identity rows

Date Sanity Checks

Check Debug query
Future date of birth SELECT COUNT(*) FROM ldz_party_demographic WHERE dob > SYSDATE → Expected: 0
DOB before 1900 SELECT COUNT(*) FROM ldz_party_demographic WHERE dob < DATE '1900-01-01' → Expected: 0
Account closed before it opened SELECT COUNT(*) FROM ldz_account_dtl WHERE close_dt < open_dt → Expected: 0
Transaction date in future SELECT COUNT(*) FROM ldz_acct_party_trans WHERE txn_dt > SYSDATE → Expected: 0
Passport expired at load time (if present) SELECT COUNT(*) FROM ldz_party_identity WHERE id_typ = 'PASSPORT' AND id_exp_dt < SYSDATE → Document count, raise alert

PII Masking Verification

Verify that PII fields have been correctly masked or tokenised before the data lands in the LDZ:

PII Field Verification query Expected result
Aadhaar in party_identity SELECT id_val FROM ldz_party_identity WHERE id_typ = 'AADHAAR' FETCH FIRST 5 ROWS ONLY First 8 digits masked: xxxxxxxx1234
Email in party_contact SELECT cntct_val FROM ldz_party_contact WHERE cntct_typ = 'EMAIL' FETCH FIRST 5 ROWS ONLY Lowercase normalised; no leading/trailing spaces
Phone in party_contact SELECT cntct_val FROM ldz_party_contact WHERE cntct_typ IN ('MOBILE','HOME_PHONE') E.164 format: +91XXXXXXXXXX