HCL Commerce Version 9.1.10.0 or later

Verifying index integrity

You can use a specialized Ingest check pipeline to verify that the Elasticsearch index contains all the required data. This pipeline verifies that all incoming data is correctly inserted in the index. The pipeline compares the results of the Ingest process with the original input to ensure reliable data intake.

The integrity check consists of two parts. In HCL Commerce Search prior to Version 9.1.12, the check uses the pipelines auth.validate and live.validate, one for each environment. From Version 9.1.12 onwards, a second set of connectors, auth.validate.cas and live.validate.cas, is available for users of the Catalog Asset Store (CAS) indexing model. For more information about CAS indexing, see Choosing your index model. Each pipeline performs the following checks:
Store Index Validation
This stage of the pipeline verifies the existence of all documents. It counts the supported languages, catalogs and currencies in each document, and checks default field values against the database. This check is based on the Store ID and language ID.
Product Index Validation
This stage checks document existence by counting the number of documents for each product, item, bundle, variant and dynamic document type. It counts the number of attributes and checks the SEO URL for each catentry ID. If a URL is not present in the product index, it searches for it in the URL index and then the database. In addition, this stage checks whether products have items linked to them and verifies that the category, path and path_name sections exist in the index for each catalog entry.
Category Index Validation
Documents are checked for existence based on store, language, and catalog. This stage counts the number of parent categories that each category is linked to, as well as the number of facets and, for non-leaf categories, the number of children. It also verifies that a path section exists in each document.
Attribute Index Validation
This stage verifies the existence of all documents and counts the number of comparable, facet.zero, facet.search, merchandisable, searchable, ribbon, and swatchable attributes. It also counts attribute values for each attribute. These checks are based on the store and language.
HCL Commerce Version 9.1.18.0 or later

Auto-Recovery for Misindexed Documents

When an index validation operation identifies a misindexed document, the new auto-recovery feature attempts to regenerate the indexed document automatically.

How Auto-Recovery Works

  • During validation, if a document is found to be misindexed, the system triggers a regeneration of that document.

  • Near Real Time (NRT) indexing is used for document recovery.

  • System safeguards ensure that Elasticsearch is not overloaded during the recovery process.

flow.index.recovery.threshold

Defines the batch size for auto-recovery after index validation (default: 100), that is, the number of NRT operations sent per batch. Setting this to 0 disables auto-recovery. Batches are separated by a configurable delay (default: 1 minute). To adjust the delay, modify the Penalty Duration in the Regenerate Document processor in NiFi under Recover Failed Validation > Send Update Events.
flow.index.recovery.maximum.retries

Defines the maximum number of retry attempts for auto-recovery if failures persist (default: 3 retries). It is recommended to rerun the validation after each auto-recovery attempt.
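
The batching behavior controlled by flow.index.recovery.threshold can be pictured with a small Python sketch. This is an illustration only, not the NiFi implementation: the function name recovery_batches and the list of document IDs are hypothetical, and the delay between batches is only noted in a comment.

```python
from typing import Iterator, List, Sequence


def recovery_batches(doc_ids: Sequence[str], threshold: int = 100) -> Iterator[List[str]]:
    """Yield groups of at most `threshold` document IDs (hypothetical helper).

    A threshold of 0 yields nothing, mirroring the documented behavior
    that 0 disables auto-recovery.
    """
    if threshold <= 0:
        return
    for start in range(0, len(doc_ids), threshold):
        # In the real flow a delay (default: 1 minute) separates batches.
        yield list(doc_ids[start:start + threshold])


if __name__ == "__main__":
    misindexed = [str(i) for i in range(250)]  # made-up IDs
    for batch in recovery_batches(misindexed, threshold=100):
        print(len(batch), "documents in this batch")
```

With 250 misindexed documents and the default threshold, the sketch produces three batches, which is why rerunning the validation after each recovery attempt is useful for confirming that nothing was left behind.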

flow.validate.exemption

Allows exemption of specific validation codes and document IDs. Useful for known acceptable validation violations. Supports regular expressions for IDs. Example:

DI8026V: 1,2,3 ; DI8034V: [1-4][0-9]*7, [479][0-9]*
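
To make the exemption format easier to read, the following Python sketch parses a value of this shape and tests IDs against it. It rests on assumptions that this page does not confirm: that entries are separated by semicolons, that ID patterns are separated by commas, and that each pattern must match the whole ID. The helper names parse_exemptions and is_exempt are hypothetical.

```python
import re
from typing import Dict, List


def parse_exemptions(value: str) -> Dict[str, List["re.Pattern[str]"]]:
    """Split 'CODE: id, id ; CODE: id' into compiled patterns per code.

    Assumption: patterns never contain a comma or semicolon themselves.
    """
    rules: Dict[str, List["re.Pattern[str]"]] = {}
    for entry in value.split(";"):
        if ":" not in entry:
            continue
        code, ids = entry.split(":", 1)
        rules[code.strip()] = [
            re.compile(p.strip()) for p in ids.split(",") if p.strip()
        ]
    return rules


def is_exempt(rules: Dict[str, List["re.Pattern[str]"]], code: str, doc_id: str) -> bool:
    """Return True if doc_id fully matches any pattern listed for code."""
    return any(p.fullmatch(doc_id) for p in rules.get(code, []))


if __name__ == "__main__":
    rules = parse_exemptions("DI8026V: 1,2,3 ; DI8034V: [1-4][0-9]*7, [479][0-9]*")
    print(is_exempt(rules, "DI8026V", "2"))
    print(is_exempt(rules, "DI8034V", "127"))
    print(is_exempt(rules, "DI8034V", "5"))
```

Under these assumptions, the example exempts IDs 1, 2, and 3 from DI8026V, and exempts from DI8034V any ID that starts with 1 to 4 and ends in 7, or that starts with 4, 7, or 9.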

API endpoints

To trigger an integrity check, call the following Ingest API:

  • https://ingest-host:ingest-port/connectors/auth.validate/run?storeId=<storeId>
    This API generates a runId, which is used in the following APIs.
  • To verify the run status of the previous call, use:
    http://ingest-host:ingest-port/connectors/auth.validate/runs/<runId>/status
  • To check whether the run produced any errors, use the following API call:
    http://ingest-host:ingest-port/connectors/auth.validate/runs/<runId>
  • To check the validation logs, use the following API call:
    http://ingest-host:ingest-port/connectors/auth.validate/runs/<runId>?logSeverity=V
Note: The logSeverity parameter is case sensitive. To extract the logs for the validation pipeline, set its value to a capital V.
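
The four endpoints above differ only in their path suffix and query string, so a script can derive them from one base URL. The Python sketch below only builds the URLs; it does not send requests, because the HTTP method and authentication for each call are not stated here. The function name validation_urls, the port, and the sample store and run IDs are hypothetical.

```python
from typing import Dict
from urllib.parse import quote, urlencode


def validation_urls(base: str, connector: str, store_id: str, run_id: str) -> Dict[str, str]:
    """Build the integrity-check URLs for one connector and one run."""
    root = f"{base.rstrip('/')}/connectors/{quote(connector)}"
    run = f"{root}/runs/{quote(run_id)}"
    return {
        "trigger": f"{root}/run?{urlencode({'storeId': store_id})}",
        "status": f"{run}/status",
        "errors": run,
        # logSeverity is case sensitive: the value must be a capital V.
        "validation_logs": f"{run}?{urlencode({'logSeverity': 'V'})}",
    }


if __name__ == "__main__":
    # Host, port, store ID, and run ID are placeholders for illustration.
    urls = validation_urls("http://ingest-host:30800", "auth.validate", "1", "example-run-id")
    for name, url in urls.items():
        print(f"{name}: {url}")
```

Passing live.validate, auth.validate.cas, or live.validate.cas as the connector name yields the corresponding URLs for the other pipelines.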

HCL Commerce Version 9.1.18.0 or later

Catalog Entry Validation

Catalog entry validation helps ensure the accuracy, completeness, and consistency of product data across various indexes such as the Product Index, Inventory Index, and Price Index. This validation process identifies discrepancies and missing data, logs errors with severity codes, and provides detailed feedback for resolution. Validation covers the following areas:
  1. Inventory Validation
  2. Price Validation
  3. Item-Product and Variant-Product Relationship Validation
Inventory validation
Inventory Validation checks the inventory quantity for Product Variants and Items by comparing the inventories.total.quantity field in the Product Index with the same field in the Inventory Index. If the two values do not match, or if the field is missing from the product index, the validation status shows a message with Severity code V and Logging Code DI8031V along with the IDs of those catalog entries.
Price validation
Price Validation checks for List and Offer prices for each catalog entry type in the product index. If any of the fields is missing in the product index, the validation status will show a message with Severity code V and Logging Code DI8038V.

Price Validation also checks the contract prices under the field prices.<contract_Id>.<Currency> in the product index and compares them with the corresponding contract prices for each currency in the Price index. If the prices for any contract and currency do not match, the validation status shows a message with Severity code V and Logging Code DI8037V along with the IDs of those catalog entries.

Item-Product and Variant-Product relationship validation
This check validates the relationships between Items, Variants, and their parent Products. It looks for the relationship.product.* fields in the Product Index for catalog entries of type Variant and Item and compares them against the database. If any of those fields are missing, the validation status shows a message with Severity code V and Logging Code DI8035V or DI8036V along with the IDs of those catalog entries.
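
As an illustration of the inventory comparison described under Inventory validation, the Python sketch below flags catalog entries whose inventories.total.quantity value differs between two sets of documents or is missing on the product side. The document shapes (nested dictionaries keyed by catalog entry ID) and the function names are assumptions made for the example; they do not reflect how the pipeline reads the indexes.

```python
from typing import Any, Dict, List, Optional

FIELD = "inventories.total.quantity"


def get_path(doc: Dict[str, Any], dotted: str) -> Optional[Any]:
    """Follow a dotted field name through nested dictionaries."""
    node: Any = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def inventory_mismatches(
    product_docs: Dict[str, Dict[str, Any]],
    inventory_docs: Dict[str, Dict[str, Any]],
) -> List[str]:
    """Return IDs whose quantity is missing in the product doc or differs."""
    flagged = []
    for entry_id, product_doc in product_docs.items():
        product_qty = get_path(product_doc, FIELD)
        inventory_qty = get_path(inventory_docs.get(entry_id, {}), FIELD)
        if product_qty is None or product_qty != inventory_qty:
            flagged.append(entry_id)  # would be reported under DI8031V
    return flagged


if __name__ == "__main__":
    product = {
        "10001": {"inventories": {"total": {"quantity": 5}}},
        "10002": {"inventories": {"total": {"quantity": 3}}},
        "10003": {},
    }
    inventory = {
        "10001": {"inventories": {"total": {"quantity": 5}}},
        "10002": {"inventories": {"total": {"quantity": 4}}},
        "10003": {"inventories": {"total": {"quantity": 1}}},
    }
    print(inventory_mismatches(product, inventory))
```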