Verifying index integrity
You can use a specialized Ingest check pipeline to verify that the Elasticsearch index contains all the required data. This pipeline verifies that all incoming data is correctly inserted in the index. The pipeline compares the results of the Ingest process with the original input to ensure reliable data intake.
Two connectors are provided, auth.validate and live.validate, one for each environment. From Version 9.1.12 onwards, a second set of connectors, auth.validate.cas and live.validate.cas, is available for users of the Catalog Asset Store (CAS) indexing model. For more information about CAS indexing, see Choosing your index model. Each pipeline performs the following checks:
- Store index validation
- This stage of the pipeline verifies the existence of all documents. It counts the supported languages, catalogs and currencies in each document, and checks default field values against the database. This check is based on the Store ID and language ID.
- Product Index Validation
- This stage checks document existence by counting the number of documents for each product, item, bundle, variant and dynamic document type. It counts the number of attributes and checks the SEO URL for each catentry ID. If a URL is not present in the product index, it searches for it in the URL index and then the database. In addition, this stage checks whether products have items linked to them and verifies that the category, path and path_name sections exist in the index for each catentry.
- Category Index Validation
- Documents are checked for existence based on store, language, and catalog. This stage counts the number of parent categories that each category is linked to, the number of facets, and, for non-leaf objects, the number of children. It also verifies that a path section exists in each document.
- Attribute Index Validation
- This stage verifies the existence of all documents and counts the number of comparable, facet.zero, facet.search, merchandisable, searchable, ribbon, and swatchable attributes. It also counts attribute values for each attribute. These checks are based on the store and language.

Auto-Recovery for Misindexed Documents
When an index validation operation identifies a misindexed document, the new auto-recovery feature attempts to regenerate the indexed document automatically.
How Auto-Recovery Works
- During validation, if a document is found to be misindexed, the system triggers a regeneration of that document.
- Near Real Time (NRT) indexing is used for document recovery.
- System safeguards ensure that Elasticsearch is not overloaded during the recovery process.
- flow.index.recovery.threshold
- Defines the batch size for auto-recovery after index validation (default: 100). Setting this to 0 disables auto-recovery. The value controls the number of NRT operations per batch; a configurable delay is applied between batches (default delay: 1 minute). To adjust the delay, modify the Penalty Duration in the Regenerate Document processor in NiFi under Recover Failed Validation > Send Update Events.
- flow.index.recovery.maximum.retries
- Defines the maximum number of retry attempts for auto-recovery if failures persist (default: 3 retries). It is recommended to rerun the validation after each auto-recovery attempt.
- flow.validate.exemption
- Allows exemption of specific validation codes and document IDs. This is useful for known, acceptable validation violations. Regular expressions are supported for IDs. Example:
DI8026V: 1,2,3 ; DI8034V: [1-4][0-9]*7, [479][0-9]*
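To make the exemption format easier to reason about, the following Python sketch parses a value shaped like the example above and tests whether a given document ID would be exempt for a given validation code. It is illustrative only: the function names parse_exemptions and is_exempt are hypothetical, and the parsing rules (semicolons between codes, commas between ID patterns, each pattern matched against the whole ID) are assumptions inferred from the example, not a description of how the pipeline itself evaluates the setting.

```python
import re


def parse_exemptions(spec: str) -> dict[str, list[re.Pattern]]:
    """Parse 'CODE: id, id ; CODE: pattern, pattern' into compiled patterns.

    Assumption: ';' separates validation codes and ',' separates ID patterns,
    so a pattern that itself contains ',' or ';' is not supported here.
    """
    rules: dict[str, list[re.Pattern]] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        code, _, ids = entry.partition(":")
        patterns = [re.compile(p.strip()) for p in ids.split(",") if p.strip()]
        rules[code.strip()] = patterns
    return rules


def is_exempt(rules: dict[str, list[re.Pattern]], code: str, doc_id: str) -> bool:
    """Return True if doc_id matches any pattern listed for the code.

    Assumption: a pattern must match the entire ID (fullmatch), so the
    literal '1' exempts ID 1 but not ID 12.
    """
    return any(p.fullmatch(doc_id) for p in rules.get(code, []))


spec = "DI8026V: 1,2,3 ; DI8034V: [1-4][0-9]*7, [479][0-9]*"
rules = parse_exemptions(spec)
```

Under these assumptions, is_exempt(rules, "DI8034V", "127") is true because 127 matches [1-4][0-9]*7, while is_exempt(rules, "DI8034V", "5") is false because 5 matches neither pattern.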
API endpoints
To trigger an integrity check, use the following Ingest API. This API generates a runId, which is used in the subsequent API calls.
https://ingest-host:ingest-port/connectors/auth.validate/run?storeId=storeId
- To verify the run status of the previous call, use:
http://ingest-host:ingest-port/connectors/auth.validate/runs/runId/status
- To check whether the run produced any errors, use the following API call:
http://ingest-host:ingest-port/connectors/auth.validate/runs/<runId>
- To check the validation logs, use the following API call:
http://ingest-host:ingest-port/connectors/auth.validate/runs/runId?logSeverity=V
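As a convenience, the four endpoints above can be assembled from their parts. The Python sketch below only builds the URL strings; it does not send requests, because the HTTP method and authentication are not stated here. The helper names, the host ingest.example.com, and the port 30800 are hypothetical placeholders, and using one scheme for every URL is an assumption.

```python
from urllib.parse import quote, urlencode


def connector_base(host: str, port: int, connector: str = "auth.validate",
                   scheme: str = "https") -> str:
    """Base URL for a validation connector such as auth.validate or live.validate."""
    return f"{scheme}://{host}:{port}/connectors/{quote(connector)}"


def run_url(base: str, store_id: str) -> str:
    """URL that triggers the integrity check for a store; the call yields a runId."""
    return f"{base}/run?{urlencode({'storeId': store_id})}"


def status_url(base: str, run_id: str) -> str:
    """URL that reports the status of a previously triggered run."""
    return f"{base}/runs/{quote(run_id)}/status"


def run_details_url(base: str, run_id: str) -> str:
    """URL used to check whether the run produced any errors."""
    return f"{base}/runs/{quote(run_id)}"


def validation_log_url(base: str, run_id: str) -> str:
    """URL that returns the validation logs (severity V) for the run."""
    return f"{base}/runs/{quote(run_id)}?{urlencode({'logSeverity': 'V'})}"


# Hypothetical host, port, store ID, and run ID, for illustration only.
base = connector_base("ingest.example.com", 30800)
urls = [
    run_url(base, "1"),
    status_url(base, "i-123"),
    run_details_url(base, "i-123"),
    validation_log_url(base, "i-123"),
]
```

Passing connector="live.validate" (or one of the .cas connectors) to connector_base yields the equivalent URLs for the other connectors.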
Catalog Entry Validation
- Inventory Validation
- Price Validation
- Item-Product and Variant-Product Relationship Validation
- Inventory validation
- Inventory Validation checks inventory quantity for Product Variants and Items by comparing the inventories.total.quantity field in the Product Index with the same field in the Inventory Index. If the inventories.total.quantity values in the two indexes do not match, or if the field is missing from the product index, the validation status shows a message with Severity code V and Logging Code DI8031V, along with the IDs of the affected catalog entries.
- Price validation
- Price Validation checks for List and Offer prices for each catalog entry type in the product index. If either field is missing from the product index, the validation status shows a message with Severity code V and Logging Code DI8038V.
Price Validation also checks the contract prices under the field prices.<contract_Id>.<Currency> in the product index and compares them with the corresponding contract prices for each currency in the Price index. If the prices for any contract and currency do not match, the validation status shows a message with Severity code V and Logging Code DI8037V, along with the IDs of the affected catalog entries.
- Item-Product and Variant-Product relationship validation
- Validates relationships between Items, Variants, and their parent Products. It checks the relationship.product.* fields in the Product Index for catalog entries of type Variant and Item and compares them with the database. If any of those fields are missing, the validation status shows a message with Severity code V and Logging Codes DI8035V and DI8036V, along with the IDs of the affected catalog entries.