Index Load

Index Load is an indexing service that uses the Data Load framework to load data in parallel into one or more search extension indexes.

Index Load is used to populate contract prices when performance reasons require that your site use a separate Price extension index. For example, use Index Load with a Price extension index if your site contains more than 1000 contracts, or if you use an external source to populate prices.

Index Load provides the following benefits over populating the catalog entry index with price data:
  • It improves indexing performance by using local binding (embedded mode) on the search server to avoid making remote HTTP calls that use HTTPClient.
  • Its multithreaded parallel indexing technique makes sharding possible within and across multiple extension indexes.
  • It can display metrics through the Index Load status command while indexing, which helps you refine tuning parameters and improve throughput.

Index Load uses profiles to control the indexing behavior and characteristics for a search extension index. Index Load profiles are defined in the Index Load configuration file.

When you call Index Load, you can pass a profile name through a URL parameter named profile. The value of the profile parameter is used to resolve the name of the file to load from the predefined configuration directory. Both the file name pattern and the Index Load configuration directory are defined as servlet initialization parameters in the web.xml file of the Index Load servlet (SolrIndexLoadServlet).
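
As a rough illustration of this lookup, the following Java sketch resolves a profile name to a file inside a configuration directory. The class name ProfileResolver, the {0} placeholder syntax, and the sample values are assumptions made for this example only. The actual pattern format and initialization parameter names are whatever the servlet's web.xml defines. The check that restricts profile names to simple characters is also an assumption, added so that a profile value cannot point outside the configuration directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not a product class. */
final class ProfileResolver {
    private final Path configDir;         // assumed: value of a servlet init parameter
    private final String fileNamePattern; // assumed: pattern with a {0} placeholder

    ProfileResolver(String configDir, String fileNamePattern) {
        this.configDir = Paths.get(configDir);
        this.fileNamePattern = fileNamePattern;
    }

    /** Maps the value of the profile URL parameter to a configuration file path. */
    Path resolve(String profile) {
        // Accept only simple names so the result stays inside the configuration directory.
        if (profile == null || !profile.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid profile name: " + profile);
        }
        String fileName = fileNamePattern.replace("{0}", profile);
        return configDir.resolve(fileName);
    }
}
```

For example, with a hypothetical directory /opt/config and a hypothetical pattern indexload-{0}.xml, calling resolve("price") yields a path whose file name is indexload-price.xml.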

For more detailed information about how data flows through the multithreaded indexing application and which tuning parameters you can use, see Tuning Index Load.

The following diagram shows a high-level overview of Index Load.
Figure: Index Load overview

Index Load contains the following components:
Index Load Servlet (SolrIndexLoadServlet)
The Index Load interface. It accepts commands with input information such as profile, catalog, and store. The input information is used to look up the specified configuration files.
Loader Interface
Creates the loader units to run, based on the configured load items (loaditem). Only one loader exists, and it can use several load items. Each load item includes one reader and zero or more mediators.
Loader Item
The runnable unit for Index Load. You can pass multiple loader items in parallel, where every loader item is an independent load unit that is controlled by a single data loader.

Each loader item contains a data reader, which can read data in multiple threads, optional mediators, and a single data writer. The mediators form a chain, in which the output of one mediator is the input of the next. Multiple loader items can target the same core instance or different core instances.

Reader
Reads original physical data from data sources in parallel and passes it to the mediator. The SolrIndexLoadQueryReader is used by default to read data from relational databases as specified by the configuration files.
Mediator
The BusinessObjectMediator defines a common interface that takes the input from the reader and transforms it according to the conversion pattern that is specified in the configuration files. You can provide zero or more mediators, where the output of one mediator is the input for the next mediator. When all the mediators finish transforming the data, the physical data writer persists the physical objects into Solr by calling the SolrJ interface.
Batch Service
Adds Solr documents and commits them to the Solr server. Only one batch service serves each unique Solr core, with the ability to interact with multiple index writers. The batch service contains an internal queue for buffering unfinished documents from various writers. Once the input document is ready for indexing, it is dispatched to the Solr runtime service.

The batch service is used by default to populate the Price extension index when indexing contract prices by using Index Load.
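
The way these components fit together can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the product's implementation: IndexLoadSketch, LoadItem, and runAndFlush are hypothetical names, an in-memory list stands in for the reader, a plain queue stands in for the batch service's internal buffer for one core, and no Solr call is made. The sketch shows only the shape of the flow: load items run in parallel, each record passes through a mediator chain, and finished documents are buffered in a queue shared by all writers that target the same core.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.UnaryOperator;

/** Illustrative sketch only; every type here is hypothetical, not a product class. */
final class IndexLoadSketch {

    /** One load item: a reader (here, an in-memory list), a mediator chain, and a writer target. */
    static final class LoadItem implements Runnable {
        private final List<Map<String, Object>> rows;
        private final List<UnaryOperator<Map<String, Object>>> mediators;
        private final BlockingQueue<Map<String, Object>> batchQueue;

        LoadItem(List<Map<String, Object>> rows,
                 List<UnaryOperator<Map<String, Object>>> mediators,
                 BlockingQueue<Map<String, Object>> batchQueue) {
            this.rows = rows;
            this.mediators = mediators;
            this.batchQueue = batchQueue;
        }

        @Override
        public void run() {
            for (Map<String, Object> row : rows) {
                // Copy the row so that mediators never modify the source data.
                Map<String, Object> doc = new HashMap<>(row);
                // Each mediator's output becomes the next mediator's input.
                for (UnaryOperator<Map<String, Object>> mediator : mediators) {
                    doc = mediator.apply(doc);
                }
                // The writer hands the finished document to the queue for its core.
                batchQueue.add(doc);
            }
        }
    }

    /** Runs the load items in parallel, then drains the shared queue as one batch. */
    static List<Map<String, Object>> runAndFlush(List<LoadItem> items,
                                                 BlockingQueue<Map<String, Object>> batchQueue) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, items.size()));
        for (LoadItem item : items) {
            pool.execute(item);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Load items did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for load items", e);
        }
        // A real batch service would add these documents to Solr and commit them.
        List<Map<String, Object>> batch = new ArrayList<>();
        batchQueue.drainTo(batch);
        return batch;
    }
}
```

Two load items that share one queue model two writers that feed the same core. Giving each load item its own queue would model load items that target different cores. Because the load items run concurrently, the order of documents in the drained batch is not guaranteed.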

Limitations

Be aware of the following Index Load limitation:
  • Index Load supports only extension indexes. Index Load does not support the Product, Category, or Unstructured indexes.