WebSphere Commerce search extension indexes

Extension indexes, set up as index subtypes in WebSphere Commerce search, keep data in a separate core for performance reasons.

The following extension indexes are available by default:

Inventory: The inventory index is a separate index that contains inventory data and is an extension of the product index. For accurate inventory status, you can refresh the inventory index more frequently than the product index.

Price: The price index is a separate index that contains price data. Prices are indexed using Index Load, because it can populate a large amount of data into a separate extension index faster than the Catalog Entry index can index price data.

You can use the default extension indexes, or set up your own extension index that best matches your site's requirements.

General considerations

The WebSphere Commerce Search query component extension tries to replicate most of the search features that Solr supports when working with an extension index. However, due to the complexity of the logic involved at runtime, not every feature is available. The following list describes the features that are generally supported for extension indexes:

Schema design:

For the base index to be able to reference an extension index, the extension index schema must define a field that acts like a foreign key: it must match the name and type of the unique key field in the base index schema. The referenced field must use a simple data type such as String, Integer, or Float.
Avoid common field names between extension indexes and the base index, other than the referenced field. Use a naming convention that prefixes the extension index fields to avoid naming collisions.
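
The naming guidance above can be checked mechanically. The following is a minimal sketch, not part of the product: the function name and all field names are hypothetical, and it assumes that you have already extracted the field names from both schemas. It reports any field name that the base index and an extension index share, other than the referenced key.

```python
def find_field_collisions(base_fields, extension_fields, reference_field):
    """Return field names present in both schemas, excluding the referenced key."""
    shared = set(base_fields) & set(extension_fields)
    shared.discard(reference_field)
    return sorted(shared)


# Hypothetical field names, for illustration only.
base = {"catentry_id", "name", "price"}
extension = {"catentry_id", "inv_quantity", "price"}

# "price" is reported because it exists in both schemas without a prefix.
print(find_field_collisions(base, extension, "catentry_id"))
```

Renaming the extension field with a prefix (for example, a hypothetical inv_price) would make the reported list empty.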

Searching:

The q parameter is the only mandatory query parameter. It must be a query specified in SolrQuerySyntax. For more information, see https://wiki.apache.org/solr/SolrQuerySyntax
Any index column, including index columns from an extension index, can be used in a query expression. However, there will be a performance degradation when performing a cross-index query condition, which is typically not recommended.

Filtering:

The fq parameter specifies a query that restricts the superset of documents that can be returned, without influencing the relevancy score. It can be very useful for speeding up complex queries, because the queries specified with fq are cached independently from the main query.
This parameter can be specified multiple times in the same request. Documents will only be included in the result if they are in the intersection of the document sets resulting from each fq.
Filter queries can be complicated Boolean queries, but the fields involved must belong to the same index. That is, cross-core filter queries are not supported.
The document sets from each filter query are cached independently.
Special characters must be properly URL escaped, as with all parameters when expressed in a URL.
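
As an illustration of the searching and filtering rules above, the following minimal sketch builds a URL-escaped query string with one q parameter and several fq parameters. The helper name and the field names are hypothetical, and the sketch produces only the query string, not a complete request URL for any particular server.

```python
from urllib.parse import urlencode, parse_qs


def build_query_string(q, filter_queries):
    """Build a URL-escaped query string with one q and any number of fq values."""
    params = [("q", q)]
    # fq can be repeated; results are the intersection of all filter queries.
    params.extend(("fq", fq) for fq in filter_queries)
    return urlencode(params)


# Hypothetical field names. Each fq references fields from a single index only.
qs = build_query_string("name:shirt", ["catalog_id:10001", 'brand:"Acme Co"'])
print(qs)

# Decoding the string recovers the original, unescaped filter queries.
print(parse_qs(qs)["fq"])
```

Passing the parameters as a list of pairs, rather than a dictionary, is what allows the same parameter name to appear more than once.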

Faceting:

Faceting is done on indexed values, rather than stored values. This is because the primary use for faceting is to select a subset of hits resulting from a query, so the chosen facet value is used to construct a filter query which literally matches that value in the index, while the stored value is for the purpose of display only.
There are many faceting parameters that can be overridden on a per-field basis, using the following syntax: f.fieldName.parameterName=parameterValue.
Specific filters can be tagged or excluded when faceting. This is typically needed when selecting multiple facets.
Pivot facet, facet by date, and facet by range are not supported.
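
The per-field override syntax and filter tagging can be sketched as follows. The helper name, the field name brand, and the tag name brandTag are hypothetical. The {!tag=...} and {!ex=...} local parameters are standard Solr syntax; whether your search configuration passes them through unchanged is an assumption to verify.

```python
def per_field_param(field, name, value):
    """Build a per-field override using the f.fieldName.parameterName syntax."""
    return (f"f.{field}.{name}", str(value))


# Hypothetical field and tag names, for illustration only.
params = [
    ("q", "*:*"),
    ("facet", "true"),
    # Tag the filter so that it can be excluded when counting the brand facet.
    ("fq", "{!tag=brandTag}brand:Acme"),
    ("facet.field", "{!ex=brandTag}brand"),
    per_field_param("brand", "facet.limit", 20),
    per_field_param("brand", "facet.mincount", 1),
]

for key, value in params:
    print(f"{key}={value}")
```

Excluding the tagged filter keeps the other brand values visible in the facet after a shopper selects one brand, which is the multiple-selection case described above.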

Sorting:

A sort ordering must include a field name, either from the base index or extension index, followed by whitespace, followed by a sort directional operator (ascending or descending).
Only field names can be used. Function names and sorting by docId are not supported.
Multiple sort operators can be separated by a comma. When more than one sort criterion is provided, the second entry is used only if the first entry results in a tie. If there is a third entry, it is used only if the first and second entries are tied. This pattern continues with further entries.
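
A small helper can enforce these sorting rules before a request is sent. This is a sketch under stated assumptions: the function name and field names are hypothetical, the direction keywords asc and desc are the ones Solr uses, and the check for _docid_ reflects the Solr name for sorting by internal document ID.

```python
import re

# A plain field name: letters, digits, and underscores, not starting with a digit.
FIELD_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_sort(criteria):
    """Join (field, direction) pairs into a comma-separated sort parameter."""
    clauses = []
    for field, direction in criteria:
        # Rejects function expressions such as sum(a,b) and the _docid_ pseudo-field.
        if not FIELD_NAME.match(field) or field == "_docid_":
            raise ValueError(f"only plain field names can be sorted on: {field!r}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be asc or desc: {direction!r}")
        clauses.append(f"{field} {direction}")
    return ",".join(clauses)


# Hypothetical fields: name is consulted only when two documents tie on price.
print(build_sort([("price", "desc"), ("name", "asc")]))  # price desc,name asc
```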

Grouping:

Result grouping arranges documents with a common field value into groups, returning the top documents per group, and the top groups based on what documents are in the groups. Grouping can only be performed against index columns from the base index. That is, grouping by extension index fields is not supported.

Joining:

Join operations can be used only against the base index. Joins with an extension index are not supported.
Fields or other properties of the "from" documents (the documents being joined from) are not available for use in processing the resulting set of "to" documents. That is, you cannot return fields in the "from" documents as if they were a multivalued field on the "to" documents.
The Join query produces constant scores for all documents that match. The scores that are computed by the nested query for the from documents are not available to use in scoring the to documents.
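
For reference, a join in standard Solr query syntax is written with the {!join} local parameters. The sketch below only assembles that string. The helper name and the field names are hypothetical, and both fields are assumed to belong to the base index, because joins with an extension index are not supported.

```python
def join_query(from_field, to_field, nested_query):
    """Assemble a Solr join query: match nested_query, then map from_field to to_field."""
    return "{!join from=" + from_field + " to=" + to_field + "}" + nested_query


# Hypothetical base index fields, for illustration only.
print(join_query("parent_catentry_id", "catentry_id", "color:red"))
```

Only the "to" documents are returned, each with a constant score, so any ranking has to come from other clauses in the request.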

Common query parameters

The following list describes the common query parameters and restrictions when specified against the base index:

start

Paginates results from a query. When specified, it indicates the offset in the complete result set at which the set of returned documents should begin. The default value is 0.

rows

Paginates results from a query. It specifies the maximum number of documents from the complete result set to return to the client for every request. You can consider it as the maximum number of results that appear in the page.
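
The relationship between start and rows can be sketched with simple arithmetic. The function name and the 1-based page numbering are assumptions made for this illustration.

```python
def page_params(page_number, page_size):
    """Translate a 1-based page number into start and rows values."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be at least 1")
    return {"start": (page_number - 1) * page_size, "rows": page_size}


# The third page of 12 results begins at offset 24.
print(page_params(3, 12))  # {'start': 24, 'rows': 12}
```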

fl (fields)

Specifies a set of fields to return, limiting the amount of information in the response. Any index column, including index columns from an extension index, can be declared in this parameter for returning the stored value of the corresponding index field. When no index field, or a * is provided, all stored static index fields from the base (CatalogEntry) index will be returned. Dynamic fields and other extension index fields must be explicitly declared to be returned.

Note: Use of the fl parameter can result in considerable performance degradation.

facet

Enables or disables faceting. Setting this parameter to true enables facet counts in the query response. The default value is false, which disables faceting.

facet.query

Specifies an arbitrary query in the Lucene default syntax to generate a facet count. By default, faceting returns a count of the unique terms for a field, while facet.query allows you to determine counts for arbitrary terms or expressions. This parameter can be specified multiple times to indicate that multiple queries should be used as separate facet constraints.

facet.field

Specifies a field which should be treated as a facet. This parameter can be specified multiple times to indicate multiple facet fields.

facet.prefix

Limits the terms on which to facet to those starting with the given string prefix. Unlike fq, this does not change the search results; it merely reduces the facet values returned to those beginning with the specified prefix. This parameter can be specified only on a per-field basis, and has no effect on other faceting fields.

facet.sort

Determines the ordering of the facet field constraints. There are two values that can be used:

count: sort the constraints by count (highest count first).
index: return the constraints sorted in their index order (lexicographic by indexed term).

For terms in the ASCII range, index order is alphabetical. The default sorting is by count when facet.limit is greater than 0. Otherwise, the default is index. This parameter can be specified only on a per-field basis and has no effect on other faceting fields.
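
The two orderings and the default rule can be sketched as follows. The function names and sample terms are hypothetical. The sketch assumes that ties in count are broken by index order, and it uses Python string comparison (by code point) as a stand-in for the order of indexed terms.

```python
def default_facet_sort(facet_limit):
    """Default ordering: count when facet.limit is greater than 0, otherwise index."""
    return "count" if facet_limit > 0 else "index"


def sort_constraints(counts, facet_sort):
    """Order (term, count) pairs the way the chosen facet.sort value describes."""
    if facet_sort == "count":
        # Highest count first; ties fall back to index order (an assumption).
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if facet_sort == "index":
        return sorted(counts.items())
    raise ValueError(f"unsupported facet.sort value: {facet_sort!r}")


# Hypothetical facet terms and counts.
counts = {"red": 5, "blue": 9, "Green": 5}
print(sort_constraints(counts, "count"))
print(sort_constraints(counts, "index"))
```

In this stand-in ordering, an uppercase term such as Green sorts before lowercase terms, which is worth keeping in mind if indexed values are not normalized to a single case.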

facet.limit

Indicates the maximum number of constraint counts that should be returned for the facet fields. A negative value denotes unlimited. The default value is 100. This parameter can be specified on a per-field basis to indicate a separate limit for certain fields.

facet.offset

Indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified only on a per-field basis and has no effect on other faceting fields.

facet.mincount

Indicates the minimum count that a facet constraint must have to be included in the response. The default value is 0. This parameter can be specified only on a per-field basis and has no effect on other faceting fields.

qf (query fields)

Provides a list of index fields and the boost factor to associate with each of them when building DisjunctionMaxQueries from the search request. The supported format is:

fieldOne^2.3 fieldTwo fieldThree^0.4

This format indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree has a boost of 0.4. As a result, matches in fieldOne are much more significant than matches in fieldTwo, which are more significant than matches in fieldThree.
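
To make the qf format concrete, the following minimal sketch parses a qf value into a mapping of field names to boost factors. The function name is hypothetical, and the default boost of 1.0 is an assumption that matches the Solr default.

```python
def parse_qf(qf, default_boost=1.0):
    """Parse a whitespace-separated qf value into {field: boost}."""
    boosts = {}
    for token in qf.split():
        field, separator, boost = token.partition("^")
        # A field without ^ receives the default boost.
        boosts[field] = float(boost) if separator else default_boost
    return boosts


print(parse_qf("fieldOne^2.3 fieldTwo fieldThree^0.4"))
# {'fieldOne': 2.3, 'fieldTwo': 1.0, 'fieldThree': 0.4}
```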

bq (boost query)

Defines a raw query string (in the SolrQuerySyntax) that is included with the search query to influence the score. If this is a BooleanQuery with a default boost (1.0f), the individual clauses are added directly to the main query. Otherwise, the query is included as-is. Any index column, including index columns from an extension index, can be used in a boost query expression. However, because boost queries are handled the same way as a normal query, the same restriction applies: a cross-index query condition degrades performance and is typically not recommended. This parameter can be specified multiple times to indicate multiple boost queries.

bf (additive boost function)

Defines a function (with optional boosts) that can be included in the search query to influence its score. Any function that is natively supported by Solr can be used, along with a boost value. This parameter is equivalent to using the _val_:"...function..." syntax in a bq parameter. This parameter can be specified multiple times to indicate multiple additive boost functions.

boost (multiplicative boost function)

This parameter has the same syntax as bf, with the exception that the boost factor specified is multiplied into the score. This parameter can be specified multiple times to indicate multiple multiplicative boost functions.
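
The difference between an additive and a multiplicative boost can be sketched with simplified arithmetic. This is an illustration only: the function names and values are hypothetical, and the sketch ignores normalization and the other factors that Solr applies when it computes a score.

```python
def additive_boost(base_score, function_value):
    """bf style: the function value is added to the relevancy score."""
    return base_score + function_value


def multiplicative_boost(base_score, function_value):
    """boost style: the function value is multiplied into the relevancy score."""
    return base_score * function_value


# Hypothetical scores for a strong and a weak text match, with the same function value.
for base_score in (2.0, 0.5):
    print(base_score,
          additive_boost(base_score, 1.5),
          multiplicative_boost(base_score, 1.5))
```

In this simplified model, an additive boost lifts weak and strong matches by the same amount, whereas a multiplicative boost scales each match in proportion to its original score.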

Performance considerations

Consider the following cache usage when an extension index such as Inventory exists in WebSphere Commerce search:

The filterCache and documentCache must be enabled on the product index so that the query component functions correctly.
Typically, disable all other internal Solr caches for the extension index in the search runtime.