WebSphere Commerce Search extension indexes
Extension indexes, set up as index subtypes in WebSphere Commerce Search, are used to keep data in a separate core for performance reasons.
The following extension indexes are available by default:
- Inventory
- The inventory index, a separate index that contains inventory data, is an extension of the product index. For accurate inventory status, you can refresh the inventory index more frequently than the product index.
- Price
- The price index is a separate index that contains price data. Prices are indexed by using Index Load, which can populate a large amount of data into a separate extension index faster than the Catalog Entry index can index price data.
You can use the default extension indexes, or set up your own extension index that best matches your site's requirements.
General considerations
The WebSphere Commerce Search query component extension tries to replicate most of the search features that Solr supports when you work with an extension index. However, because of the complexity of the logic that is involved at run time, some restrictions apply. The following list describes which features are generally supported for extension indexes:
Schema design:
- For the base index to be able to reference an extension index, the extension index schema must define a field that acts like a foreign key. This referenced field must match the name and type of the unique key field in the base index schema. Its data type must be a simple data type such as string, integer, or float.
- Avoid common field names between extension indexes and the base index, other than the referenced field. It is recommended to use a naming convention that prefixes the extension index fields to avoid naming collisions.
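The naming rule above can be checked mechanically. The following is a minimal sketch, assuming you have the field names of both schemas available as plain lists; the function name and all field names are hypothetical and are not part of WebSphere Commerce Search.

```python
def find_field_collisions(base_fields, extension_fields, reference_field):
    """Return extension field names that also exist in the base index,
    ignoring the shared reference (foreign-key-like) field."""
    shared = set(base_fields) & set(extension_fields)
    shared.discard(reference_field)
    return sorted(shared)


# Hypothetical field names, for illustration only.
base = ["entry_id", "name", "price"]
extension = ["entry_id", "inv_quantity", "price"]

collisions = find_field_collisions(base, extension, "entry_id")
print(collisions)  # ['price']
```

In this sketch, renaming the extension field to something like inv_price would clear the collision, which is the effect the prefix convention is meant to have.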
Searching:
- The q parameter is the only mandatory query parameter. It must be a query that is specified in SolrQuerySyntax.
- Any index column, including index columns from an extension index, can be used in a query expression. However, there is a performance degradation when performing a cross-index query condition, which is typically not recommended.
Filtering:
- The fq parameter can be used to specify a query that can be used to restrict the super set of documents that can be returned, without influencing the relevancy score. It can be useful for speeding up complex queries, since the queries specified with fq are cached independently from the main query.
- This parameter can be specified multiple times in the same request. Documents are included in the result only if they are in the intersection of the document sets resulting from each fq.
- Filter queries can be complicated Boolean queries, but fields that are involved must belong to the same index. That is, cross core filter queries are not supported.
- The document sets from each filter query are cached independently.
- Special characters must be properly URL escaped, as with all parameters when expressed in a URL.
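As an illustration of the filtering rules above, the following sketch builds a query string that repeats fq and URL-escapes every value. It uses only the Python standard library. The field names and values are hypothetical, and each filter query deliberately refers to fields from a single index.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical field names and values, for illustration only.
params = [
    ("q", "name:shirt"),
    # Filter that uses only fields from one (extension) index.
    ("fq", 'inv_status:"in stock"'),
    # Separate filter that uses only fields from the base index.
    ("fq", "catalog_id:10001 AND published:1"),
]

# urlencode escapes special characters such as ':', '"', and spaces.
query_string = urlencode(params)
print(query_string)

# Decoding shows that both fq values survive the round trip unchanged.
decoded = parse_qs(query_string)
print(decoded["fq"])
```

Because the two filters are sent as separate fq values, each resulting document set can be cached on its own, and only documents in both sets are returned.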
Faceting:
- Faceting is done on indexed values, rather than stored values. This is because the primary use for faceting is to select a subset of hits that result from a query, so the chosen facet value is used to construct a filter query that literally matches the value in the index, while the stored value is for display purposes only.
- Many faceting parameters can be overridden on a per-field basis, by using the following syntax: f.fieldName.parameterName=parameterValue.
- Specific filters can be tagged or excluded when faceting. Typically tagging or exclusion are needed when multiple facets are selected. However, tagging and exclusion within the same query are restricted to fields from the same core. That is, tagging and exclusion within the same query that involves fields from more than a single core is not supported.
- Pivot facet, facet by date, and facet by range are not supported.
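The per-field override syntax can be sketched as follows. This is an illustrative example only: the helper function and the field names brand and category are hypothetical, and the global facet.limit value is chosen arbitrarily.

```python
from urllib.parse import urlencode


def facet_override(field_name, parameter_name, value):
    """Build one per-field override: f.fieldName.parameterName=parameterValue."""
    return (f"f.{field_name}.{parameter_name}", str(value))


# Hypothetical field names, for illustration only.
params = [
    ("q", "name:shirt"),
    ("facet", "true"),
    ("facet.field", "brand"),
    ("facet.field", "category"),
    ("facet.limit", "10"),                      # applies to all facet fields
    facet_override("brand", "facet.limit", 5),  # applies to brand only
]

facet_query = urlencode(params)
print(facet_query)
```

Here the category facet would use the general limit, while the brand facet would use its own override.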
Sorting:
- A sort order must include a field name, either from the base index or extension index, followed by white space, followed by a sort directional operator (ascending or descending).
- Only field names can be used. Function names and sorting by docId are not supported.
- Multiple sort orders can be separated by commas. When more than one sort criterion is provided, the second entry is used only if the first entry results in a tie. If there is a third entry, it is used only if the first and second entries are tied. This pattern continues with further entries.
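The sorting rules above can be sketched as a small helper that assembles a sort value and rejects anything that is not a plain field name. This is a sketch under stated assumptions: the directions are assumed to be written as asc and desc, as in standard Solr, and a field name is assumed to consist of letters, digits, and underscores. The function and field names are hypothetical.

```python
import re

# Simplifying assumption: a plain field name is letters, digits, and underscores.
FIELD_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_sort(criteria):
    """Join (field, direction) pairs into a 'field dir,field dir' string."""
    parts = []
    for field, direction in criteria:
        if not FIELD_NAME.match(field):
            # Rejects function calls such as sum(a,b), which are not supported.
            raise ValueError(f"not a plain field name: {field!r}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        parts.append(f"{field} {direction}")
    return ",".join(parts)


# Hypothetical fields: sort by price first; name breaks ties on price.
sort_value = build_sort([("price_value", "asc"), ("name", "desc")])
print(sort_value)  # price_value asc,name desc
```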
Grouping:
- Result grouping arranges documents with a common field value into groups, returning the top documents per group, and the top groups based on what documents are in the groups. Grouping can be performed only against index columns from the base index. That is, grouping by extension index fields is not supported.
Joining:
- Join operations can be used only against the base index. Joins with an extension index are not supported.
- Fields or other properties of the documents that are joined "from" are not available for use in processing of the resulting set of "to" documents. That is, you cannot return fields in the "from" documents as if they are a multivalued field on the "to" documents.
- The join query produces constant scores for all documents that match. The scores that are computed by the nested query for the "from" documents are not available to use in scoring the "to" documents.
Common query parameters
The following list describes the common query parameters and restrictions when specified against
the base index:
- start
- Paginates results from a query. When specified, it indicates the offset in the complete result set for the queries where the set of returned documents begins. The default value is 0.
- rows
- Paginates results from a query. It specifies the maximum number of documents from the complete result set to return to the client for every request. You can consider it as the maximum number of results that appear in the page.
- fl (fields)
- Specifies a set of fields to return, limiting the amount of information in the response. Any index column, including index columns from an extension index, can be declared in this parameter for returning the stored value of the corresponding index field. When no index field is provided, or a * is provided, all stored static index fields from the base (CatalogEntry) index are returned. Dynamic fields and other extension index fields must be explicitly declared to be returned.
  Note: Use of the fl parameter can result in considerable performance degradation.
- facet
- Enables or disables simple faceting. Setting this parameter to true enables facet counts in the query response. The default value is false, which disables faceting.
- facet.query
- Specifies an arbitrary query in the Lucene default syntax to generate a facet count. By default, faceting returns a count of the unique terms for a field, while facet.query determines counts for arbitrary terms or expressions. This parameter can be specified multiple times to indicate that multiple queries are used as separate facet constraints.
- facet.field
- Specifies a field that is treated as a facet. This parameter can be specified multiple times to indicate multiple facet fields.
- facet.prefix
- Limits the terms on which to facet, starting with the specified string prefix. Unlike fq, it does not change the search results; it merely reduces the facet values returned to those beginning with the specified prefix. This parameter can be specified only on a per-field basis, and has no additional effect on other faceting fields.
- facet.sort
- Determines the ordering of the facet field constraints. The following values can be used:
- count: sort the constraints by count (highest count first).
- index: return the constraints that are sorted in their index order (lexicographic by indexed term).
- facet.limit
- Indicates the maximum number of constraint counts to be returned for the facet fields. A negative value denotes unlimited. The default value is 100. This parameter can be specified on a per-field basis to indicate a separate limit for certain fields.
- facet.offset
- Indicates an offset into the list of constraints to allow paging. The default value is 0. This parameter can be specified only on a per-field basis and has no additional effect on other faceting fields.
- facet.mincount
- Indicates the minimum count for facet fields to be included in the response. The default value is 0. This parameter can be specified only on a per-field basis and has no additional effect on other faceting fields.
- qf (query fields)
- Provides a list of index fields and the boost factor to associate with each of them when building DisjunctionMaxQueries from the search request. The supported format is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree has a boost of 0.4. This indicates that matches in fieldOne are much more significant than matches in fieldTwo, which are more significant than matches in fieldThree.
- bq (boost query)
- Defines a raw query string (in the SolrQuerySyntax) that is included with the search query to influence the score. If this is a BooleanQuery with a default boost (1.0f), then the individual clauses are added directly to the main query. Otherwise, the query is included as-is. Any index column, including index columns from an extension index, can be used in a boost query expression. However, because boost queries are handled the same way as a normal query, the same restriction applies: there is a performance degradation when performing a cross-index query condition, which is typically not recommended. This parameter can be specified multiple times to indicate multiple boost queries.
- bf (additive boost function)
- Defines a function (with optional boosts) that can be included in the search query to influence its score. Any function that is natively supported by Solr can be used, along with a boost value. This parameter is equivalent to using the _val_:"...function..." syntax in a bq parameter. This parameter can be specified multiple times to indicate multiple additive boost functions.
- boost (multiplicative boost function)
- This parameter has the same syntax as bf, with the exception that the boost factor specified is multiplied into the score. This parameter can be specified multiple times to indicate multiple multiplicative boost functions.
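Several of the parameters above are usually combined in one request. The following is a minimal sketch of how start, rows, and qf values might be assembled. The helper functions are hypothetical, the page number is assumed to be 1-based, and the field names are the placeholders from the qf description.

```python
from urllib.parse import urlencode


def page_params(page_number, page_size):
    """Translate a 1-based page number into start and rows values."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    return {"start": (page_number - 1) * page_size, "rows": page_size}


def build_qf(boosts):
    """Format {field: boost or None} as 'fieldOne^2.3 fieldTwo fieldThree^0.4'.
    A boost of None means the field keeps the default boost."""
    return " ".join(
        field if boost is None else f"{field}^{boost}"
        for field, boost in boosts.items()
    )


qf_value = build_qf({"fieldOne": 2.3, "fieldTwo": None, "fieldThree": 0.4})
print(qf_value)  # fieldOne^2.3 fieldTwo fieldThree^0.4

# Third page with 12 results per page: documents 24 through 35.
params = {"q": "shirt", **page_params(3, 12), "qf": qf_value}
print(params["start"], params["rows"])  # 24 12
print(urlencode(params))
```

Computing start from the page number keeps the offset consistent with rows, so consecutive pages neither overlap nor skip documents.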
Performance considerations
Consider the following usage when an extension index such as Inventory exists in WebSphere Commerce Search:
- The filterCache and documentCache are required on the product index so that the query component functions correctly.
- You should typically disable all other internal Solr caches for the extension index in the search run time.