WebSphere Commerce Search performance tuning
There are several search performance tuning hints and tips to consider when administering WebSphere Commerce Search.
Indexing server
Consider the following factors when you tune the indexing server:
Search caching for the indexing server
Typically disable all Solr caches on the indexing server.
When to perform full search index builds
The WebSphere Commerce Search index is automatically built when certain business tasks are performed, as outlined in Common business tasks and their impact to the WebSphere Commerce Search index. In several cases, common business tasks result in delta index builds that do not pose a significant risk to production system performance. However, performing several delta index builds without occasional full index builds might result in the search index gradually degrading over time due to fragmentation. To avoid this issue, performing full search index builds when possible ensures that the search index performs well over time.
When Lucene receives a delete request, it does not delete entries from the index, but instead marks them for deletion and adds updated records to the end of the index. As a result, the catalog is spread unevenly across different segment data files in the search index, which might increase search response times. If you have a dedicated indexing server, consider scheduling a full search index build that runs in the background once per month, so that the deleted entries are flushed out and the data is optimized.
Tuning index buffer size and commit actions during data import (buildindex)
- Allocate more memory for index buffering by changing the value for the ramBufferSizeMB parameter. 2048 MB is the maximum memory that you can allocate:
<ramBufferSizeMB>2048</ramBufferSizeMB>
- Disable the document-based count buffer setting to reduce the occurrence of commit actions by commenting out the maxBufferedDocs parameter:
<!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
- Disable the server-side automatic commit trigger to further reduce the occurrence of commit actions by commenting out the autoCommit trigger:
<!-- <autoCommit> <maxDocs>10000</maxDocs> <maxTime>1000</maxTime> </autoCommit> -->
- The CatalogHierarchyDataPreProcessor can improve processing speed when the fetchSize parameter is specified.
In WebSphere Commerce Version 8.0.0.0, CatalogHierarchyDataPreProcessor is updated to improve performance. This preprocessor, which is enabled by default, is used to inject processed data into the TI_APGROUP temporary table. Populating TI_APGROUP becomes inefficient when there are many sales catalogs, because the preprocessor iterates an internal data structure and issues a query on each iteration. By specifying the fetchSize parameter, you can improve the processing speed of the preprocessor by up to 50%. The fetchSize option is a batch select process that uses a WHERE catentry_id in any(?,?…?) clause. The default fetchSize and batchSize of the preprocessor are each 500. The fetchSize cannot be larger than 32767 for Db2, or 1000 for Oracle.
For example:
<_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.CatalogHierarchyDataPreProcessor" masterCatalogId="10101" batchSize="500" fetchSize="1000"> ... </_config:data-processing-config>
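The effect of a batch select can be illustrated with a minimal Python sketch. It is not the preprocessor's implementation: the function names are hypothetical, and the generic IN clause stands in for whatever SQL the product actually issues. It only shows how a fetchSize within the documented limits turns one query per ID into one query per batch.

```python
from typing import Iterator, List, Sequence

# Upper bounds on fetchSize quoted in the text above.
MAX_FETCH_SIZE = {"db2": 32767, "oracle": 1000}


def check_fetch_size(fetch_size: int, database: str) -> int:
    """Return fetch_size if it is within the documented limit, else raise ValueError."""
    limit = MAX_FETCH_SIZE[database.lower()]
    if not 1 <= fetch_size <= limit:
        raise ValueError(f"fetchSize {fetch_size} is outside 1..{limit} for {database}")
    return fetch_size


def batches(ids: Sequence[int], fetch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most fetch_size long."""
    for start in range(0, len(ids), fetch_size):
        yield list(ids[start:start + fetch_size])


def placeholder_clause(batch_len: int) -> str:
    """Build a generic parameterized IN clause (illustrative SQL only)."""
    return "WHERE catentry_id IN (" + ", ".join(["?"] * batch_len) + ")"


# 2,500 hypothetical catalog entry IDs, fetched 1,000 at a time on Oracle.
catentry_ids = list(range(1, 2501))
size = check_fetch_size(1000, "oracle")
batch_sizes = [len(batch) for batch in batches(catentry_ids, size)]
```

With these inputs the sketch issues three batched selects instead of 2,500 single-row queries, which is the kind of reduction in round trips that the fetchSize parameter is meant to provide.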
Paging and database heap size configuration
- Increase the default paging size for your operating system. For example, 3 GB. In cases where the operating system requires a higher paging size, adding more memory to the system also helps to resolve issues.
- Increase the default database heap size to a larger value. For example, increase the DB2 heap size to 8192.
- Increase the file descriptor limit to a higher value. For example: ulimit -n 8192.
Heap size configuration
1280.
- Using large heap sizes in WebSphere Commerce Search, for example, more than 4 GB, requires a 64-bit installation of Apache Solr. That is, if you intend to increase the heap size to values greater than 1280, ensure that you install the 64-bit version of Apache Solr.
- Do not exceed 28 GB of heap size per JVM, even when you use a 64-bit environment. In a 64-bit JVM, the compressed references optimization might be disabled if the heap space exceeds 28 GB, which results in up to a 30% overall throughput degradation.
Adjusting heap space when search product display is enabled
- For Product Sequencing: allocate approximately 5 MB/category with product sequencing file.
- For Image Facet Override: allocate approximately 10 MB/category with image override file.
- For Sequencing and Image Override: Assuming a baseline of 100,000 products in the category, allocate approximately 15 MB/category with sequencing and image override file. If you are using manual sequencing with many categories, add 1.5 MB per category sequenced for each additional 100,000 products.
For example, using the 15MB/category estimate, manual sequencing of 200 categories with a catalog size of 100k can use 3GB of memory. Manual sequencing of the same 200 categories can use 6GB when the catalog size is 1.1 million. Therefore, the heap space allocated per category must be adjusted according to the catalog size.
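The two figures in this example can be reproduced with a short Python sketch. The function name is hypothetical, the scaling between 100,000-product steps is assumed to be linear, and 1 GB is treated as 1000 MB to match the rounded numbers in the text.

```python
BASELINE_PRODUCTS = 100_000    # catalog size at which the 15 MB estimate applies
BASE_MB_PER_CATEGORY = 15.0    # sequencing and image override, per category
EXTRA_MB_PER_100K = 1.5        # added per category for each additional 100,000 products


def sequencing_heap_mb(categories: int, catalog_size: int) -> float:
    """Estimate heap (MB) for manually sequenced categories with image overrides."""
    # Number of additional 100,000-product steps beyond the baseline (never negative).
    extra_steps = max(0, catalog_size - BASELINE_PRODUCTS) / BASELINE_PRODUCTS
    per_category_mb = BASE_MB_PER_CATEGORY + EXTRA_MB_PER_100K * extra_steps
    return categories * per_category_mb


small_catalog_mb = sequencing_heap_mb(200, 100_000)    # 200 categories, 100k products
large_catalog_mb = sequencing_heap_mb(200, 1_100_000)  # 200 categories, 1.1 million products
```

For the 1.1 million product catalog, the per-category allowance doubles from 15 MB to 30 MB (ten additional steps of 1.5 MB), which is why the same 200 categories need about 6 GB instead of 3 GB.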
Shared pool size configuration
For Oracle databases, ensure that the SHARED_POOL_SIZE is configured according to your environment. Increasing the shared pool size might improve the performance of the di-preprocess utility. For example:
ALTER SYSTEM SET SHARED_POOL_SIZE='668M' SCOPE=BOTH
Multithreaded running of SQL query expressions
Consider using multithreading in DB2 to allow for increased performance when you preprocess the search index.
To do so, update the currentDegree datasource property of com.ibm.db2.jcc.DB2BaseDataSource to ANY. For more information, see Common IBM Data Server Driver for JDBC and SQLJ properties for DB2 servers.
Search runtime server
Consider the following factors when you tune the search runtime server:
Caching considerations
Search caching for the runtime production subordinate servers
The starter configuration included in the CatalogEntry solrconfig.xml file is designed only for a small-scale development environment, such as WebSphere Commerce Developer. For larger environments, review at least the following cache parameters:
- queryResultWindowSize
- queryResultMaxDocsCached
- queryResultCache
- filterCache (Required on the product index when an extension index such as Inventory exists)
- documentCache (Required on the product index when an extension index such as Inventory exists)
The following example demonstrates how to define cache sizes for the Catalog Entry index and its corresponding memory heap space that is required in the JVM:
- Catalog size: 1.8 million entries.
- Total attributes: 2000
- Total categories: 10000
- Each product contains: 20 attributes.
- Average size of each Catalog Entry: 10 KB.
- queryResultWindowSize: The size of each search result page in the storefront, such as 12 items per page. This includes two prefetch pages.
- queryResultMaxDocsCached: For optimal performance, set this value to be the same value as queryResultWindowSize.
- queryResultCache: The size of each queryResultCache is 4 bytes per docId (int) reference x queryResultWindowSize, for a value of 144 bytes.
- filterCache: Assume an average search result size to be 5% of the entire catalog size of 1.8 M, or 90,000.
- documentCache: Assume an average size of each Catalog Entry document to be 10 KB.
As a result, the estimated JVM heap size that is required for each Catalog Entry core is 4.3 GB (1.44 GB + 1.8 GB + 1.0 GB).
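The arithmetic behind these figures can be sketched in Python. This is an illustration under stated assumptions: decimal units are used (1 GB = 1000 MB), the three subtotals are mapped to queryResultCache, filterCache, and documentCache in the order in which the caches are listed, and the implied entry count is back-calculated from the subtotal rather than taken from any documented setting.

```python
BYTES_PER_DOC_ID = 4     # int docId reference, as stated above
ITEMS_PER_PAGE = 12      # storefront page size used in the example
PREFETCH_PAGES = 2       # prefetch pages included in the window

# The window covers the displayed page plus the prefetch pages.
window_size = ITEMS_PER_PAGE * (1 + PREFETCH_PAGES)

# Size of one queryResultCache entry in bytes.
query_entry_bytes = BYTES_PER_DOC_ID * window_size

# Per-cache subtotals quoted in the text, in MB (mapping to caches is assumed).
cache_mb = {"queryResultCache": 1440, "filterCache": 1800, "documentCache": 1000}
total_mb = sum(cache_mb.values())

# Entry count implied by the queryResultCache subtotal (an inference, not a setting).
implied_query_entries = cache_mb["queryResultCache"] * 1_000_000 // query_entry_bytes
```

The window size works out to 36 documents, which gives the 144-byte entry size. The three subtotals add up to 4240 MB, in line with the rounded estimate above.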
Managing cache sizes to conform to JVM memory
Ensure that you configure the fieldValueCache of the catalog entry index core in the solrconfig.xml file. This configuration can prevent out-of-memory issues by limiting its size to conform to JVM memory.
The cache set size depends on the number of facet fields and the catalog size. The size of each cache entry can be roughly computed as the number of catalog entries in the index core multiplied by 4 bytes. The potential number of cache entries equals the number of potential facets. For example:
<fieldValueCache class="solr.FastLRUCache"
size="300"
autowarmCount="128"
showItems="32" />
The solr.FastLRUCache caching implementation does not have a hard limit to its size. It is useful for caches that have high hit ratios, but may significantly exceed the size value that you set. If you are using solr.FastLRUCache, monitor your heap utilization during peak periods. If the cache is significantly exceeding its limit, consider changing the fieldValueCache class to solr.LRUCache in order to avoid performance issues or an out-of-memory condition. For more information, see Solr Caching.
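As a rough check of whether a fieldValueCache setting fits in the JVM heap, the 4-bytes-per-catalog-entry rule can be applied as in the following Python sketch. The function names are hypothetical, the catalog size is borrowed from the earlier sizing example, and the result is only an estimate for a cache that is filled exactly to its configured size, which solr.FastLRUCache can exceed.

```python
BYTES_PER_CATALOG_ENTRY = 4  # rough per-document cost of one cache entry, from the text


def field_value_entry_bytes(catalog_entries: int) -> int:
    """Approximate size of a single fieldValueCache entry (one facet field)."""
    return catalog_entries * BYTES_PER_CATALOG_ENTRY


def field_value_cache_bytes(catalog_entries: int, cache_size: int) -> int:
    """Approximate heap used when the cache holds cache_size entries."""
    return field_value_entry_bytes(catalog_entries) * cache_size


# 1.8 million catalog entries with the size="300" setting shown above.
entry_bytes = field_value_entry_bytes(1_800_000)
full_cache_bytes = field_value_cache_bytes(1_800_000, 300)
```

Under these assumptions each entry is about 7.2 MB, and a cache filled to 300 entries would use roughly 2.16 GB, so the size attribute should be chosen with the number of facet fields and the available heap in mind.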
Tuning the search relevancy data cache
Ensure that you tune the search relevancy data cache for your catalog size.
- services/cache/WCSearchNavigationDistributedMapCache
Each entry ranges from 8 to 10 KB and contains 10 - 20 relevancy fields. The cache instance also contains other types of cache entries. When the cache instance is full, the database is used for every page hit, which reduces performance.
Tuning the search data cache for faceted navigation
The WebSphere Commerce Search server code uses the WebSphere Dynamic Cache facility to perform caching of database query results. Similar to the data cache used by the main WebSphere Commerce server, this caching code is referred to as the WebSphere Commerce Search server data cache.
Facet performance considerations
- Tune the size of the services/cache/WCSearchNavigationDistributedMapCache cache instance according to the number of categories.
- Tune the size of the services/cache/WCSearchAttributeDistributedMapCache cache instance according to the number of attribute dictionary facetable attributes.
- Avoid enabling many attribute dictionary faceted navigation attributes in the storefront (Show facets in search results). Limiting the number of these attributes helps to prevent Solr out-of-memory issues.
Extension index considerations
- The filterCache and documentCache are required on the product index when an extension index such as Inventory exists in WebSphere Commerce Search, so that the query component functions correctly.
- You should typically disable all other internal Solr caches for the extension index in the search run time.
Configuration options
Search configuration
Ensure that you are familiar with the various Solr configuration parameters that are described in Solr Wiki: solrconfig.xml. The documentation contains information for typical configuration customizations that can potentially increase your search server performance. For example, if your store contains a high number of categories or contracts, or if your search server is receiving Too many boolean clauses errors, increase the default value for maxBooleanClauses.
Indexing changes and other considerations
Garbage collection
The default garbage collector policy for the WebSphere Commerce JVM is the Generational Concurrent Garbage Collector. Typically, you do not need to change this garbage collector policy.
For more information, see Generational Concurrent Garbage Collector.
Spell checking
There might be a performance impact when you enable spell checking for WebSphere Commerce Search terms.
You might see performance gains in transaction throughput either when spell checking is skipped where necessary, or when users search for products with catalog overrides.
For example, a search term that is submitted in a different language than the storefront requires resources for spell checking. However, product names with catalog overrides are already known and do not require any resources for spell checking.
The spell checker component, DirectSolrSpellChecker, uses data directly from the CatalogEntry index, instead of relying on a separate stand-alone index.
Improving Store Preview performance for search changes
To improve performance when you preview search changes, you can skip indexing unstructured content when business users launch Store Preview:
In the wc-component.xml file, set the IndexUnstructured property to false.
For more information, see Changing properties in the component configuration file (wc-component.xml) (Search EAR).
Performance monitoring
- Solr administrative interface: The Solr native administrative interface can be used to gather runtime statistics for each Solr core that is running on the search server. It can also be used to perform simple search queries. For more information, see Enabling the Solr administrative interface.
- Lucene Index toolbox (Luke): Luke is a development and diagnostic tool for search indexes. It displays and modifies search index content. For more information, see Luke - Lucene Index Toolbox.
- WebSphere Application Server JMX clients: JMX clients can read runtime statistics from Solr.