Enabling Solr search sharding within a Docker-based deployment

Solr search sharding is included as an optional configuration with a provided Docker Compose file. It allows Solr-based search deployments with large indexes to use multiple Java Virtual Machines (JVMs) to complete indexing work in parallel, which reduces indexing time and alleviates resource issues that can occur when a single JVM is used.

Before you begin

  1. Review Solr search sharding within the Search documentation.
  2. HCL Commerce Version 9.1.14.0 or later: Build a customized search-app image to avoid permission issues.

    Use this sample Dockerfile to set the appropriate permissions.

    Beginning with HCL Commerce 9.1.14.0, the default user for running HCL Commerce application containers was updated to be a non-root user. For more information, see HCL Commerce container users and privileges.

Procedure

  1. Update your Docker Compose deployment configuration file (docker-compose.yml) to include the configuration for your search shards.
    Use this sample Docker Compose configuration file to create your shard definitions.
  2. Start your deployment.
  3. In the running Utility server Docker container, modify /opt/WebSphere/CommerceServer90/properties/parallelprocess/di-parallel-process-multijvms.properties to match your environment. You can use the following examples as a guide.
    1. Enter the running Utility server Docker container.
    2. Open the /opt/WebSphere/CommerceServer90/properties/parallelprocess/di-parallel-process-multijvms.properties configuration file for editing.
      • Configure the hostname and port for the individual shard servers as follows.
        Shard.A.common.index-server-name=shard_a
        Shard.A.common.index-server-port=3738
        …
        Shard.B.common.index-server-name=shard_b
        Shard.B.common.index-server-port=3738
        Each shard server name should be the same as the hostname/alias configured within your deployment.
      • Configure the shard index core directory as follows.
        Shard.A.en_US.unstructured-index-core-dir=/shard_a/index/solr/MC_10001/en_US/Unstructured_A/
        Shard.A.en_US.structured-index-core-dir=/shard_a/index/solr/MC_10001/en_US/CatalogEntry_A/
        …
        Shard.B.en_US.unstructured-index-core-dir=/shard_b/index/solr/MC_10001/en_US/Unstructured_B/
        Shard.B.en_US.structured-index-core-dir=/shard_b/index/solr/MC_10001/en_US/CatalogEntry_B/
        Note: The directory should be the absolute path inside each shard container.

        You can automate the shard configuration process. This is useful if, for example, you expect to create a large number of shards. The auto-sharding process will automatically configure properties such as preprocessing-start-range-value, preprocessing-end-range-value, index-core-name and index-core-dir.

        For information on how to set up auto-sharding, see Sharding input properties file.

    3. Save and close the file.
      Tip: If you want to persist the changes made to your properties files within your Utility server Docker container, you must build a customized ts-utils image and use this customized image for your deployment.
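
If you maintain several shards by hand, a small helper can print the four entries shown in the examples above for each shard, so that you can review them before adding them to the properties file. This is a minimal sketch and not part of the product: the function name print_shard_props is hypothetical, and the defaults (host shard_<letter>, port 3738, the MC_10001 path segment, and the en_US locale) are copied from the sample values only. Properties that are elided in the examples are not generated.

```shell
#!/bin/sh
# Hypothetical helper: print the shard entries from the examples for one shard.
# All defaults mirror the sample values above; they are assumptions, not
# product defaults, so override them to match your deployment.
print_shard_props() {
  shard="$1"                      # shard letter, for example: A
  port="${2:-3738}"               # index server port used in the samples
  catalog_seg="${3:-MC_10001}"    # path segment used in the samples
  locale="${4:-en_US}"            # locale used in the samples

  # Assumption: the shard hostname is "shard_" plus the lowercase letter.
  host="shard_$(printf '%s' "$shard" | tr '[:upper:]' '[:lower:]')"
  base="/${host}/index/solr/${catalog_seg}/${locale}"

  printf 'Shard.%s.common.index-server-name=%s\n' "$shard" "$host"
  printf 'Shard.%s.common.index-server-port=%s\n' "$shard" "$port"
  printf 'Shard.%s.%s.unstructured-index-core-dir=%s/Unstructured_%s/\n' \
    "$shard" "$locale" "$base" "$shard"
  printf 'Shard.%s.%s.structured-index-core-dir=%s/CatalogEntry_%s/\n' \
    "$shard" "$locale" "$base" "$shard"
}

# Print the entries for shards A and B for review.
for s in A B; do
  print_shard_props "$s"
done
```

For a large number of shards, the auto-sharding process described above is likely the better fit, because it also configures the preprocessing range values that this sketch does not attempt to produce.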
  4. Start the indexing process with sharding in multiple JVMs enabled.
    1. Navigate to the /opt/WebSphere/CommerceServer90/bin/ directory.
    2. Run the following command.
      ./di-parallel-process.sh /opt/WebSphere/CommerceServer90/properties/parallelprocess/di-parallel-process-multijvms.properties
      For more information, see Running utilities from the Utility server Docker container.
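
The indexing command in step 4 can also be started from the Docker host instead of from a shell inside the container. The following is a minimal sketch under stated assumptions: the container name utils is a placeholder for whatever name docker ps reports for your Utility server container, and the function name run_sharded_index is hypothetical. The dry-run mode prints the command so that you can check it before anything is executed.

```shell
#!/bin/sh
# Assumption: "utils" is a placeholder container name. Set UTILS_CONTAINER to
# the Utility server container name that `docker ps` shows in your deployment.
UTILS_CONTAINER="${UTILS_CONTAINER:-utils}"
BIN_DIR=/opt/WebSphere/CommerceServer90/bin
PROPS=/opt/WebSphere/CommerceServer90/properties/parallelprocess/di-parallel-process-multijvms.properties

# Hypothetical wrapper: run di-parallel-process.sh inside the container.
# Pass "dry-run" as the first argument to print the command instead.
run_sharded_index() {
  mode="${1:-run}"
  # Build the command once so the dry run prints exactly what would execute.
  # -w sets the working directory inside the container to the bin directory.
  set -- docker exec -w "$BIN_DIR" "$UTILS_CONTAINER" ./di-parallel-process.sh "$PROPS"
  if [ "$mode" = "dry-run" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Print the command without running it.
run_sharded_index dry-run
```

Calling run_sharded_index with no argument executes the command. The sketch assumes that the properties file inside the container has already been edited as described in step 3.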

Results

Your configured shards are used to process your Solr index in parallel. Once the merge operation is complete, the merged index will be online and immediately available for use.