Crawling WebSphere Commerce site content
You can use the site content crawler utility to crawl WebSphere Commerce site content in starter stores.
Before you begin
- Ensure that the test server is started.
- Ensure that your administrative server is started. For example:
  - If WebSphere Commerce is managed by WebSphere Application Server Deployment Manager (dmgr), start the deployment manager and all node agents. Optionally, start your cluster as well.
  - If WebSphere Commerce is not managed by WebSphere Application Server Deployment Manager (dmgr), start the WebSphere Application Server server1.
- Ensure that you have completed the following task:
  - Important: Ensure that you have configured the site content crawler configuration files for your site:
    - droidConfig.xml
    - filters.txt
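
The configuration-file prerequisite can be checked with a short script before you run the crawler. The following is a minimal sketch, not part of the product: the function name is hypothetical, and the directory that holds the files depends on your installation, so it is passed in as an argument.

```python
from pathlib import Path

# File names taken from the prerequisite list above.
REQUIRED_CRAWLER_FILES = ("droidConfig.xml", "filters.txt")


def missing_crawler_files(config_dir):
    """Return the required crawler configuration files not found in config_dir.

    config_dir is an assumption: supply the directory where your site's
    crawler configuration files are stored.
    """
    base = Path(config_dir)
    return [name for name in REQUIRED_CRAWLER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    # "." is a placeholder for the real configuration directory.
    missing = missing_crawler_files(".")
    if missing:
        print("Missing crawler configuration files: " + ", ".join(missing))
    else:
        print("All crawler configuration files are present.")
```

The check only confirms that the files exist. It does not validate their contents.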
Note: To index site content, you must either set auto index to true in the droidConfig.xml file, or pass the -basePath, -storeId, and -localename parameters to the di-buildindex utility.
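
If you take the second option, the three parameters can be assembled in a wrapper script. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the sample values are placeholders, and any other parameters that the di-buildindex utility requires in your environment are deliberately left out.

```python
def buildindex_site_content_args(base_path, store_id, locale_name):
    """Build the three site-content arguments named in the note above.

    Only -basePath, -storeId, and -localename are produced. Other
    di-buildindex parameters are outside the scope of this sketch.
    """
    for label, value in (("base_path", base_path),
                         ("store_id", store_id),
                         ("locale_name", locale_name)):
        if value is None or str(value).strip() == "":
            raise ValueError(label + " must not be empty")
    return [
        "-basePath", str(base_path),
        "-storeId", str(store_id),
        "-localename", str(locale_name),
    ]


if __name__ == "__main__":
    # All three values are placeholders, not defaults of the utility.
    args = buildindex_site_content_args("/path/to/crawler/output", 10101, "en_US")
    print(" ".join(args))
```

Returning a list rather than a single string keeps each value as a separate argument, so that paths with spaces survive when the list is passed to a process launcher.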
Note: For crawling site content in a clustered environment:
- Run the crawler from a staging environment, not from a production environment. If production content must be crawled, configure the crawler to access the production site instead of running it directly from the production environment. This method simplifies the setup by restricting the crawler to a WebSphere Commerce staging environment and updating the index in the repeater.
- When you manage index configurations in a clustered environment where a deployment manager manages the Solr EAR, each Solr node is considered a search index subordinate that replicates against the repeater. Each subordinate Solr node has its own local configuration and search index directories and its own configuration files, and the index is synchronized across the entire cluster through Solr replication. That is, the deployment manager manages the Solr EAR, while the repeater manages the local index copies through Solr replication.