Site content crawling remotely
When using WebSphere Commerce search in a remote configuration, you must
consider additional factors when building the search index for unstructured
content with the site content crawler.
When deploying WebSphere Commerce search remotely and
setting up the search index on both machines:
- The droidConfig.xml and filters.txt files are copied to the Solr home directory on the remote search server. However, these files are required on the WebSphere Commerce machine, so you must also copy them to your WebSphere Commerce machine.
- The base path in the SRCHCONFEXT table points to the store directory on the WebSphere Commerce machine.
When running the build index utility, the WebContent index is
not built because the manifest.txt file cannot be found at the specified
path from the remote machine. To resolve the problem:
- On the remote machine, create a mapped network drive to the root directory on the WebSphere Commerce machine. For example, map the entire C:\ drive of the WebSphere Commerce machine to Z:\ on the remote machine.
- Update the base path to point to the mapped drive. For example, in the above mapping, change BasePath=C:\rest_of_base_path to BasePath=Z:\rest_of_base_path.
- Run the build index utility again. The WebContent index is now built successfully.
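
The base path change in the steps above can be sketched as a small helper. This is only an illustration: remap_base_path is a hypothetical function, not part of WebSphere Commerce, and it assumes the value is available as a single BasePath=<path> entry in the form shown in the example. It computes the new string and does not read or update the SRCHCONFEXT table.

```python
from pathlib import PureWindowsPath


def remap_base_path(entry: str, local_drive: str, mapped_drive: str) -> str:
    """Rewrite the drive of a 'BasePath=<path>' entry to the mapped drive.

    Hypothetical helper for illustration; drives are given as 'C:' or 'Z:'.
    """
    key, sep, value = entry.partition("=")
    if key != "BasePath" or not sep:
        raise ValueError(f"not a BasePath entry: {entry!r}")
    path = PureWindowsPath(value)
    # Windows drive letters are case-insensitive, so compare in upper case.
    if path.drive.upper() != local_drive.upper():
        return entry  # already remapped, or located on a different drive
    # Keep everything after the drive root and re-anchor it on the mapped drive.
    remapped = PureWindowsPath(mapped_drive + "\\", *path.parts[1:])
    return f"{key}={remapped}"


print(remap_base_path(r"BasePath=C:\rest_of_base_path", "C:", "Z:"))
# prints BasePath=Z:\rest_of_base_path
```

PureWindowsPath is used so that the sketch handles Windows-style paths the same way on any operating system.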
When running the crawler, use one of the following approaches:
- Run the crawler on the WebSphere Commerce machine, updating the database but not automatically indexing. This updates the database with a location on C:\ instead of the mapped drive, so you must then update the database to use the mapped Z:\ drive.
- Run the crawler on the WebSphere Commerce machine without updating the database, but with automatic indexing. This builds the index, with the latest manifest.txt stored either locally or on the mapped drive.
Additional notes when working with the crawler:
- The basePath parameter is passed to the di-buildindex utility in WebSphere Commerce Developer. The runtime environment can read the value from the SRCHCONFEXT table.
- The basePath value is initially set into the SRCHCONFEXT table by the setupSearchIndex utility, relative to the WebSphere Commerce server.
- The basePath value is updated every time the crawler is run, but only if the database information is passed to the crawler utility.
- In remote configurations, the manifest.txt file and the generated files must be mounted to the remote search server, and the basePath updated to match the new network drive.
- If automatic indexing is enabled in the droidConfig.xml file, data can be indexed directly without looking up the basePath parameter from the database.
- The crawler is a WebSphere Commerce utility. Therefore, it must be run on the WebSphere Commerce server.
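
Before running the build index utility in a remote configuration, it can help to confirm from the remote search server that manifest.txt is visible under the base path. The following is a minimal sketch, not a WebSphere Commerce tool: manifest_is_reachable is a hypothetical helper, and it assumes that manifest.txt sits directly under the directory that basePath points to. The demonstration uses a temporary directory as a stand-in for the mapped drive.

```python
import os
import tempfile


def manifest_is_reachable(base_path: str, manifest_name: str = "manifest.txt") -> bool:
    """Return True if the manifest file exists directly under base_path.

    Hypothetical helper; assumes the manifest is stored at the top of base_path.
    """
    return os.path.isfile(os.path.join(base_path, manifest_name))


# A temporary directory stands in for the mapped base path.
with tempfile.TemporaryDirectory() as stand_in:
    print(manifest_is_reachable(stand_in))  # False: no manifest yet
    open(os.path.join(stand_in, "manifest.txt"), "w").close()
    print(manifest_is_reachable(stand_in))  # True: manifest now present
```

On the remote search server, the same check would be called with the mapped path, for example the Z:\ form of the base path from the earlier mapping.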