Customizing the preprocessor and Data Import Handler (DIH)
The preprocessing tasks are controlled by the wc-dataimport-preprocess XML files. These files contain table definitions, database schema metadata, and references to the Java classes used in the preprocessing steps. The di-preprocess utility reads these files, extracts and flattens WebSphere Commerce data, and writes the result to a set of temporary tables in the WebSphere Commerce database. The index building utility then uses the Solr Data Import Handler (DIH) to populate the Solr indexes from the data in those temporary tables.
Preprocessor
- WCDE_installdir\samples\dataimport\catalog
To customize the preprocessor, create a custom wc-dataimport-preprocess XML file. The di-preprocess utility invokes the custom file. The following template shows the structure of the file:
<_config:DIHPreProcessConfig xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../xsd/wc-dataimport-preprocess.xsd">
  <_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.EmptyPreProcessor"
      masterCatalogId="#MASTER_CATALOG_ID#">
    <_config:table definition="" name="" />
    <_config:query sql="" />
    <_config:mapping>
      <_config:key queryColumn="" tableColumn="" />
      <_config:column-mapping>
        <_config:column-column-mapping>
          <_config:column-column queryColumn="" tableColumn="" />
        </_config:column-column-mapping>
      </_config:column-mapping>
    </_config:mapping>
  </_config:data-processing-config>
</_config:DIHPreProcessConfig>
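As an illustration of how the empty attributes in the template might be filled in, the following sketch defines one custom temporary table. All table and column names (XI_CATENTRY_WARRANTY, XWARRANTY, WARRANTY_TERM) are hypothetical. The sketch assumes that the definition attribute holds the DDL for the temporary table and that the sql attribute holds the extraction query. The processor attribute is copied unchanged from the template; the class that suits a particular task may differ. The fragment belongs inside the DIHPreProcessConfig root element, which declares the _config namespace.

```xml
<!-- Hypothetical example: every table and column name here is illustrative only. -->
<_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.EmptyPreProcessor"
    masterCatalogId="#MASTER_CATALOG_ID#">
  <!-- Assumed: "definition" is the DDL for the temporary table; "name" uses the XI_ prefix -->
  <_config:table definition="CREATE TABLE XI_CATENTRY_WARRANTY (CATENTRY_ID BIGINT NOT NULL, WARRANTY_TERM VARCHAR(64), PRIMARY KEY (CATENTRY_ID))"
      name="XI_CATENTRY_WARRANTY" />
  <!-- Assumed: "sql" is the query that extracts the source rows (XWARRANTY is a made-up source table) -->
  <_config:query sql="SELECT CATENTRY_ID, WARRANTY_TERM FROM XWARRANTY" />
  <_config:mapping>
    <!-- Key column: query result column mapped to temporary table column -->
    <_config:key queryColumn="CATENTRY_ID" tableColumn="CATENTRY_ID" />
    <_config:column-mapping>
      <_config:column-column-mapping>
        <!-- One column-column element per additional column to copy -->
        <_config:column-column queryColumn="WARRANTY_TERM" tableColumn="WARRANTY_TERM" />
      </_config:column-column-mapping>
    </_config:column-mapping>
  </_config:mapping>
</_config:data-processing-config>
```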
Solr Data Import Handler (DIH)
The index building utility uses DIH to connect to the WebSphere Commerce database through a JDBC connection. It crawls the WebSphere Commerce tables, and then populates the Solr index. The JDBC configuration and crawling SQL statements are defined in the wc-data-config.xml configuration file.
To index a new field, complete the following steps:
- In the solrcore.properties file, set the dataImporter.ext.querySelect property to the new field name, followed by a comma.
- In the solrcore.properties file, set the dataImporter.ext.queryFrom property to the source table of the new field.
Note: You must use a LEFT OUTER JOIN statement in this property.
- Add a field mapping between the SQL column name and the actual index field name in the x-data-config.xml file.
For example, to index the FIELD1, FIELD3, and FIELD5 columns of the CATENTRY table:
- Add the following property to the solrcore.properties file:
dataImporter.ext.querySelect=CATENTRY.FIELD1, CATENTRY.FIELD3, CATENTRY.FIELD5,
- Add the mapping for each of the fields to the x-data-config.xml file:
<field column="FIELD1" name="catentry_field1" />
<field column="FIELD3" name="catentry_field3" />
<field column="FIELD5" name="catentry_field5" />
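The example above selects columns from the CATENTRY table itself. When the new field comes from a separate table, the dataImporter.ext.queryFrom property carries the join. The following sketch shows what the two properties might look like together. The table XI_CATENTRY_WARRANTY and the column WARRANTY_TERM are hypothetical, and joining on CATENTRY_ID is an assumption about how the custom table relates to CATENTRY.

```properties
# Hypothetical table and column names; adjust to your own schema.
# New field to select; note the trailing comma.
dataImporter.ext.querySelect=XI_CATENTRY_WARRANTY.WARRANTY_TERM,
# Source table of the new field, attached with a LEFT OUTER JOIN (join key assumed).
dataImporter.ext.queryFrom=LEFT OUTER JOIN XI_CATENTRY_WARRANTY ON (CATENTRY.CATENTRY_ID=XI_CATENTRY_WARRANTY.CATENTRY_ID)
```

A matching field mapping in the x-data-config.xml file would then map the WARRANTY_TERM column to an index field name of your choice.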
Naming conventions
Use a prefix of XI_ when naming custom temporary tables. This naming convention prevents naming conflicts between customization tables and default WebSphere Commerce tables.