Customizing the preprocessor and Data Import Handler (DIH)
The preprocessing tasks are controlled by the wc-dataimport-preprocess XML files. These files contain table definitions, database schema metadata, and references to the Java classes used in the preprocessing steps. The di-preprocess utility invokes these files, extracts and flattens WebSphere Commerce data, and writes the result to a set of temporary tables inside the WebSphere Commerce database. The index building utility then uses the Solr Data Import Handler (DIH) to populate the Solr indexes from the data in the temporary tables.
Preprocessor
- WC_installdir\IBM\WCDE_INT70\components\foundation\samples\dataimport\catalog
To customize the preprocessor, create a custom wc-dataimport-preprocess XML file, which the di-preprocess utility then invokes. The following template shows the structure of the file:
<_config:DIHPreProcessConfig xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../xsd/wc-dataimport-preprocess.xsd">
  <_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.EmptyPreProcessor"
      masterCatalogId="#MASTER_CATALOG_ID#">
    <_config:table definition="" name="" />
    <_config:query sql="" />
    <_config:mapping>
      <_config:key queryColumn="" tableColumn="" />
      <_config:column-mapping>
        <_config:column-column-mapping>
          <_config:column-column queryColumn="" tableColumn="" />
        </_config:column-column-mapping>
      </_config:column-mapping>
    </_config:mapping>
  </_config:data-processing-config>
</_config:DIHPreProcessConfig>
For more information about customizing the preprocessor, see Configuring the Data Import Handler mapping.
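As a rough illustration of how the empty template might be filled in, the following sketch copies one CATENTRY column into a custom temporary table. The table name XI_CATENTRY_CUSTOM, its column types, and the SQL statement are hypothetical and are not taken from the product. The processor class is the one shown in the template; whether it suits a particular customization is an assumption to verify.

```xml
<_config:DIHPreProcessConfig xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../xsd/wc-dataimport-preprocess.xsd">
  <!-- Processor class copied from the template; confirm that it fits your data. -->
  <_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.EmptyPreProcessor"
      masterCatalogId="#MASTER_CATALOG_ID#">
    <!-- Hypothetical temporary table. The XI_ prefix marks it as a custom table. -->
    <_config:table
        definition="CREATE TABLE XI_CATENTRY_CUSTOM (CATENTRY_ID BIGINT NOT NULL, FIELD1 INTEGER, PRIMARY KEY (CATENTRY_ID))"
        name="XI_CATENTRY_CUSTOM" />
    <!-- Hypothetical query that supplies the rows for the temporary table. -->
    <_config:query sql="SELECT CATENTRY_ID, FIELD1 FROM CATENTRY" />
    <_config:mapping>
      <!-- Key column: query result column mapped to the temporary table column. -->
      <_config:key queryColumn="CATENTRY_ID" tableColumn="CATENTRY_ID" />
      <_config:column-mapping>
        <_config:column-column-mapping>
          <_config:column-column queryColumn="FIELD1" tableColumn="FIELD1" />
        </_config:column-column-mapping>
      </_config:column-mapping>
    </_config:mapping>
  </_config:data-processing-config>
</_config:DIHPreProcessConfig>
```

Each queryColumn names a column returned by the query, and each tableColumn names the column in the temporary table that receives the value.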
Solr Data Import Handler (DIH)
The index building utility uses DIH to connect to the WebSphere Commerce database through a JDBC connection. It crawls the WebSphere Commerce tables, and then populates the Solr index. The JDBC configuration and crawling SQL statements are defined in the wc-data-config.xml configuration file.
- Add the new field name to the SELECT clause of the query and the deltaImportQuery.
- Add the source table name of the new field to the FROM clause of the query and the deltaImportQuery.
- Add a field mapping to the actual index field name in the index.
For example, to index the FIELD1, FIELD3, and FIELD5 columns of the CATENTRY table:
- Append the following snippet to the list of fields under the SELECT clause:
  CATENTRY.FIELD1, CATENTRY.FIELD3, CATENTRY.FIELD5,
- The CATENTRY table is already included in the list of tables of the FROM clause, so no change is needed there.
- Add the mapping for each of the fields:
  <field column="FIELD1" name="catentry_field1" />
  <field column="FIELD3" name="catentry_field3" />
  <field column="FIELD5" name="catentry_field5" />
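Taken together, the changes might look roughly like the following fragment of wc-data-config.xml. This is a simplified sketch rather than the shipped file: the entity name, the pk attribute, and the shortened SQL are assumptions, and the dataSource element, the deltaQuery attribute, and the existing default fields are omitted. The element and attribute names (dataConfig, document, entity, query, deltaImportQuery, field) are the standard ones from the Solr Data Import Handler.

```xml
<dataConfig>
  <document>
    <!-- Hypothetical entity; the real file defines its own name, joins, and fields. -->
    <entity name="Product" pk="CATENTRY_ID"
        query="SELECT CATENTRY.FIELD1, CATENTRY.FIELD3, CATENTRY.FIELD5, CATENTRY.CATENTRY_ID
               FROM CATENTRY"
        deltaImportQuery="SELECT CATENTRY.FIELD1, CATENTRY.FIELD3, CATENTRY.FIELD5, CATENTRY.CATENTRY_ID
               FROM CATENTRY
               WHERE CATENTRY.CATENTRY_ID = ${dataimporter.delta.CATENTRY_ID}">
      <!-- Map each new result column to its index field name. -->
      <field column="FIELD1" name="catentry_field1" />
      <field column="FIELD3" name="catentry_field3" />
      <field column="FIELD5" name="catentry_field5" />
    </entity>
  </document>
</dataConfig>
```

The same columns appear in both the query and the deltaImportQuery, so that full and delta index builds populate the new fields consistently.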
Naming conventions
Use a prefix of XI_ when naming custom temporary tables. This naming convention prevents naming conflicts between custom tables and default WebSphere Commerce tables.