di-preprocess utility

The di-preprocess utility extracts and flattens WebSphere Commerce data and then outputs the data into a set of temporary tables inside the WebSphere Commerce database. The data in the temporary tables is then used by the index building utility to populate the data into search indexes with the Data Import Handler (DIH).

The preprocess utility picks the wc-dataimport-preprocess-fullbuild.xml file or wc-dataimport-preprocess-deltaupdate.xml file first, and then transforms the results of the SQL statements defined in those files into temporary tables. Next, the utility handles each configuration XML file in a random order.

Syntax diagram for di-preprocess utility

Parameter values

full-path
Required: The full directory location of the preprocessing configuration files, for example: WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/target_db.
The names of these files start with wc-dataimport-preprocess, for example, wc-dataimport-preprocess-fullbuild.xml. The search index setup utility installs this set of files when you deploy WebSphere Commerce Search.
Tip: To preprocess both the catalog entry index and the category index at the same time, specify the full path for the catalog entry index only. It then processes the configuration files for both the catalog entry index and the category index by default. This is because the directory which contains the configuration files for the category index (called CatalogGroup) is located one level below the directory for the catalog entry index (called CatalogEntry).
For the catalog entry index: by default, the configuration files install at the following path:
  • Linux, AIX, Windows: WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType
  • IBM i: WC_instance_root/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType
  • WebSphere Commerce Developer: WCDE_installdir/search/pre-processConfig/MC_masterCatalogId/databaseType
For the category index: by default, the configuration files install in a directory called CatalogGroup, one level below the catalog entry index configuration files, for example:
  • Linux, AIX, Windows: WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType/CatalogGroup
  • IBM i: WC_instance_root/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType/CatalogGroup
  • WebSphere Commerce Developer: WCDE_installdir/search/pre-processConfig/MC_masterCatalogId/databaseType/CatalogGroup
For the inventory index: by default, the configuration files install in a directory called Inventory, inside the SubTypes directory below the catalog entry index configuration files, for example:
  • Linux, AIX, Windows: WC_installdir/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType/SubTypes/Inventory
  • IBM i: WC_instance_root/instances/instance_name/search/pre-processConfig/MC_masterCatalogId/databaseType/SubTypes/Inventory
  • WebSphere Commerce Developer: WCDE_installdir/search/pre-processConfig/MC_masterCatalogId/databaseType/SubTypes/Inventory
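The default locations above follow one pattern, which the following minimal Python sketch spells out for the Linux, AIX, and Windows layout. The function name and the sample installation directory are hypothetical and chosen only for illustration; the IBM i and WebSphere Commerce Developer layouts differ as listed above.

```python
from pathlib import PurePosixPath


def preprocess_config_dirs(install_dir, instance_name, master_catalog_id, database_type):
    """Return the default preprocessing configuration directory for each index.

    Mirrors the Linux/AIX/Windows layout only (WC_installdir/instances/...).
    """
    base = (
        PurePosixPath(install_dir)
        / "instances"
        / instance_name
        / "search"
        / "pre-processConfig"
        / f"MC_{master_catalog_id}"
        / database_type
    )
    return {
        "CatalogEntry": str(base),
        "CatalogGroup": str(base / "CatalogGroup"),
        "Inventory": str(base / "SubTypes" / "Inventory"),
    }


# Hypothetical installation directory, instance name, and catalog ID.
dirs = preprocess_config_dirs("/opt/WebSphere/CommerceServer80", "demo", 10001, "DB2")
for index_name, path in dirs.items():
    print(f"{index_name}: {path}")
```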
instance
The name of the WebSphere Commerce instance with which you are working (for example, demo).
dbuser
DB2: The name of the user who is connecting to the database.
Oracle: The user ID connecting to the database. If you are using workspaces, the database user must be granted cross-schema privileges to create and drop tables. Otherwise, you cannot preview changes made in workspaces.
dbuserpwd
The password for the user who is connecting to the database.
Alternatively, you can use the passwordFile parameter to specify the encrypted password from a file.
fullbuild
Optional: A flag that indicates whether it is a full index build. The accepted values are either true or false. The default value is true.
localename
Optional: The locale that should be indexed. The accepted values are either:
  • All
Or one of the following values:
  • de_DE
  • en_US
  • es_ES
  • fr_FR
  • it_IT
  • ja_JP
  • ko_KR
  • pl_PL
  • pt_BR
  • ro_RO
  • ru_RU
  • zh_CN
  • zh_TW
The default value is All.
onelevel
Optional: A flag you can use to save time when setting up preprocessing for the catalog entry index (CatalogEntry indextype) and the category index (CatalogGroup indextype) at the same time. If you set the onelevel flag to true, then for the full-path value, you only need to specify the path to the preprocessing configuration files for the catalog entry index. The utility will automatically look for the category index files in the CatalogGroup directory one level down.

Example:

  • Instead of specifying both paths for your full-path value, as shown here:
    C:/Program Files/IBM/WebSphere/CommerceServer80/instances/demo/search/pre-processConfig/MC_10001/DB2,
    C:/Program Files/IBM/WebSphere/CommerceServer80/instances/demo/search/pre-processConfig/MC_10001/DB2/CatalogGroup
  • Specify only the first path, as shown here:
    C:/Program Files/IBM/WebSphere/CommerceServer80/instances/demo/search/pre-processConfig/MC_10001/DB2

The default value of the onelevel flag is true.
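As a rough illustration of the difference, the following Python sketch builds the full-path value for either setting of the flag. The function name is hypothetical, and the comma separator is an assumption taken from the two-path example above.

```python
def full_path_value(catalog_entry_dir, onelevel=True):
    """Build the full-path value to pass to the utility.

    With onelevel=True only the catalog entry directory is needed; the
    CatalogGroup directory one level down is found automatically.
    With onelevel=False both directories are listed explicitly
    (comma-separated, as assumed from the example in this topic).
    """
    if onelevel:
        return catalog_entry_dir
    category_dir = catalog_entry_dir.rstrip("/") + "/CatalogGroup"
    return ",".join([catalog_entry_dir, category_dir])


base = "C:/Program Files/IBM/WebSphere/CommerceServer80/instances/demo/search/pre-processConfig/MC_10001/DB2"
print(full_path_value(base))
print(full_path_value(base, onelevel=False))
```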

multithread
Optional: Preprocesses data by using multiple threads.
The number of threads used is based on the number of existing wc-dataimport-preprocess-XXXXX.xml files, excluding the wc-dataimport-preprocess-fullbuild.xml and wc-dataimport-preprocess-deltaupdate.xml files.
The default value is false.
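To estimate how many threads a multithreaded run would use, you can count the qualifying configuration files yourself. The following Python sketch applies the rule above to a single directory (it does not descend into subdirectories); the two non-excluded file names in the demonstration are hypothetical placeholders.

```python
import tempfile
from pathlib import Path

# Files that do not count toward the number of threads.
EXCLUDED = {
    "wc-dataimport-preprocess-fullbuild.xml",
    "wc-dataimport-preprocess-deltaupdate.xml",
}


def worker_config_files(config_dir):
    """List the wc-dataimport-preprocess-*.xml files that count toward the thread total."""
    return sorted(
        path.name
        for path in Path(config_dir).glob("wc-dataimport-preprocess-*.xml")
        if path.name not in EXCLUDED
    )


# Demonstration against a temporary directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in (
        "wc-dataimport-preprocess-fullbuild.xml",
        "wc-dataimport-preprocess-deltaupdate.xml",
        "wc-dataimport-preprocess-catentry.xml",   # hypothetical name
        "wc-dataimport-preprocess-attribute.xml",  # hypothetical name
    ):
        (Path(tmp) / name).touch()
    files = worker_config_files(tmp)

thread_count = len(files)
print(thread_count, files)
```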
workspace
The workspace index to preprocess. This value is case-sensitive. If this parameter is not specified, the base schema index is preprocessed, which is the default.
To get the workspace ID, either:
  • Open the workspace in the Workspace Management tool in the Management Center. The workspace code is the workspace ID; or
  • If the workspace has an active task group, run the following SQL query: select * from cmwsschema. The workspace ID is listed under the workspace column of the result.
dbURL
Oracle: The database URL the utility uses to connect to the database. If not provided, the utility constructs a database URL based on the default database value.
skipDeltaNoEntry
Optional: When delta preprocessing (fullbuild set to false) with this parameter set to true, the utility first checks whether there are any delta updates to perform. The utility ends if there are no delta updates to perform, which might save time, because the utility would otherwise check all the tables and process no records. Otherwise, the preprocessing is performed as expected.
If this parameter is set to false, and there are no delta updates to perform, the delta preprocessing updates all of the temporary tables to empty.
The default value is false.
publishedOnly
Optional: Allows only products from published categories to be displayed in the keyword search results when deep category unpublish is enabled.
The default value is false.
deepUnpublish
Optional: Enables preprocessing for the deep category unpublish feature.
The default value is false.
For more information, see Hiding categories and products using deep category unpublish.
deepSequence
Optional: Enables preprocessing for the deep search sequencing feature.
The default value is false.
For more information, see Hiding categories and products using deep category unpublish.
passwordFile
Optional: The full path to the password.properties file that contains the password for the user who is connecting to the database. For example, C:\password.properties.
The password.properties file contains the following content:

dbUserPassword=encrypted_pwd
Where encrypted_pwd is the password that has been encrypted using the wcs_encrypt utility.
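If you need to confirm that a password file has the expected shape before you pass it to the utility, a small check such as the following Python sketch can help. It only reads the dbUserPassword key described above; the function name is hypothetical, and the sketch does not decrypt or validate the value.

```python
def read_db_user_password(properties_text):
    """Return the value of dbUserPassword from properties-style text, or None."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and properties-file comments.
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first "=" only, so "=" characters in the value survive.
        key, separator, value = line.partition("=")
        if separator and key.strip() == "dbUserPassword":
            return value.strip()
    return None


# Sample content; encrypted_pwd stands in for the output of wcs_encrypt.
sample = "# sample password.properties content\ndbUserPassword=encrypted_pwd\n"
encrypted = read_db_user_password(sample)
print(encrypted)
```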
nonLangTables
Optional: Preprocesses only language-insensitive tables.
Language-insensitive tables use the following naming convention: TI_string_number. For example, TI_CATENTRY_0.
The default value is false.
langTables
Optional: Preprocesses only language-sensitive tables.
Language-sensitive tables use the following naming convention: TI_string_number_number. For example, TI_ATTR_0_1 for United States English.
The default value is false.
Usage notes for language table parameters:
  • When both nonLangTables and langTables values are set to false, all tables are preprocessed.
  • If setting both values to true, first run the utility with nonLangTables, then run the utility with langTables.
  • For sites with many supported languages enabled, you can preprocess only language-insensitive tables first, and then preprocess language-sensitive tables in parallel.
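The two naming conventions can be told apart mechanically. The following Python sketch is one way to classify a temporary table name by pattern; it assumes that the string part of a name never itself ends in an underscore followed by digits, and the function name is hypothetical.

```python
import re

# TI_string_number_number: language-sensitive tables.
LANG_SENSITIVE = re.compile(r"^TI_(?P<name>.+)_(?P<number>\d+)_(?P<language>\d+)$")
# TI_string_number: language-insensitive tables.
LANG_INSENSITIVE = re.compile(r"^TI_(?P<name>.+)_(?P<number>\d+)$")


def classify_table(table_name):
    """Return the parameter that covers the table, or None if the name fits neither pattern."""
    # Check the longer pattern first; otherwise TI_ATTR_0_1 would also
    # match the language-insensitive pattern.
    if LANG_SENSITIVE.match(table_name):
        return "langTables"
    if LANG_INSENSITIVE.match(table_name):
        return "nonLangTables"
    return None


for table in ("TI_CATENTRY_0", "TI_ATTR_0_1", "CATENTRY"):
    print(table, "->", classify_table(table))
```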
dropTempTable
Indicates whether to drop tables when preprocessing the search index.
Passing in a value of false uses a TRUNCATE statement on the tables.
The default value is true, which uses a DROP statement on the tables.
Note: This parameter supports only DB2 9.7 or later, or Oracle databases.
propFile
The full path to the properties file to pass in to the utility.
force
When set to true, forces the utility to run, even if other processes are in progress. Ensure that this parameter value matches for both the di-preprocess and di-buildindex utilities; otherwise, the utility encounters errors and fails to run.
Note: The index build fails when two or more users or automated processes simultaneously run a preprocess with this parameter. This caution applies to both delta builds and complete index rebuilds. Unexpected results include but are not limited to inconsistent or incomplete indexes.
grouping
WebSphere Commerce Version 8.0.3.0 or later: If you have enabled the product grouping feature, set -grouping to true when you run the di-preprocess command. Grouping disables the roll-up of defining attribute dictionary attribute values from ItemBean to ProductBean, and enables the roll-down of descriptive attribute values from ProductBean to ItemBean. For more information, see Product grouping.

Example

From the following directory on your WebSphere Commerce machine:
  • Linux, AIX, IBM i: WC_installdir/bin
  • WebSphere Commerce Developer: WCDE_installdir\bin
Run the following command:
  • Windows: di-preprocess.bat full-path -instance instance_name -dbuser dbuser -dbuserpwd dbuserpwd [-fullbuild true | false] [-localename localename] [-onelevel true | false] [-multithread true | false] [-workspace workspaceId] [-dbURL dbURL] [-skipDeltaNoEntry skipDeltaNoEntry] [-passwordFile passwordFile] [-nonLangTables nonLangTables] [-langTables langTables] [-publishedOnly true | false] [-deepUnpublish true | false] [-deepSequence true | false] [-dropTempTable true | false] [-propFile propFile] [-force true | false] [-grouping true | false]
  • Linux, AIX, IBM i: di-preprocess.sh full-path -instance instance_name -dbuser dbuser -dbuserpwd dbuserpwd [-fullbuild true | false] [-localename localename] [-onelevel true | false] [-multithread true | false] [-workspace workspaceId] [-dbURL dbURL] [-skipDeltaNoEntry skipDeltaNoEntry] [-passwordFile passwordFile] [-nonLangTables nonLangTables] [-langTables langTables] [-publishedOnly true | false] [-deepUnpublish true | false] [-deepSequence true | false] [-dropTempTable true | false] [-propFile propFile]
  • WebSphere Commerce Developer: -masterCatalogId masterCatalogId [-fullbuild true | false] [-localename localename] [-onelevel true | false] [-multithread true | false] [-dbURL dbURL] [-skipDeltaNoEntry skipDeltaNoEntry] [-passwordFile passwordFile] [-nonLangTables nonLangTables] [-langTables langTables] [-publishedOnly true | false] [-deepUnpublish true | false] [-deepSequence true | false]
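When the utility is driven from a script, it can help to assemble the argument list programmatically. The following Python sketch builds an argument list for the di-preprocess.sh form of the syntax above without running it. The function name, the sample path, and the sample credentials are hypothetical; the allowed option names are copied from the Linux, AIX, and IBM i syntax line.

```python
# Optional parameters listed in the di-preprocess.sh syntax above.
KNOWN_OPTIONS = {
    "fullbuild", "localename", "onelevel", "multithread", "workspace",
    "dbURL", "skipDeltaNoEntry", "passwordFile", "nonLangTables",
    "langTables", "publishedOnly", "deepUnpublish", "deepSequence",
    "dropTempTable", "propFile",
}


def build_preprocess_command(script, full_path, instance, dbuser, dbuserpwd, **options):
    """Return the argument list for one di-preprocess run (nothing is executed)."""
    command = [script, full_path, "-instance", instance,
               "-dbuser", dbuser, "-dbuserpwd", dbuserpwd]
    for name, value in options.items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"Unknown option: {name}")
        # The utility expects the literal strings true and false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        command += [f"-{name}", str(value)]
    return command


# Hypothetical values for a delta build of the en_US locale.
command = build_preprocess_command(
    "./di-preprocess.sh",
    "/opt/WebSphere/CommerceServer80/instances/demo/search/pre-processConfig/MC_10001/DB2",
    "demo", "db2inst1", "changeme",
    fullbuild=False, localename="en_US", skipDeltaNoEntry=True,
)
print(" ".join(command))
```

The resulting list can be passed to a process launcher such as subprocess.run from the WC_installdir/bin directory.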
Verify that the output from the script contains no errors and that the last part of the output contains the following lines:
"Program exiting with exit code: 0.
Data import preprocessing completed successfully with no errors."
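In an automated job, the same verification can be scripted. The following Python sketch checks that the last two non-empty lines of captured output end with the two success messages quoted above. The function name and the sample output are hypothetical, and the exit code of the process should still be checked separately.

```python
SUCCESS_LINES = (
    "Program exiting with exit code: 0.",
    "Data import preprocessing completed successfully with no errors.",
)


def preprocessing_succeeded(output):
    """Return True if the output ends with the two expected success lines."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    tail = lines[-len(SUCCESS_LINES):]
    # endswith tolerates a prefix (for example a timestamp) on each line.
    return len(tail) == len(SUCCESS_LINES) and all(
        actual.endswith(expected) for actual, expected in zip(tail, SUCCESS_LINES)
    )


# Hypothetical captured output.
sample_ok = (
    "Earlier utility output\n"
    "Program exiting with exit code: 0.\n"
    "Data import preprocessing completed successfully with no errors.\n"
)
sample_bad = "Earlier utility output\nProgram exiting with exit code: 1.\n"

print(preprocessing_succeeded(sample_ok))
print(preprocessing_succeeded(sample_bad))
```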
Also, inspect the following file for errors:
  • WebSphere Commerce Developer: WCDE_installdir\logs\wc-dataimport-preprocess.log
  • Linux, AIX, Windows: WC_installdir\logs\wc-dataimport-preprocess.log
For more information about exit codes, see WebSphere Commerce Search utility exit codes.
To get more logging information, update the logging level from INFO to FINEST in the following file:
  • WebSphere Commerce Developer: WCDE_installdir\workspace\WC\xml\config\dataimport\logging.properties
  • Linux, AIX, Windows: WC_installdir/instances/instance_name/xml/config/dataimport/logging.properties
  • IBM i: WC_instance_root/xml/config/dataimport/logging.properties
# Default global logging level, INFO
com.ibm.commerce.level=FINEST
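If you switch logging levels often, the edit can be scripted. The following Python sketch rewrites only the com.ibm.commerce.level line in the text of a logging.properties file and leaves every other line untouched. The function name is hypothetical, and the sample content assumes that the file currently sets the level to INFO.

```python
import re


def set_logging_level(properties_text, level="FINEST"):
    """Return the properties text with com.ibm.commerce.level set to the given level."""
    pattern = r"(?m)^(com\.ibm\.commerce\.level\s*=\s*).*$"
    # Keep the key and "=" as written; replace only the value.
    return re.sub(pattern, lambda match: match.group(1) + level, properties_text)


# Assumed current content of the relevant lines.
original = "# Default global logging level, INFO\ncom.ibm.commerce.level=INFO\n"
updated = set_logging_level(original)
print(updated)
```

Calling the function again with level="INFO" restores the default after troubleshooting.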