Data extraction utility for dynamic recommendations
The Intelligent Offer data extraction utility is a command-line utility that you can use to create the Enterprise Product Report (EPR) data for dynamic recommendations that is required by IBM Product Recommendations. The utility extracts catalog data from your database and generates ECDF and EPCMF files in the correct format to load into IBM Product Recommendations. You can provide these two files to IBM Product Recommendations regularly for processing dynamic recommendations.
- EPCMF (Enterprise Product Content Mapping File): This file contains data that represents catalog entries, that is, products that can be bought, pre-built kits, and dynamic kits for a store. This file also specifies the master catalog category to which the catalog entry belongs.
- ECDF (Enterprise Category Definition File): This file contains data that represents the master catalog category hierarchies for a store.
Sample of the generated EPCMF file
This sample shows the catalog entry data that the utility extracts for the EPCMF file:
This file contains up to 55 columns:
- The first five columns contain mandatory data that IBM Product Recommendations requires:
  - File date: The date that the utility created the CSV file, in YYYYMMDD format.
  - Client ID: The IBM Digital Analytics client ID.
  - Item ID: The part number of the catalog entry.
  - Item: The name of the catalog entry.
  - Items Primary Category ID: The master catalog category to which the catalog entry belongs.
- The remaining 50 columns are for customer-defined static attributes for catalog entries. Data mappings for the first six of these static attribute columns are predefined to contain specific catalog entry data, but you can change the predefined contents. For more information, see the data mapping descriptions in ../refs/rmtepcmfsample.html.
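The column layout described above can be illustrated with a minimal Python sketch. It is not the utility's implementation and does not reproduce its actual output: the function names, the sample values, and the choice of a comma delimiter with default CSV quoting are assumptions made for illustration only.

```python
import csv
import io
from datetime import date

MAX_STATIC_ATTRIBUTES = 50  # customer-defined static attribute columns


def build_epcmf_row(file_date, client_id, part_number, name, category_id, attributes=()):
    """Return one row: five mandatory columns followed by static attributes."""
    if len(attributes) > MAX_STATIC_ATTRIBUTES:
        raise ValueError(
            f"at most {MAX_STATIC_ATTRIBUTES} static attributes are allowed, "
            f"got {len(attributes)}"
        )
    # File date is written in YYYYMMDD format.
    mandatory = [file_date.strftime("%Y%m%d"), client_id, part_number, name, category_id]
    if any(value == "" for value in mandatory):
        raise ValueError("the five mandatory columns must not be empty")
    return mandatory + list(attributes)


def write_rows(rows):
    """Serialize rows as CSV text (delimiter and quoting are assumptions)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample values, not real catalog data.
row = build_epcmf_row(
    date(2024, 1, 15), "90000000", "SKU-1001", "Coffee maker", "10001",
    ["Brand A", "49.99"],
)
print(write_rows([row]), end="")
```

A check of this kind can be useful when you customize the static attribute mappings, because it flags rows that exceed the 55-column limit before the file is submitted.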
Sample of the generated ECDF file
This sample shows the catalog hierarchy data that the utility extracts for the ECDF file:
- File date: The date that the utility created the CSV file, in YYYYMMDD format.
- Client ID: The IBM Digital Analytics client ID.
- Category ID: The category identifier.
- Category Name: The name of the category.
- Parent Category ID: The category identifier of the parent category.
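As a rough illustration of the five ECDF columns, the following Python sketch turns a small category hierarchy into rows. The function name, the input structure, the sample categories, and the convention of writing an empty parent column for a top-level category are all assumptions, not documented behavior of the utility.

```python
from datetime import date


def build_ecdf_rows(file_date, client_id, categories):
    """Return rows for a mapping of category ID -> (name, parent ID or None)."""
    stamp = file_date.strftime("%Y%m%d")  # YYYYMMDD
    rows = []
    for category_id, (name, parent_id) in categories.items():
        # Every non-root category must point at a category that exists.
        if parent_id is not None and parent_id not in categories:
            raise ValueError(
                f"category {category_id} references unknown parent {parent_id}"
            )
        # Assumption: a top-level category gets an empty parent column.
        rows.append([stamp, client_id, category_id, name, parent_id or ""])
    return rows


# Hypothetical two-level hierarchy.
categories = {
    "10001": ("Kitchen", None),
    "10002": ("Coffee makers", "10001"),
}
rows = build_ecdf_rows(date(2024, 1, 15), "90000000", categories)
for row in rows:
    print(",".join(row))
```

Validating parent references in this way helps to confirm that each Items Primary Category ID used in the EPCMF file can be resolved against the hierarchy in the ECDF file.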
Configuration files for the data extraction utility
The IBM Product Recommendations data extraction utility uses three types of configuration files. Samples are provided, but you must update the samples with configuration information specific to your site and environment. These configuration files are based on the Data Load utility configuration files, but they include some extensions.
- wc-dataextract.xml: This file is the main configuration file that you must point to when you run the utility. This file specifies the paths to the environment configuration file and to the business object configuration file.
- wc-dataextract-env.xml: This file is the environment configuration file. You must configure the language of the store and the currency for the price data before you run the utility.
- wc-dataextract-business_object.xml: This file is the business object configuration file. For this utility, you need two versions of this file:
  - wc-dataextract-catalog-entry.xml: This business object configuration file is used to extract catalog entry data for the EPCMF file.
  - wc-dataextract-catalog-group.xml: This business object configuration file is used to extract category data for the ECDF file.
Each business object configuration file defines the following information:
- Business context information.
- Data mappings that are required to transform HCL Commerce business objects to the data that is written to columns in the EPCMF or ECDF file. The EPCMF file supports up to 50 customer-defined static attributes for catalog entries.
- Definitions for the order that the utility writes the data to the columns in the file.
- Pointers to interfaces and implementation classes that the utility uses.
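Before running the utility, it can help to confirm that the configuration files named above are in place. The following Python sketch is one way to do that. It assumes, purely for illustration, that all four files sit in a single directory; in practice wc-dataextract.xml specifies the paths to the other files, so adjust the check to your layout. The function name and the `config` directory are hypothetical.

```python
from pathlib import Path

# File names taken from the configuration file descriptions above.
REQUIRED_CONFIG_FILES = (
    "wc-dataextract.xml",
    "wc-dataextract-env.xml",
    "wc-dataextract-catalog-entry.xml",
    "wc-dataextract-catalog-group.xml",
)


def missing_config_files(config_dir):
    """Return the required configuration file names absent from config_dir."""
    directory = Path(config_dir)
    return [name for name in REQUIRED_CONFIG_FILES if not (directory / name).is_file()]


# Hypothetical directory; replace with the location of your updated samples.
missing = missing_config_files("config")
if missing:
    print("Missing configuration files:", ", ".join(missing))
else:
    print("All configuration files found.")
```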
Using the utility in different environments
The data extraction utility for IBM Product Recommendations can be run in the staging and production environments. However, run the utility in an environment that has all of the required information. For example, if the staging environment does not have inventory or pricing information, run the utility in the production environment.
You can generate the CSV files in your staging environment to load into your IBM Product Recommendations test environment. You can also generate the CSV files in your production environment to load into your IBM Product Recommendations production environment. The utility is not intended to be run in the development environment. Support is provided in the development environment with a Derby database for customization purposes only, for example, when you are testing changes to the business object configuration file to include custom catalog entry attributes for the EPCMF file.