Configuring the Data Extract utility order configuration file

Create a data extract order configuration file that the Data Extract utility uses to identify the business objects to extract and the sequence in which to extract them. This file must also identify the other configuration files that configure the utility, and the CSV or XML output files that the utility is to generate.

Procedure

  1. Go to the following directory, which contains the sample configuration files for extracting data:
    • Linux, AIX: WC_installdir/samples/DataExtract
    • Windows: WC_installdir\samples\DataExtract
    • WebSphere Commerce Developer: WCDE_installdir\samples\DataExtract
  2. Create a backup of the wc-extract-business-object.xml configuration file or files in the directory or subdirectories for the object or objects that you want to extract, where business-object is the name of the business object that you are extracting.
  3. For the object that you want to extract, open the data extract order configuration file (wc-dataextract-object.xml) for editing.
  4. In the <_config:DataLoadEnvironment> element, set the value of the configFile attribute to identify the environment configuration file. If the file is not in the same directory as the main configuration file, include the relative path to the environment configuration file.
  5. Configure the <_config:LoadOrder> element to set the value for any attributes that you want to apply to the extraction process for all load items.
    For example, you can include one of the following configurable name-value pair properties to configure the structure of any generated CSV output files.
    firstTwoLinesAreHeader
      Configures the generated CSV output files to include two lines of header information. The first line includes the keyword for the type of business object that is included in the file. The second line includes the column headings. You can include the following values for this property:
        • true: The CSV files include two lines of header information.
        • false: The CSV files do not include two lines of header information. This value is the default value.
    firstLineIsHeader
      Configures the generated CSV output files to include the column headings as a line of header information. You can include the following values for this property:
        • true: The CSV files include the column headings as a header line.
        • false: The CSV files do not include a line of header information. This value is the default value.
    If you do not include either property or set both properties to false, the generated CSV output file does not include any header information. The files include only the data records. If you include both properties set to true, the generated CSV output files include two lines of header information.
    Note: Since the Data Extract utility uses the existing Data Load utility framework, you can include the commitCount and batchSize attributes. Since the Data Extract utility does not insert any data into the database, any setting for these attributes does not affect the extract process. If you do include these attributes, set the value of commitCount to be 0 and the value of batchSize to be 0.
  6. Configure the outputDirectory variable to specify the directory where the utility is to generate the configured output files.
    This configurable variable is used within the location attribute for the <_config:DataOutputLocation> element of each load item to help quickly define the output file location. By default, this directory is named output and is in the same directory as the sample data extract order configuration file. If you want to change the name or the location of the directory, edit the following variable configuration in your sample file:
    <_config:Variable name="outputDirectory" value="output" />
    If you want to change the location of the directory, include the relative path from the order configuration file directory or the absolute path of the directory where you want the output files and any output directories to generate.
  7. Configure each <_config:LoadItem> element.
    Ensure that the value for the name attribute identifies the object that you are extracting data about. The value for each businessObjectConfigFile attribute must identify the correct business object configuration file that the utility needs to use to transform the object data. If the file is not in the same directory as the main configuration file, include the relative path to the business object configuration file.
  8. In the <_config:DataOutputLocation> element for each load item, change the value of the location attribute to identify the file name for the output file to be generated.
    If you want this file to generate in a different subdirectory in the configured output directory structure, include the relative path to the destination where you want the file to generate. This relative path is from the configured output directory.

    If you want the utility to generate XML output files instead of the default CSV files, specify the output file name with the .xml file name extension. You must also edit the business object configuration file so the utility uses the XML data writer.

  9. Save and close the configuration file.
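
Example

The following sketch shows how the settings from steps 4 through 8 might fit together in one data extract order configuration file. It is an illustration under stated assumptions, not a copy of a shipped sample: the root element and namespace, the <_config:property> element form, the ${outputDirectory} substitution syntax, the load item name, and all file names are assumptions. Compare it against the sample file in your own installation before you use any part of it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: root element and namespace follow the Data Load utility conventions. -->
<_config:DataLoadConfiguration
    xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">

  <!-- Step 4: identify the environment configuration file (file name is hypothetical). -->
  <_config:DataLoadEnvironment configFile="wc-dataextract-env.xml" />

  <!-- Step 5: attributes that apply to all load items. commitCount and batchSize
       do not affect the extract process, so both are set to 0. -->
  <_config:LoadOrder commitCount="0" batchSize="0">

    <!-- Assumption: name-value pair properties are declared with this element.
         With this value, generated CSV files start with two header lines. -->
    <_config:property name="firstTwoLinesAreHeader" value="true" />

    <!-- Step 6: output directory, relative to the directory of this file. -->
    <_config:Variable name="outputDirectory" value="output" />

    <!-- Step 7: one load item per business object (names are hypothetical). -->
    <_config:LoadItem name="CatalogGroup"
        businessObjectConfigFile="wc-extract-catalog-group.xml">
      <!-- Step 8: output file name. Assumption: the variable is referenced
           with the ${...} syntax. -->
      <_config:DataOutputLocation location="${outputDirectory}/CatalogGroups.csv" />
    </_config:LoadItem>

  </_config:LoadOrder>
</_config:DataLoadConfiguration>
```

To generate XML output for this load item instead, the location value would end with the .xml file name extension, and the business object configuration file would also need to use the XML data writer.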

What to do next

Run the Data Extract utility command. For more information about the parameters that you can configure when you run the utility from the command line, see Data Extract utility.