Configuring the XML data reader
Configure the extensible markup language (XML) data reader in the business object configuration file to modify the way that data is read from XML formatted source files. You might want to change the default settings of the XML data reader to better work with the format of your data.
The data that is read from an XML file can be mapped to a WebSphere Commerce business object by using a business object configuration file. Using the configuration file, each element of data in the input XML file can be mapped directly to a property of a WebSphere Commerce business object. The data reader's XML handler reads the input one record at a time, creates a name-value pair (NVP) mapping for each record, and passes each mapping to a business object builder.
Procedure
- Locate the wc-loader-<object>.xml business object configuration file for the business object type that you are loading. Open the configuration file for editing.
Sample business object configuration files are in the following directory:
- WC_installdir/samples/DataLoad/Catalog
- WCDE_installdir/samples/DataLoad/Catalog
- Find the data reader configuration element:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" > </_config:DataReader>
- Optional:
Add handler classes within the data reader configuration element to change how the Data Load utility handles loading your XML data.
To add an XML handler class, specify the class in the following format:
<_config:XmlHandler className="" />
For example, the following configuration adds the NVPXmlHandler XML handler class to the data reader configuration:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:XmlHandler className="com.ibm.commerce.foundation.dataload.xmlhandler.NVPXmlHandler" />
</_config:DataReader>
The following class is available by default:
- NVPXmlHandler
- This handler class is the default handler for the XmlReader and is used to
handle generic XML data that follows a specific CSV-like file format. This handler reads each
second-level element as a separate object record. This handler parses your input file one object
record at a time and generates a hash map for each record that is then passed to the business object
builder. The key of this map is the element or attribute name for the objects that you are loading
of a particular business object type. You can modify this default behavior by specifying the
following parameters: xpathEnabled, qualifiedName, and nvpReMapping. See the detailed descriptions
of how to use these optional properties in the following step.
If you do not specify an XML handler in your data reader configuration, this handler is used. All data load configuration files that are used for loading CSV input files can be used to load XML input files. The Data Load framework switches the data reader that is used automatically depending on the file type, either CSV or XML, of the input file.
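As an illustration of the default record handling described for NVPXmlHandler, the following Python sketch splits an input document into one map per second-level element. It is not the handler's actual implementation: the function name `read_records`, the helper `_add`, and the root element name `CatalogEntries` are hypothetical, and the sketch assumes that leaf elements with empty text are skipped and that repeated names collect their values in a list.

```python
import xml.etree.ElementTree as ET


def _add(nvp, key, value):
    # Repeated names keep every value by switching to a list.
    if key not in nvp:
        nvp[key] = value
    elif isinstance(nvp[key], list):
        nvp[key].append(value)
    else:
        nvp[key] = [nvp[key], value]


def read_records(xml_text):
    """Yield one name-value map per second-level element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    for record in root:  # children of the root are the second-level elements
        nvp = {}
        for elem in record.iter():  # the record element and all of its descendants
            for name, value in elem.attrib.items():
                _add(nvp, name, value)  # attribute name is the key
            text = (elem.text or "").strip()
            if elem is not record and len(elem) == 0 and text:
                _add(nvp, elem.tag, text)  # leaf element name is the key
        yield nvp


SAMPLE = """
<CatalogEntries>
  <CatalogEntry catalogEntryTypeCode="ProductBean">
    <PartNumber>productPartNumber-1</PartNumber>
  </CatalogEntry>
  <CatalogEntry catalogEntryTypeCode="ItemBean">
    <PartNumber>itemPartNumber-1</PartNumber>
  </CatalogEntry>
</CatalogEntries>
"""

records = list(read_records(SAMPLE))
```

Each entry in `records` corresponds to one `<CatalogEntry>` element, which mirrors how each record map is passed separately to the business object builder.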
- Optional:
Add configuration properties within the data reader configuration element to meet your data loading requirements.
To add a configuration property, specify the property in the following format:
<_config:property name="" value="" />
For example, the following configuration adds the recordXpath configuration property for a catalog entry to the data reader configuration:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:property name="recordXpath" value="CatalogEntry" />
</_config:DataReader>
The following optional properties are available for use with the default NVPXmlHandler class:
- recordXpath
- If your input file has the object element nested deeply, you can set the XPath to have the
handler start reading the nested object element as the root element. When you specify this property,
any XML element that has a value that matches the XPath value of this property is handled by the
Data Load utility as a separate record. If you do not specify this property, only the second-level
XML elements are handled as individual object records.
Specify the value for this parameter as an XPath. The XPath can be absolute or relative. An XPath is absolute if it starts with the forward slash /. A relative XPath is just a single element name. For example, you can specify the following absolute XPath:
<_config:property name="recordXpath" value="/Object/ObjectType/CatalogEntry" />
or the following relative XPath:
<_config:property name="recordXpath" value="CatalogEntry" />
Either XPath ensures that the <CatalogEntry> element in the following sample is read as the record element:
<Object>
  <ObjectType>
    <CatalogEntry>
      <PartNumber>productPartNumber-1</PartNumber>
    </CatalogEntry>
  </ObjectType>
</Object>
The other elements, <Object> and <ObjectType>, are ignored.
- xpathEnabled
- If your element names are not unique, you can use this property to use the XPath to create uniqueness in the NVP mapping. If you specify this property with a value of true, the key for mapping your data during the Data Load process uses the XPath to the element. If this value is false, the key for mapping your data is the element name or attribute name. The XPath that is used is relative to your record element. The default value for this property is false.
Note: If you set this property to true, you must also change the value for the mapping of your object in the data load business object configuration file.
For example, if your input file contains the following catalog entry element:
<CatalogEntry catalogEntryTypeCode="ProductBean" displaySequence="1.0">
  <PartNumber>productPartNumber-1</PartNumber>
  <Description>
    <Name>name-1</Name>
  </Description>
</CatalogEntry>
If you set xpathEnabled to true, the XML handler builds the following mapping:
catalogEntryTypeCode = ProductBean
displaySequence = 1.0
PartNumber = productPartNumber-1
Description/Name = name-1
The keys in the mapping are XPaths that are always relative to the root of your record element, CatalogEntry, and do not start with the forward slash /. An attribute is treated like an element in the XPath.
- nvpReMapping
- This property controls how to redo the NVP mapping of your data that is passed for an object record to the business object builder. The value of this property defines a list of remapping rules for your data. If the elements that contain information for your object contain names that are not unique, you can use this configuration property to ensure uniqueness. For example, within a catalog entry, the element <name>shirt</name> for the catalog entry and the element <name>color</name> for an attribute can both exist. The XML handler reads the values for these elements as two values for a single name element and records these values as a list in the NVP mapping, name=[shirt, color]. By remapping the XPath for these elements, you can ensure that the handler reads and maps these elements and values correctly.
Your list of NVP remapping rules must have each rule separated by a '|' character. Each rule contains three tokens that are separated by a comma ',' character. The first token identifies the entry that supplies the new keys in the remapping, the second token identifies the entry that supplies the new values, and the third token is the prefix for the new keys.
For example, if your input file contains the following catalog entry elements:
<CatalogEntry>
  <CatalogEntryIdentifier>
    <ExternalIdentifier>
      <PartNumber>productPartNumber-1</PartNumber>
    </ExternalIdentifier>
  </CatalogEntryIdentifier>
  <Description>
    <Attributes name="auxDescription1">auxDesc1-1</Attributes>
    <Attributes name="auxDescription2">auxDesc2-1</Attributes>
    <Attributes name="published">1</Attributes>
  </Description>
</CatalogEntry>
The handler class, by default, reads the XPath for the following description elements:
name=[auxDescription1, auxDescription2, published], Attributes=[auxDesc1-1, auxDesc2-1, 1]
The handler maps these elements as two elements:
name=[auxDescription1, auxDescription2, published]
Attributes=[auxDesc1-1, auxDesc2-1, 1]
If you set the remapping configuration property to be:
<_config:property name="nvpReMapping" value="name, Attributes, " />
The handler reads the elements as three separate elements and maps these elements as:
auxDescription1 = auxDesc1-1
auxDescription2 = auxDesc2-1
published = 1
If you specify the remapping rule that contains the prefix:
<_config:property name="nvpReMapping" value="name, Attributes, Description/Attributes/name/" />
These elements are read and mapped as:
Description/Attributes/name/auxDescription1 = auxDesc1-1
Description/Attributes/name/auxDescription2 = auxDesc2-1
Description/Attributes/name/published = 1
Note: If you do change the NVP mapping for an object, you must also change the value for the mapping of your object in the data load business object configuration file. For example, to map this data to use the remapping rules, your business object configuration mapping can be:
<_config:mapping xpath="Description/Attributes/auxDescription1" value="Description/Attributes/name/auxDescription1" />
<_config:mapping xpath="Description/Attributes/auxDescription2" value="Description/Attributes/name/auxDescription2" />
<_config:mapping xpath="Description/Attributes/published" value="Description/Attributes/name/published" />
The value prefix Description/Attributes/name is optional. If you do not use the prefix, your mapping can resemble:
<_config:mapping xpath="Description/Attributes/auxDescription1" value="auxDescription1" />
<_config:mapping xpath="Description/Attributes/auxDescription2" value="auxDescription2" />
<_config:mapping xpath="Description/Attributes/published" value="published" />
- qualifiedName
- The qualified name is used to ensure the uniqueness of the data elements that you are loading. This uniqueness is achieved by including the namespace as part of the name for your element data in the NVP mapping. Specify this property value as true to include the namespace as part of the key to the map that is passed to your business object builder. The default value is false.
Note: If you set this property to true, you must also change the value for the mapping of your object in the data load business object configuration file.
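The key rules described for xpathEnabled and nvpReMapping can be made concrete with a short Python sketch. It only models the documented inputs and outputs and is not the NVPXmlHandler implementation: the function names `record_to_nvp` and `apply_remapping` are hypothetical, and the sketch assumes that the entries consumed by a remapping rule are removed from the map and that every rule has exactly three comma-separated tokens.

```python
import xml.etree.ElementTree as ET


def record_to_nvp(record_xml, xpath_enabled=False):
    """Build the name-value map for one record element (hypothetical helper)."""
    nvp = {}

    def add(key, value):
        # Repeated keys collect their values in a list, e.g. name=[shirt, color].
        if key not in nvp:
            nvp[key] = value
        elif isinstance(nvp[key], list):
            nvp[key].append(value)
        else:
            nvp[key] = [nvp[key], value]

    def walk(elem, path):
        # path is the XPath of elem relative to the record element ("" for the record).
        for name, value in elem.attrib.items():
            xpath = f"{path}/{name}" if path else name  # attribute treated like an element
            add(xpath if xpath_enabled else name, value)
        text = (elem.text or "").strip()
        if path and len(elem) == 0 and text:
            add(path if xpath_enabled else elem.tag, text)
        for child in elem:
            walk(child, f"{path}/{child.tag}" if path else child.tag)

    walk(ET.fromstring(record_xml.strip()), "")
    return nvp


def apply_remapping(nvp, rules):
    """Apply '|'-separated rules of the form 'keyName, valueName, prefix'."""
    result = dict(nvp)
    for rule in rules.split("|"):
        key_name, value_name, prefix = (token.strip() for token in rule.split(","))
        keys, values = result.pop(key_name), result.pop(value_name)
        keys = keys if isinstance(keys, list) else [keys]
        values = values if isinstance(values, list) else [values]
        for key, value in zip(keys, values):
            result[prefix + key] = value  # prefix may be empty
    return result


ENTRY = """
<CatalogEntry>
  <Description>
    <Attributes name="auxDescription1">auxDesc1-1</Attributes>
    <Attributes name="auxDescription2">auxDesc2-1</Attributes>
    <Attributes name="published">1</Attributes>
  </Description>
</CatalogEntry>
"""

default_map = record_to_nvp(ENTRY)
remapped = apply_remapping(default_map, "name, Attributes, Description/Attributes/name/")
```

Calling `record_to_nvp` with `xpath_enabled=True` produces keys such as `Description/Attributes` instead of `Attributes`, which is why the mapping values in the business object configuration file must change together with these properties.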
Working with element and attribute values:
You can use either elements or attributes to add data to be loaded. Typically, data is loaded the same way with either method. However, data is loaded differently when the value is empty.
By default, all elements with empty values are treated as null. However, attributes with empty values are treated as empty values. That is, the value is null in the database if you use an element for Name, and the value is empty in the database if you use an attribute for Name. This default behavior can be changed by using the following optional configuration properties. For more information, see Creating data in XML format.
- ignoreEmptyElementText
- If set to false, empty elements are treated as empty values. The default value is true.
- ignoreEmptyAttributeValue
- If set to true, empty attribute values are treated as null. The default value is false.
Specify these properties within the <DataReader> element, <LoadItem> element, or <LoadOrder> element as:
<_config:property name="ignoreEmptyElementText" value="false" />
- Optional:
Configure your Data Load process to include a data reader preprocess.
To configure a preprocessor to run, specify the preprocessor class in the following format:
<_config:DataReaderPreprocessor className="" />
For example, the following configuration specifies that a file difference preprocessor is to run:
<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
  <_config:DataReaderPreprocessor className="com.ibm.commerce.foundation.dataload.datareader.XmlFileDiffPreprocessor" />
</_config:DataReader>
The following data reader preprocessor is available for use with the Data Load utility:
- com.ibm.commerce.foundation.dataload.datareader.XmlFileDiffPreprocessor
- This preprocessor compares a specified old and new input file and generates a new file that contains only the differences that exist in the new file. This preprocessor can improve the performance of routine Data Load operations by avoiding loading data that was loaded previously. For more information about this preprocessor, see Data Load file difference preprocessing. If you are running this preprocessor, you can also include more configuration properties specific to this preprocessor. For more information about configuring this preprocessor, and the configuration properties available for this preprocessor, see Configuring the Data Load utility to run a file difference preprocess.
- Save and close your file.