Data Load best practices for Catalog
The following best practices are recommended when you use the Data Load utility to load catalog data.
General
Ensure that you load catalog, category, and catalog entry objects into the store that owns the objects. When you use the Data Load utility to load catalog, category, or catalog entry objects, ensure that you specify the store that owns these objects. Specify this ownership in the data load order configuration file. When the Data Load utility creates, replaces, or deletes an object, the store owner identifier is used to resolve the object identity. Then, the Data Load utility runs the requested operation.
Catalog
<_config:property name="initAttributeDictionary" value="true" />
Catalog Entry
Use the right mediator for the purpose. The CatalogEntryMediator is general enough to load catalog entries, their descriptions, relationships, prices, calculation codes, attributes, SEO data, and other related information within one input file. However, if you are loading only a particular piece of information, say description, then use the CatalogEntryDescriptionMediator for it specifically. The specific mediators are more focused and efficient.
Use the Data Load utility in update mode to load minor changes for catalog entries and catalog entry descriptions. You can use the Data Load utility in update mode for loading only catalog entry or catalog entry description information. In update mode, the utility compares the catalog entry data in the input file with the corresponding data for the catalog entries in the database. The update mode then replaces or adds data for only the columns that are specified in the input file. All other columns remain unchanged. For more information about configuring and running the Data Load utility in update mode, see Scenario: Catalog entry update load.
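The column-level behavior of update mode can be pictured with a small Python sketch. This is only an illustration of the rule stated above (supplied columns are replaced or added, all others are untouched), not the utility's implementation. The function name `apply_update` and the sample column names are assumptions made for the example.

```python
def apply_update(db_row, input_row):
    """Return the row as it would look after an update-mode load.

    Only the columns present in input_row are replaced or added;
    every other column keeps its current database value.
    """
    updated = dict(db_row)     # start from the existing data
    updated.update(input_row)  # overwrite or add only the supplied columns
    return updated


# Illustrative data; the column names are assumptions for this sketch.
existing = {"PartNumber": "Test-PN-10001", "Name": "Old name", "ShortDescription": "Old text"}
changes = {"PartNumber": "Test-PN-10001", "Name": "New name"}
result = apply_update(existing, changes)
```

Here `result` carries the new `Name` while `ShortDescription` keeps its previous value, because the input row did not supply that column.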
To load SEO information with the Data Load utility, include the loadSEO parameter with a value set to "true". This parameter must be set within the data load order configuration file with the following format:
<_config:property name="loadSEO" value="true"/>
When you run the utility to load SEO information, you must include the Dinstance parameter and value to identify your instance. Ensure that you have the files for the instance that you specify in the WC_installdir/instances/instance_name directory and that you have access to the files. When SEO is first enabled for a store, the seourlkeywordgen utility is used to generate SEO URLs and keywords for your store. However, after this initial generation of SEO URLs and keywords, use the Data Load utility with the loadSEO parameter enabled to load SEO URL information instead of reusing the seourlkeywordgen utility. To load the SEO-related information when you are loading catalog entry data with the Data Load utility, your input files must include the SEO information along with the catalog data. For more information about structuring your catalog entry input files to include SEO information, see CatalogEntrySEO.
Also review the invalidURLCharactersList property in the infrastructure component configuration (wc-admin-component.xml) file. For more information about this property, see Configuration properties in the infrastructure component.
Use the Data Load utility to create the parent product before you load the child SKU. You can use the catalog entry load to create the parent product while you create the child SKUs. To improve performance, it is best to use data load to load the specific parent product before you load the child SKUs. By loading the parent product before you load the child SKUs, you can skip the task of caching the parent products and related attributes.
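One way to prepare input so that parent products are loaded before child SKUs is to split a combined file into two sets of rows. The Python sketch below is a preprocessing illustration only. It assumes a hypothetical input layout in which a `ParentPartNumber` column is empty for parent products and filled in for SKUs, and the function name is invented for the example.

```python
import csv
import io

# Hypothetical combined input; the column layout is an assumption.
SAMPLE = """PartNumber,Type,ParentPartNumber
Test-SKU-1,Item,Test-Product-1
Test-Product-1,Product,
Test-SKU-2,Item,Test-Product-1
"""


def split_parents_and_skus(csv_text):
    """Split rows into (parents, children) so parents can be loaded first."""
    parents, children = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A row with no parent part number is treated as a parent product.
        if row["ParentPartNumber"]:
            children.append(row)
        else:
            parents.append(row)
    return parents, children


parents, children = split_parents_and_skus(SAMPLE)
```

The `parents` rows would go into the first load and the `children` rows into a later load, whatever order they had in the original file.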
PartNumber,ParentGroupIdentifier,Sequence,Delete
Test-PN-10001,Accessory,2,1
Test-PN-10001,Pants,1,0
The second row deletes the old relationship and the third row adds the new parent-child relationship (the header line counts as the first row). The data loader configuration for this scenario:
<_config:mapping xpath="CatalogEntryIdentifier/ExternalIdentifier/PartNumber" value="PartNumber" />
<_config:mapping xpath="ParentCatalogGroupIdentifier/ExternalIdentifier/GroupIdentifier" value="ParentGroupIdentifier" />
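To make the row-by-row effect of the sample input concrete, the following Python sketch reads the same CSV and reports which operation each data row requests. It is not how the Data Load utility is implemented. It assumes that the Delete column is mapped with a delete value of "1", as in the category mapping shown later in this topic, and the names `plan_operations` and `DELETE_VALUE` are invented for the example.

```python
import csv
import io

INPUT = """PartNumber,ParentGroupIdentifier,Sequence,Delete
Test-PN-10001,Accessory,2,1
Test-PN-10001,Pants,1,0
"""

# Assumption: a value of "1" in the Delete column requests a delete.
DELETE_VALUE = "1"


def plan_operations(csv_text):
    """Return (action, part number, parent category) for each data row."""
    operations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        action = "delete" if row["Delete"] == DELETE_VALUE else "add"
        operations.append((action, row["PartNumber"], row["ParentGroupIdentifier"]))
    return operations


operations = plan_operations(INPUT)
```

For the sample input, the first data row is reported as a delete of the Accessory relationship and the second as an add of the Pants relationship.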
Specify the relationship types when you load the child catalog entries to bundle or kit relationships. When you load the child catalog entries to bundle or kit relationships, the type of relationship is an optional field in the input file. However, you can provide this field in the input file to optimize data load performance. If this field is not provided, the Data Load utility retrieves the catalog entry type from the database. Depending on the catalog entry type, the corresponding relationship type is created.
Use the mark for delete option when you delete a catalog entry. By default, deleting a catalog entry marks the catalog entry for deletion. This behavior means that the mark for delete flag of the catalog entry in the database is set to '1'. The catalog entry is not physically deleted from the database. Although you can change this default delete option to physically delete a catalog entry, it is recommended that you use the default mark for delete option. The default ensures that any order items that refer to the deleted catalog entry are not removed as a result of the database cascade delete.
Do not load catalog entries into an extended site store with part numbers that duplicate inherited catalog entry part numbers. When you are loading new catalog entries into an extended site store, your new catalog entries can have part numbers that are duplicates of the part numbers for inherited catalog entries. An extended site store catalog entry and a catalog asset store catalog entry can have the same part number because the catalog entries belong to different stores. If duplicate part numbers exist, store functions that retrieve catalog entries by only the part number can behave unexpectedly or can result in an error. For example, if a store function uses the part number to retrieve only a single catalog entry and instead finds two catalog entries with the same part number, an error can occur. Ensure that the part numbers for the catalog entries that you are loading do not exist for any inherited catalog entries. If your extended site store does include duplicate part numbers, you can use the Catalogs tool to change the part numbers for your extended site store catalog entries.
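A simple check before loading can catch duplicate part numbers. The Python sketch below assumes that you already have two lists available: the part numbers in your input file and the part numbers of the inherited catalog entries. How you obtain the inherited list is outside this sketch, and the function name is invented for the example.

```python
def find_duplicate_part_numbers(new_part_numbers, inherited_part_numbers):
    """Return the sorted part numbers that appear in both collections."""
    inherited = set(inherited_part_numbers)
    return sorted(pn for pn in set(new_part_numbers) if pn in inherited)


# Illustrative data only.
to_load = ["Test-PN-10001", "Test-PN-10002", "Test-PN-10003"]
inherited = ["Test-PN-10002", "Test-PN-20001"]
duplicates = find_duplicate_part_numbers(to_load, inherited)
```

Any part number returned in `duplicates` should be changed in the input file before the load runs.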
Delete the catalog entry to delete UserData. To remove UserData, delete the entire catalog entry that contains the UserData. You can also load blank fields to the UserData tables.
Category (catalog group)
GroupIdentifier,ParentGroupIdentifier,Sequence,Delete
Accessory,Womens Fashions,2,1
Accessory,Mens Fashions,3,0
The data loader configuration for this scenario:
<_config:mapping xpath="CatalogGroupIdentifier/ExternalIdentifier/GroupIdentifier" value="GroupIdentifier" />
<_config:mapping xpath="topCatalogGroup" value="TopGroup" />
<_config:mapping xpath="ParentCatalogGroupIdentifier/ExternalIdentifier/GroupIdentifier" value="ParentGroupIdentifier" />
<_config:mapping xpath="displaySequence" value="Sequence" />
<_config:mapping xpath="" value="Delete" deleteValue="1"/>
To load SEO information with the Data Load utility, include the loadSEO parameter with a value set to "true". This parameter must be set within the data load order configuration file with the following format:
<_config:property name="loadSEO" value="true"/>
When you run the utility to load SEO information, you must include the Dinstance parameter and value to identify your instance. When SEO is first enabled for a store, the seourlkeywordgen utility is used to generate SEO URLs and keywords for your store. However, after this initial generation of SEO URLs and keywords, use the Data Load utility with the loadSEO parameter enabled to load SEO URL information instead of reusing the seourlkeywordgen utility. To load the SEO-related information when you are loading category data with the Data Load utility, your input files must include the SEO information along with the catalog data. For more information about structuring your category input files to include SEO information, see CatalogGroupSEO.
Also review the invalidURLCharactersList property in the infrastructure component configuration (wc-admin-component.xml) file. For more information about this property, see Configuration properties in the infrastructure component.
When you load category data and the utility generates an SEO URL keyword, the utility can generate a different SEO URL keyword if a duplicate keyword is encountered. When the utility generates an SEO URL keyword for a category, the utility first uses the category name as the SEO URL keyword. If the keyword is already used by another category, the utility generates a different keyword with the category name and identifier. If that keyword is still not unique, the utility then generates a keyword with the category name, identifier, and language ID. For example, if you are loading data for a category "Shirts", the utility first attempts to generate the SEO keyword "Shirts". If another category already uses this keyword, the utility then attempts to generate a keyword that also includes the category identifier, such as 10001. If this alternate keyword, "Shirts10001", is also used by another category, the utility then includes the language ID, "-1", to generate the keyword, "Shirts10001-1". For more information about generating SEO URL keywords when duplicate keywords exist, see Creating descriptive storefront URLs when duplicate keywords exist.
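The fallback order for duplicate category keywords can be sketched in a few lines of Python. This only mirrors the sequence described above (name, then name and identifier, then name, identifier, and language ID). It is not the utility's code, the function name is invented, and returning `None` when all three candidates are taken is an assumption, because the text does not say what happens in that case.

```python
def generate_seo_keyword(name, identifier, language_id, used_keywords):
    """Pick the first candidate keyword that no other category uses."""
    candidates = [
        name,                                # category name only
        f"{name}{identifier}",               # name + category identifier
        f"{name}{identifier}{language_id}",  # name + identifier + language ID
    ]
    for candidate in candidates:
        if candidate not in used_keywords:
            return candidate
    # Assumption: the described sequence ends here, so report no keyword.
    return None


first = generate_seo_keyword("Shirts", "10001", "-1", set())
second = generate_seo_keyword("Shirts", "10001", "-1", {"Shirts"})
third = generate_seo_keyword("Shirts", "10001", "-1", {"Shirts", "Shirts10001"})
```

With the values from the example, the three calls give "Shirts", "Shirts10001", and "Shirts10001-1" in turn.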
Load data for multiple languages in separate files. Even though data load supports loading categories with multi-language descriptions in one input file, it is recommended to load each language in its own input file. Input files for different languages can use different encoding settings, and separate files are easier to manage. For example, double-byte languages like Chinese have different encoding settings.
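If your source data arrives as one multi-language file, a small preprocessing step can split it into one input file per language. The Python sketch below groups rows by a language column and returns the CSV text for each language. The `LanguageId` column name, the sample language IDs, and the function name are assumptions for the example. Each returned text can then be written to its own file with whatever encoding that language needs.

```python
import csv
import io
from collections import defaultdict

# Hypothetical multi-language input; column names and IDs are assumptions.
INPUT = """GroupIdentifier,LanguageId,Name
Shirts,-1,Shirts
Shirts,-7,衬衫
Pants,-1,Pants
"""


def split_by_language(csv_text, language_column="LanguageId"):
    """Return {language id: CSV text containing only that language's rows}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows_by_language = defaultdict(list)
    for row in reader:
        rows_by_language[row[language_column]].append(row)

    files = {}
    for language_id, rows in rows_by_language.items():
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
        writer.writeheader()  # every per-language file keeps the header row
        writer.writerows(rows)
        files[language_id] = out.getvalue()
    return files


files = split_by_language(INPUT)
```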
Do not use the Data Load utility to load category hierarchy changes for linked categories. The Data Load utility, and the Catalog Upload feature in Management Center, cannot handle loading hierarchy data for linked categories. If you use the Data Load utility or Catalog Upload to change the hierarchy for a category, the load process does not synchronize the data for any linked categories. For example, if you delete a child category or change the parent category, the changes are not reflected in any linked categories. The linked categories continue to have the original hierarchy. If the linked categories are not updated with the changes separately, you can encounter errors when you browse the categories on your store. You can use the Data Load utility or Catalog Upload to add or remove catalog entries from a category. When you load changes to the catalog entry assignments for a category, the load process does synchronize the changes across any linked categories.
Attribute
Use the attribute dictionary for attributes. The attribute dictionary facilitates data sharing and takes up fewer rows in the database. For more information, see Attribute dictionary. When you use the attribute dictionary, it is recommended that you follow the best practices. For more information, see Best practices for using the attribute dictionary.
Load attributes and allowed values together. Attributes and allowed values can be loaded together or separately. For simplicity and manageability, it is recommended to load them together in one input file. The separate attribute value load is intended for more granular loading. Use this separate load when you must load details (such as Field1, Field2, Field3, Image1, Image2) for each individual attribute value.
Load catalog entry and attribute relationships by loading the SKU and attribute value relationship. After you load the attributes and allowed values into the attribute dictionary, load the relationship between the product SKU item and the attribute value directly. The relationship between product and attribute is automatically handled by the data load mediator.
Configure the Data Load utility to reuse assigned values. You can enable the Data Load utility to share attribute assigned values when the same value is needed for multiple catalog entries. By sharing attribute assigned values across catalog entries, you can reduce the number of duplicate values that the utility creates in the database. Reducing the number of duplicate values can improve the performance of retrieving attribute information from the database. When you enable the utility to reuse assigned values, the utility creates only the first instance of a value that is included in the input file. The utility then reuses that value for all other instances where the utility is loading the same value for an attribute. For more information about enabling the reuse of assigned attribute values, see Reuse attribute assigned values with the Data Load utility.
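The effect of reusing assigned values can be illustrated with a short Python sketch. It models the idea described above, where only the first instance of a value is created and later rows point at it. It does not reflect the utility's internals or database schema, and the function name, ID scheme, and sample data are invented for the example.

```python
def assign_values(rows):
    """rows: iterable of (part number, attribute, value) tuples.

    Returns the stored values and the per-entry assignments that
    reference them.
    """
    value_ids = {}    # (attribute, value) -> id of the single stored value
    assignments = []  # (part number, value id)
    for part_number, attribute, value in rows:
        key = (attribute, value)
        if key not in value_ids:
            # Create the value only the first time it is seen.
            value_ids[key] = len(value_ids) + 1
        assignments.append((part_number, value_ids[key]))
    return value_ids, assignments


# Illustrative data only.
rows = [
    ("Test-PN-10001", "Color", "Blue"),
    ("Test-PN-10002", "Color", "Blue"),
    ("Test-PN-10003", "Color", "Red"),
]
value_ids, assignments = assign_values(rows)
```

Three catalog entries are assigned values, but only two values are stored, because both "Blue" rows share one stored value.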