webFeedLoad best practices
When you run the webFeedLoad utility, follow these best practices.
- Processing large feeds
If you are processing more than 1000 entries in a feed whose content is stored as managed files, ensure that the EAR updater job, ScheduledContentManagedFileEARUpdateCmdImpl, is stopped before the feed retriever scheduled job, FeedDataloadSchedulerCmd, runs.
- Changing data load configuration files
The feed retriever generates a set of data load configuration files. To tune these generated files for performance or other reasons:
- Run the webFeedLoad utility with the parameter -DGenerateDataLoadConfigOnly=true. This option generates the data load configuration files but does not process the feed.
- Ensure that you follow the best practices described for Data Load configuration files when you modify the generated files.
- If the feed configuration has not changed, set the -DGenerateDataLoadConfigOnly parameter to false for subsequent runs of feed retrieval.
- Delta feed retrieval and processing
If your database is large, set the ID resolver cache size to 0 for small delta loads. For example, in the wc-dataload-env.xml file, set the cacheSize attribute of the ID resolver to 0:
<_config:IDResolver className="com.ibm.commerce.foundation.dataload.idresolve.IDResolverImpl" cacheSize="0" />
- Running the feed retrieval
Ensure that the FeedDataloadSchedulerCmd scheduled job and the webFeedLoad utility do not run simultaneously. If you run the webFeedLoad batch script manually, ensure that it finishes before the scheduled job for FeedDataloadSchedulerCmd begins.
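One way to apply this guidance is to check, before you start a manual run, whether the batch script is likely to finish ahead of the next scheduled job. The following Python sketch is illustrative only and is not part of the product. The function name can_start_batch, the estimated run time, and the safety margin are all assumptions that you supply yourself; the sketch does not read the scheduler configuration.

```python
from datetime import datetime, timedelta


def can_start_batch(now: datetime,
                    estimated_runtime: timedelta,
                    next_scheduled_start: datetime,
                    safety_margin: timedelta = timedelta(minutes=10)) -> bool:
    """Return True if a batch run started now should finish, with margin,
    before the scheduled job is due to begin.

    All inputs are hypothetical values supplied by the operator: the
    estimated run time comes from earlier runs, and the next scheduled
    start comes from however the scheduled job was configured.
    """
    expected_finish = now + estimated_runtime + safety_margin
    return expected_finish <= next_scheduled_start


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 1, 0)
    scheduled = datetime(2024, 1, 1, 2, 0)
    # A 30-minute run plus a 10-minute margin ends at 01:40, before 02:00.
    print(can_start_batch(now, timedelta(minutes=30), scheduled))
    # A 55-minute run plus a 10-minute margin ends at 02:05, after 02:00.
    print(can_start_batch(now, timedelta(minutes=55), scheduled))
```

If the check returns False, delay the manual run until after the scheduled job completes rather than risk an overlap.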