Index Load configuration files for indexing from CSV files
Loading the index from a CSV file
Follow these steps to load index information from a CSV file.
- Edit the wc-indexload-profileName.xml configuration file to add the CSV file location and the target core name.
- In wc-businessObject-profile.xml, specify CSVReader as the reader and SolrIndexLoadMapObjectBuilder as the business object builder.
Index Load configuration file | Data Load definition file |
---|---|
Environment configuration file (wc-indexload-env.xml) | wc-dataload.xsd |
Profile configuration file (wc-indexload-profileName.xml) | wc-dataload-env.xsd |
Profile item configuration file (wc-indexload-businessobject.xml) | wc-dataload-businessobject.xsd |
Environment configuration file (wc-indexload-env.xml)
The wc-indexload-env.xml file contains environment control information and global properties that are required by Index Load, including a common data writer and data source to be used to persist the data.
The wc-indexload-env.xml file does not typically require customization. You can use the default sample file as-is.
Profile configuration file (wc-indexload-profileName.xml)
The wc-indexload-profileName.xml file contains configurable performance attributes and load item configurations.
The profile name that you define in the configuration files is passed as a URL parameter when you call Index Load from a web browser.
The load item configurations are listed under the load order section of this file. They are processed in the same order as they are specified.
The file can contain one or more LoadItem definitions. Each LoadItem specifies its own configuration and its target coreName. Multiple LoadItems run in parallel, in no particular sequence.
- batchSize
- The threshold when documents are soft committed in memory.
- commitCount
- The threshold when documents are hard committed to disk from memory.
- ThreadLaunchTimeDelay
- The amount of time, in milliseconds, to wait before starting another thread. The delay avoids overloading the system at startup.
- OptimizeAfterIndexing
- Indicates whether Index Load performs index optimization after commit. Note: Performing optimization after full indexing improves runtime performance, but it increases the overall indexing time.
- StatusRefreshInterval
- The maximum amount of time, in seconds, to wait before refreshing the current Index Load status and displaying it in the administrative log.
- DocumentSizeSamplingInterval
- The time interval in seconds to calculate the size of the indexed document. Use -1 to disable the service. The default value is 300.
- IndexHeightCacheHint
- A numeric hint that the system uses to size the applicable caches for index height during indexing.
- IndexWidthCacheHint
- A numeric hint that the system uses to size the applicable caches for index width during indexing.
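To make the relationship between batchSize and commitCount concrete, the following Python sketch simulates when commits would fire during a load. It is an illustration only and is not part of Index Load: the function name `plan_commits` is hypothetical, and the sketch assumes that both thresholds count documents and that a hard commit takes the place of a soft commit when both thresholds are reached on the same document.

```python
def plan_commits(total_docs, batch_size, commit_count):
    """Return (document_number, commit_kind) events for a simulated load.

    Assumption: both thresholds are document counts, and a hard commit
    supersedes a soft commit when both fall on the same document.
    """
    events = []
    for n in range(1, total_docs + 1):
        if n % commit_count == 0:
            events.append((n, "hard"))   # flushed from memory to disk
        elif n % batch_size == 0:
            events.append((n, "soft"))   # committed in memory only
    return events


if __name__ == "__main__":
    for doc_number, kind in plan_commits(total_docs=10, batch_size=2, commit_count=6):
        print(f"after document {doc_number}: {kind} commit")
```

With these illustrative values, soft commits occur after documents 2, 4, 8, and 10, and a single hard commit occurs after document 6. A smaller batchSize makes documents visible sooner at the cost of more frequent in-memory commits, while commitCount controls how often the work is persisted to disk.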
Profile item configuration file (wc-indexload-external-price.xml)
<_config:LoadItem name="ExternalPrice-1" businessObjectConfigFile="wc-indexload-external-price.xml">
  <_config:property name="coreName" value="MC_10001_CatalogEntry_Price_generic" />
  <_config:DataSourceLocation location="C:\Patches\delta.csv" />
</_config:LoadItem>
Where:
- coreName
- The name of the extension core.
- DataSourceLocation
- The location of the CSV data file.
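As a quick check before a load runs, it can help to confirm that a LoadItem entry names the core and CSV file that you expect. The following Python sketch reads the LoadItem fragment shown above and extracts those values. It is not an Index Load utility: the function `read_load_item` is hypothetical, and because the fragment does not show the declaration for the `_config` prefix, the sketch wraps it in a root element with a placeholder namespace URI.

```python
import xml.etree.ElementTree as ET

# Placeholder URI: the real namespace bound to "_config" is declared elsewhere
# in the profile configuration file and is not shown in this topic.
NS = "urn:example:config"

FRAGMENT = r'''<_config:LoadItem name="ExternalPrice-1" businessObjectConfigFile="wc-indexload-external-price.xml">
  <_config:property name="coreName" value="MC_10001_CatalogEntry_Price_generic" />
  <_config:DataSourceLocation location="C:\Patches\delta.csv" />
</_config:LoadItem>'''


def read_load_item(fragment):
    """Extract the name, config file, properties, and CSV location of one LoadItem."""
    # Wrap the fragment so that the "_config" prefix is bound to a namespace.
    wrapped = f'<root xmlns:_config="{NS}">{fragment}</root>'
    item = ET.fromstring(wrapped).find(f"{{{NS}}}LoadItem")
    properties = {
        prop.get("name"): prop.get("value")
        for prop in item.findall(f"{{{NS}}}property")
    }
    source = item.find(f"{{{NS}}}DataSourceLocation")
    return {
        "name": item.get("name"),
        "businessObjectConfigFile": item.get("businessObjectConfigFile"),
        "coreName": properties.get("coreName"),
        "location": source.get("location") if source is not None else None,
    }


if __name__ == "__main__":
    print(read_load_item(FRAGMENT))
```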
Structure of the CSV file
3074453050778347689,786.15,281.96,403.89
3074453050778348765,858.38,165.91,353.82
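The sample rows can be validated before loading with a short script such as the following Python sketch. This topic does not state what each column means, so the sketch makes an assumption: it treats the first column as a numeric identifier and the remaining columns as decimal values. The function `parse_rows` is hypothetical and is not part of Index Load.

```python
import csv
import io
from decimal import Decimal

SAMPLE = (
    "3074453050778347689,786.15,281.96,403.89\n"
    "3074453050778348765,858.38,165.91,353.82\n"
)


def parse_rows(text):
    """Map each row's first column (assumed identifier) to its decimal values."""
    rows = {}
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        key = int(record[0])  # raises ValueError if the identifier is not numeric
        rows[key] = [Decimal(value) for value in record[1:]]
    return rows


if __name__ == "__main__":
    for key, values in parse_rows(SAMPLE).items():
        print(key, [str(v) for v in values])
```

Using Decimal rather than float keeps values such as 786.15 exact, which matters if the columns carry monetary amounts.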
Sample configuration files
Download and extract the following sample code: IndexLoadSampleCode.zip. For reference, the sample includes the configuration files that Index Load uses and the manual updates that are performed in the "Indexing contract prices using Index Load" task.