Recovering a corrupted search index after an unexpected failure
You can recover a corrupted, failed, damaged, or inaccessible search index after an unexpected hardware or software failure. Such failures include the following scenarios:
- An indexing task is interrupted, causing a partially built index to be replicated to production.
- Network connectivity is lost during index replication, causing one or more of the index data segment files to become corrupted.
- File descriptors or storage capacity run out during indexing or replication, causing the entire index to be deleted.
To determine what to back up and restore in the event of index corruption, you must understand which components are involved and how data flows through each of them. In a production environment, a repeater typically performs the indexing, reading catalog data directly from the production database. After the repeater finishes indexing, the servers in the search subordinate cluster pull the changes from the repeater, and each subordinate server replicates the update locally. A local copy of the index therefore exists on each subordinate server in the solrhome location. The Solr home contains all the search server-related configuration files, indexing scripts, and index data files. For this reason, it is important that your backup strategy copies everything under the Solr home of the repeater, which holds the master copy of the index in production.
If any index issues occur in the production system, the corrupted index can either be restored from a backup or rebuilt. It is recommended that you set up a recurring task to copy your indexes to an alternative storage location at a regular time interval, or right before each reindexing occurs. This practice dramatically reduces index restoration time, because restoring a copy avoids rebuilding the index. For example, if an index becomes corrupted, you can restore the most recent known working version of the search index onto the repeater and set it as the current index. The subordinate servers then synchronize automatically with the restored index version.
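The recurring copy task described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a product-supplied script: the function name `backup_solr_home`, the `solrhome-<UTC timestamp>` directory naming, and the `.partial` suffix are all hypothetical, and the backup root is assumed to be outside the Solr home. The sketch copies under a temporary name first, so that an interrupted copy is not mistaken for a complete backup.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def backup_solr_home(solr_home: Path, backup_root: Path,
                     stamp: Optional[str] = None) -> Path:
    """Copy everything under solr_home into a new timestamped directory.

    All names here are illustrative assumptions, not product conventions.
    backup_root must not be located inside solr_home.
    """
    if not solr_home.is_dir():
        raise FileNotFoundError(f"Solr home not found: {solr_home}")

    # A sortable UTC timestamp keeps successive backups in order by name.
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    final_dir = backup_root / f"solrhome-{stamp}"
    partial_dir = backup_root / f"solrhome-{stamp}.partial"
    backup_root.mkdir(parents=True, exist_ok=True)

    # Copy under a temporary name, then rename once the copy has finished.
    shutil.copytree(solr_home, partial_dir)
    partial_dir.rename(final_dir)
    return final_dir
```

A scheduler such as cron could call this function at a regular interval, or an indexing job could call it right before reindexing starts. The paths passed in depend entirely on your installation.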
In general, it is strongly recommended that you keep at least one copy of the most recent working search index. Keep this backup copy current by refreshing it each time a change is made to the search index. Then, in the event of an index failure or corruption, restoring from the recent backup is a quick and effective way to bring the site back up and running. An index backup is simply a copy of the index data files on the file system. The best time to create a backup is right after the final indexing is complete and a quick sanity and integrity test has been performed. It is optional but beneficial to keep multiple successive backups, so that you have more flexibility when rolling back to a prior version of the search index.
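Keeping several successive backups implies some retention policy. The following is a minimal sketch of one, assuming that each backup is a directory named `solrhome-<UTC timestamp>` under a hypothetical backup root, so that sorting by name matches sorting by age. The function name `prune_backups` and the default of three retained copies are illustrative choices, not recommendations from the product documentation.

```python
import shutil
from pathlib import Path
from typing import List


def prune_backups(backup_root: Path, keep: int = 3) -> List[Path]:
    """Delete all but the `keep` newest backups and return the removed paths.

    Assumes backups are directories named solrhome-<sortable timestamp>.
    """
    if keep < 1:
        raise ValueError("keep at least one backup")

    # Newest first; skip unfinished copies marked with a .partial suffix.
    backups = sorted(
        (p for p in backup_root.glob("solrhome-*")
         if p.is_dir() and p.suffix != ".partial"),
        key=lambda p: p.name,
        reverse=True,
    )

    removed = backups[keep:]
    for old in removed:
        shutil.rmtree(old)
    return removed
```

Refusing `keep < 1` guards against deleting the only working copy, in line with the advice to always retain at least one recent backup.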
To restore the search index to a prior version, complete the following task: Backing up a WebSphere Commerce search index. Although the indexed data from the selected backup might be slightly out of date, the site can be brought back up with minimal downtime. While the site is running again on the restored backup, you can investigate the root cause and carry out additional recovery steps, such as attempting to rebuild the corrupted search index on the same or another indexing server.
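Because an index backup is a file system copy, the file-level part of a restore can be sketched as follows. This is an illustration only, and it does not replace the documented task: the function name `restore_solr_home` and the `.corrupted` suffix are hypothetical, and the sketch assumes that the search server on the repeater is stopped while files are replaced. The damaged Solr home is moved aside instead of deleted, so that it remains available for root cause investigation.

```python
import shutil
from pathlib import Path
from typing import Optional


def restore_solr_home(backup_dir: Path, solr_home: Path) -> Optional[Path]:
    """Replace solr_home with a copy of backup_dir.

    Returns the path the damaged Solr home was moved to, or None if there
    was no existing Solr home. Assumes the search server is stopped.
    """
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"Backup not found: {backup_dir}")

    # Hypothetical naming: keep the damaged copy next to the original.
    quarantine = solr_home.with_name(solr_home.name + ".corrupted")
    if quarantine.exists():
        raise FileExistsError(f"Remove or archive first: {quarantine}")

    moved = None
    if solr_home.exists():
        solr_home.rename(quarantine)
        moved = quarantine

    # Copy rather than move, so the backup stays available for reuse.
    shutil.copytree(backup_dir, solr_home)
    return moved
```

After the files are in place and the search server is restarted, the subordinate servers can pull the restored version from the repeater as they would after a normal indexing run.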