Enabling WebSphere Commerce Search as a stand-alone unstructured content search engine
You can enable WebSphere Commerce Search to search unstructured content exclusively, so that your unstructured content is indexed and retrieved more efficiently. A stand-alone search engine for unstructured content helps offset the potentially intensive processing load of searching two different content types.
Before you begin
- WebSphere Commerce Search is deployed. See Deploying WebSphere Commerce Search.
- Your database contains unstructured content.
Procedure
-
Design the schema for the new core.
Because the core holds unstructured content only, at least one dynamic text field is required to map content from the output of the Solr Cell.
For example, the following snippet is a sample schema for the new core:
    <!-- Tokenized text for search -->
    <fieldType name="wc_text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
    …
    <field name="unstructured_id" type="wc_text" indexed="true" stored="true" required="true" multiValued="false"/>
    …
    <dynamicField name="tika_*" type="wc_text" indexed="true" stored="true" multiValued="true"/>
    …
    <uniqueKey>unstructured_id</uniqueKey>
Where unstructured_id is the key field that identifies each unstructured document.
-
Post the unstructured content to the core by selecting one of the following methods:
-
Perform a search by using the core.
After all the unstructured content is posted to the Solr Cell and indexed, you can use a search URL to start searching.
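The posting and search steps above can be sketched as two request URLs. The following Python snippet is only an illustration under stated assumptions, not the product's documented method: the host, port, and core name (`unstructured_core`) are hypothetical, the core is assumed to have Solr's extracting request handler registered at `/update/extract`, and the extracted body text is assumed to be mapped to a `tika_content` field so that it matches the `tika_*` dynamic field in the sample schema. The parameters `literal.<field>`, `fmap.<source>`, `uprefix`, and `commit` are standard Solr Cell request parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, port, and core name to your environment.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "unstructured_core"


def build_extract_url(doc_id, base=SOLR_BASE, core=CORE):
    """Build a Solr Cell URL for posting one unstructured document."""
    params = [
        # Supply the unique key defined in the sample schema.
        ("literal.unstructured_id", doc_id),
        # Map the extracted body text to a field matched by tika_* (assumed name).
        ("fmap.content", "tika_content"),
        # Prefix any field not in the schema so the tika_* dynamic field accepts it.
        ("uprefix", "tika_"),
        # Commit immediately so the document becomes searchable.
        ("commit", "true"),
    ]
    return f"{base}/{core}/update/extract?{urlencode(params)}"


def build_search_url(term, base=SOLR_BASE, core=CORE, rows=10):
    """Build a search URL that queries the extracted text for a single term."""
    params = [
        ("q", f"tika_content:{term}"),  # assumed field name, see above
        ("fl", "unstructured_id"),      # return only the key field
        ("rows", rows),
    ]
    return f"{base}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_extract_url("doc-001"))
    print(build_search_url("warranty"))
```

The file itself would be sent as the body of a POST request to the first URL (for example, as a multipart upload from any HTTP client); the second URL can be opened directly once indexing completes.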