Example: Sample schema for Catalog attachments
The following sample snippet contains information for Catalog attachments in WebSphere Commerce.
<fields>
<!--
Attachments' basic attributes:
-->
<field name="attachmentrel_id" type="string" indexed="true" stored="true" required="true" multiValued="false"/> <!-- * previously type="long"; see the note after this snippet -->
<field name="attachment_id" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="catentry_id" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="name" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="path" type="wc_text" indexed="false" stored="true" required="false" multiValued="false"/>
<field name="mimetype" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="image" type="wc_text" indexed="false" stored="true" required="false" multiValued="false"/>
<field name="rulename" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="identifier" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="shortdesc" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="longdesc" type="wc_text" indexed="true" stored="true" required="false" multiValued="false"/>
<!--
Tika's default dynamic field: map to all metadata of Tika generated fields
-->
<dynamicField name="tika_*" type="wc_text" indexed="true" stored="true" multiValued="true"/>
<!--
Spell check field
-->
<field name="spellCheck" type="wc_textSpell" indexed="true" stored="false" multiValued="true" />
</fields>
* In earlier versions of this schema, attachmentrel_id was declared with type="long".
Where:
The values of the following fields are obtained from the WebSphere Commerce database:
- attachmentrel_id
  The schema change from type="long" to type="string" gives the index greater flexibility to store more types of content. The extra Web content is added without any impact on runtime functionality. Upgrading an existing schema requires a full reindex of the previous unstructured content when the new schema is deployed. To distinguish HTML content from previous attachment content, the value of attachmentrel_id contains the prefix HTML_, and the values of both attachment_id and catentry_id are -1.
- attachment_id
- catentry_id
- name
- path
- mimetype
- image
- rulename
- identifier
- shortdesc
- longdesc
The value of the following field is obtained from the output of the Tika framework:
- tika_*
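To make the HTML_ convention concrete, the following sketch shows two documents in Solr's standard XML update message format (<add>/<doc>/<field>). It is an illustration only: all field values, the suffix after the HTML_ prefix, and the tika_title field name are hypothetical, and the source does not state that WebSphere Commerce feeds the index through this message format.

```xml
<add>
  <!-- Hypothetical catalog attachment: all three required IDs come from the database -->
  <doc>
    <field name="attachmentrel_id">10001</field>
    <field name="attachment_id">20001</field>
    <field name="catentry_id">30001</field>
    <field name="name">User manual</field>
    <field name="path">attachments/manual.pdf</field>
    <field name="mimetype">application/pdf</field>
    <!-- Any Tika metadata field matches the tika_* dynamic field; this name is assumed -->
    <field name="tika_title">User manual</field>
  </doc>
  <!-- Hypothetical HTML (Web content) document: prefixed ID, both numeric IDs set to -1 -->
  <doc>
    <field name="attachmentrel_id">HTML_1</field>
    <field name="attachment_id">-1</field>
    <field name="catentry_id">-1</field>
    <field name="name">Store policies</field>
    <field name="mimetype">text/html</field>
  </doc>
</add>
```

Because attachmentrel_id is a string field, the prefixed value can be stored alongside the numeric values used for catalog attachments, which would not be possible with type="long".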
The dynamic field and the other fields use the same wc_text data type
that was introduced in the structured content index schema. The
unstructured content Solr core is deployed under the related entity core
folder, and reuses the stopwords.txt, synonyms.txt, and protwords.txt
files of its parent entity core configuration. For example:
<fieldType name="wc_text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="../../conf/stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="../../conf/protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="../../conf/synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="../../conf/stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="../../conf/protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
Each language has its own set of data and configurations for its
content.
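As a rough illustration of what a per-language configuration could look like, the following sketch adapts the index-time analyzer for French content. It is an assumption, not the configuration shipped with WebSphere Commerce: the choice of solr.SnowballPorterFilterFactory with language="French" is hypothetical, and the relative paths simply mirror the English example above.

```xml
<!-- Hypothetical wc_text variant for a French-language core (not the shipped configuration) -->
<fieldType name="wc_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- The stopwords file is assumed to hold French stop words in this core's parent configuration -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="../../conf/stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- A language-specific stemmer replaces the English Porter stemmer -->
    <filter class="solr.SnowballPorterFilterFactory" language="French" protected="../../conf/protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

A query-time analyzer would normally be defined alongside it, following the same pattern as the English example.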