Customizable components of the final Solr query
You can customize components of the final Solr query by using query parsers. A query parser is a component responsible for parsing the textual query and converting it into corresponding Lucene Query objects.
Disjunction Max Query (DisMax) takes its name from Disjunction (the query text is searched across multiple fields) and Max (the maximum score is used). It searches for every field and term pair separately. Then, it combines the results to calculate the maximum score value of the results.
ExtendedDisMax adds the following features on top of Dismax:
Minimum match
Minimum match (mm) applies when the ANY search type is used. It specifies how many of the search keywords must match the indexed documents. A number denotes the number of query keywords that must match. A number that is followed by a percent sign denotes the percentage of the query keywords that must match. For example:
- 1 denotes that at least one query keyword must match.
- 2<80% 6<50% denotes that when there are 1 or 2 keywords, all of the keywords must be found in the document. When there are 3 to 6 keywords, 80% of the keywords must be found in the document. When there are more than 6 keywords, 50% of the keywords must be found in the document.
For example, if a shopper searches for 3 keywords, 80% of 3 is 2.4, which is rounded down to 2. Results that match at least 2 of the 3 entered keywords are returned.
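The threshold rules above can be sketched in a few lines of Python. This is a simplified illustration of the rule as it is described here, not Solr's implementation: the function name min_should_match is hypothetical, and only positive whole numbers and percentages are handled.

```python
def min_should_match(spec: str, keyword_count: int) -> int:
    """Return how many keywords must match for a minimum match spec.

    Hypothetical helper for illustration only. Supports plain numbers
    ("1"), percentages ("80%"), and conditional parts ("2<80% 6<50%").
    """
    # With no applicable rule, every keyword is required.
    required = keyword_count
    for part in spec.split():
        if "<" in part:
            bound, value = part.split("<", 1)
            # A conditional part applies only when there are MORE
            # keywords than its bound; otherwise keep the current result.
            if keyword_count <= int(bound):
                break
        else:
            value = part
        if value.endswith("%"):
            # Percentages are rounded down to a whole keyword count.
            required = keyword_count * int(value[:-1]) // 100
        else:
            required = int(value)
    # Never require more keywords than were entered.
    return max(0, min(required, keyword_count))


for count in (2, 3, 6, 7):
    print(count, min_should_match("2<80% 6<50%", count))
```

For the spec 2<80% 6<50%, the loop prints 2, 2, 4, and 3 required keywords for 2, 3, 6, and 7 entered keywords.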
Important: You must use the correct character encoding when you enter percentage values in a file. For example:
- In a JSP fragment file, such as SearchSetup.jspf, the preceding percentage value is entered as is: 2<80% 6<50%.
- In an XML file, such as wc-component.xml, the less-than sign must be escaped, so the preceding percentage value is entered as: 2&lt;80% 6&lt;50%.
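As a quick check of the XML form, the escaping can be reproduced with the Python standard library. This is only a sketch for verifying a value before you paste it into an XML file; the variable names are illustrative.

```python
from xml.sax.saxutils import escape, unescape

# The minimum match value as it is written in a JSP fragment.
mm_value = "2<80% 6<50%"

# The same value escaped for use inside an XML file.
xml_value = escape(mm_value)
print(xml_value)  # 2&lt;80% 6&lt;50%

# Unescaping recovers the original value.
print(unescape(xml_value) == mm_value)  # True
```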
The following table shows how results differ when the ANY search type is used, compared with the ALL or EXACT search types:
Search type | Search results |
---|---|
ANY | Searches for red dress return the following results: red dress, red potato, red fish, dress shoes, dress shirt, dress belt. |
ALL or EXACT | Products with indexed searchable fields that contain red and dress are returned, but not the blue summer dress. The red floral dress is returned for the ALL search type, but not for the EXACT search type, as it is not an exact match. |
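The difference between the search types can be illustrated with a toy matcher. This sketch makes several assumptions: product names are split on whitespace only (no analyzers, stemming, or multiple fields), EXACT is modeled as a contiguous phrase match, and the function name matches and the product list are hypothetical.

```python
def matches(search_type: str, query: str, product_name: str) -> bool:
    """Toy model of the ANY, ALL, and EXACT search types."""
    keywords = query.lower().split()
    tokens = product_name.lower().split()
    if search_type == "ANY":
        # At least one keyword appears in the product name.
        return any(k in tokens for k in keywords)
    if search_type == "ALL":
        # Every keyword appears, in any position.
        return all(k in tokens for k in keywords)
    if search_type == "EXACT":
        # Assumption: the keywords appear as one contiguous phrase.
        n = len(keywords)
        return any(tokens[i:i + n] == keywords
                   for i in range(len(tokens) - n + 1))
    raise ValueError(f"unknown search type: {search_type}")


products = ["red dress", "red potato", "dress shoes",
            "blue summer dress", "red floral dress"]
results = {
    search_type: [p for p in products if matches(search_type, "red dress", p)]
    for search_type in ("ANY", "ALL", "EXACT")
}
for search_type, names in results.items():
    print(search_type, names)
```

In this toy model, ANY returns all five products, ALL returns red dress and red floral dress, and EXACT returns only red dress.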
- If you do not require minimum match, but do require synonyms: Set up your synonyms by using search term associations in the Management Center. This flow allows each store or extended site to use its own synonym list.
- If you require both minimum match and synonyms, have only one store per master catalog (or all of the stores within the same master catalog share a synonym list), and do not require multiword synonyms or replacement terms: Complete the following task: Combining minimum match with search term associations (using the Solr expansion algorithm).
Phrase fields
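Phrase fields (pf) boost the score of documents in which all of the query terms appear in close proximity. Each field can carry its own boost factor. For example:
pf=name^10.0 defaultSearch^1.0 categoryname^100.0 shortDescription^5.0 partNumber_ntk^15.0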
Phrase slop
Phrase slop (ps) specifies how far apart the search terms can be in the indexed document and still influence relevancy. It defines the amount of slop on the phrase queries that are built for phrase fields (pf). For example, a search for sports movie with ps=1 ranks sports movie as more relevant than sports is the type of a movie. For more information, see Tuning multiple-word search result relevancy by using minimum match and phrase slop.
Phrase slop can be defined in the following places:
- Defined in the URL.
- Defined in the search profile.
- Defined in the catalog component configuration file (wc-component.xml) on the Search EAR.
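To see why the sports movie example behaves this way, the distance between two terms can be counted directly. The following is a deliberately simplified sketch, assuming two query terms that appear in order and whitespace tokenization; the helper names required_slop and gets_phrase_boost are hypothetical and do not reflect how Solr scores sloppy phrases internally.

```python
from typing import Optional


def required_slop(text: str, first: str, second: str) -> Optional[int]:
    """Count the words between two terms that appear in order.

    Returns None when the terms do not both appear in that order.
    """
    tokens = text.lower().split()
    if first not in tokens:
        return None
    start = tokens.index(first)
    try:
        end = tokens.index(second, start + 1)
    except ValueError:
        return None
    return end - start - 1


def gets_phrase_boost(text: str, ps: int) -> bool:
    """Return True when the two terms are within ps positions."""
    slop = required_slop(text, "sports", "movie")
    return slop is not None and slop <= ps


print(required_slop("sports movie", "sports", "movie"))                   # 0
print(required_slop("sports is the type of a movie", "sports", "movie"))  # 5
print(gets_phrase_boost("sports movie", ps=1))                            # True
print(gets_phrase_boost("sports is the type of a movie", ps=1))           # False
```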
Phrase bigram fields
Phrase bigram fields (pf2) break down the input into bigrams. For example, the brown fox jumped is queried as the brown, brown fox, and fox jumped. Therefore, if your search phrase is red hat black jacket, you can use ps=0 with pf2 to ensure that products that contain red hats are boosted before black hats.
Phrase trigram fields
Phrase trigram fields (pf3) break down the input into trigrams. For example, the brown fox jumped is queried as the brown fox and brown fox jumped.
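The bigram and trigram breakdowns can be reproduced with a short helper. This is a minimal sketch that assumes whitespace tokenization; the function name word_ngrams is hypothetical.

```python
def word_ngrams(phrase: str, n: int) -> list:
    """Split a phrase into overlapping groups of n consecutive words."""
    words = phrase.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(word_ngrams("the brown fox jumped", 2))
# ['the brown', 'brown fox', 'fox jumped']
print(word_ngrams("the brown fox jumped", 3))
# ['the brown fox', 'brown fox jumped']
print(word_ngrams("red hat black jacket", 2))
# ['red hat', 'hat black', 'black jacket']
```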
Tie breaker
The tie breaker (tie) controls how the final score is calculated when a term matches in more than one field:
(score of matching clause with the highest score) + ( (tie parameter) * (scores of any other matching clauses) )
The tie parameter configures how much the scores of the lower-scoring fields contribute to the final score, relative to the highest-scoring field.
The tie value is set to 0.1 by default. A value of 0.0 makes the query a pure disjunction max query, where only the maximum-scoring subquery contributes to the final score. A value of 1.0 makes the query a pure disjunction sum query, where it does not matter which subquery scores highest, and the final score is the sum of the subquery scores. A low value, for example, 0.1, is typically useful. For more information about changing the tie value, see Adding native Solr query parameters to search expressions.
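The effect of the tie value can be checked with a direct translation of the formula. This sketch assumes the per-field clause scores are already known; the function name dismax_score and the sample scores are made up for illustration.

```python
def dismax_score(clause_scores: list, tie: float = 0.1) -> float:
    """Apply the tie breaker formula to a list of matching clause scores."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    # Sum of every other matching clause, excluding the best one once.
    others = sum(clause_scores) - best
    return best + tie * others


scores = [0.8, 0.4, 0.2]  # hypothetical scores from three fields
for tie in (0.0, 0.1, 1.0):
    print(tie, round(dismax_score(scores, tie), 2))
```

With these sample scores, a tie of 0.0 yields the maximum alone (0.8), a tie of 0.1 yields 0.86, and a tie of 1.0 yields the plain sum (1.4).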
Filter query
Filter query (fq) specifies a query that can be used to restrict the documents that can be returned, without influencing the score. This can be useful for speeding up complex queries, since the queries that are specified with fq are cached independently from the main query.
Query boost
Boosts can be applied either at index time or at query time. Index-time boosts are applied when documents are added, and apply to the entire document or to specific fields. Query-time boosts are applied when a search query is constructed, and apply to specific fields.
To apply a query-time boost, append ^ followed by a positive number to query clauses. For example:
title:sail OR (title:sail AND title:boat)^2.0 OR title:"sail boat"^10
(*:* -title:foo)^2.0
In the second example, all documents that do not contain foo in the title are boosted by 2.0.
Solr provides another way of boosting documents by using function queries. FunctionQuery allows you to use the actual value of a field and functions of those fields in a relevancy score. For more information, see Solr Wiki: FunctionQuery.
Examples
- You can change the sequence of returned results by specifying a different sort sequence. For example: sort=score desc, price asc.
- You can add a field list (fl) parameter to the Solr query. The fl parameter restricts the fields that are returned with the result set. For example: fl=id, title, text returns only the id, title, and text fields.
- You can modify the list of fields that are included in the <_config:result> section of the search profile within the wc-search.xml file. For example, for the IBM_findProductsBySearchTerm search profile:
<_config:result>
  <_config:field name="catentry_id"/>
  <_config:field name="storeent_id"/>
  <_config:field name="buyable"/>
  <_config:field name="partNumber_ntk"/>
  <_config:field name="name"/>
  <_config:field name="shortDescription"/>
  <_config:field name="thumbnail"/>
  <_config:field name="keyword"/>
  <_config:field name="mfName_ntk"/>
  <_config:field name="catenttype_id_ntk_cs"/>
  <_config:field name="price_*"/>
  <_config:field name="listprice_*"/>
  <_config:field name="parentCatgroup_id_facet"/>
  <_config:field name="childCatentry_id"/>
  <_config:field name="mfPartNumber_ntk"/>
  <_config:field name="ad_attribute"/>
  <_config:field name="isDKPreConfigured"/>
  <_config:field name="dkModelReference"/>
  <_config:field name="dkURL"/>
  <_config:field name="dkDefaultConfiguration"/>
  <_config:field name="parentDKModelRef"/>
  <_config:field name="dkConfigurable"/>
  <_config:field name="parentDKConfigurable"/>
  <_config:field name="startdate"/>
  <_config:field name="enddate"/>
</_config:result>
You can change the existing values of these parameters in the wc-search.xml file. You can create a new preprocessor to add a component to be used in the final Solr query. For more information, see Creating a custom query preprocessor.
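To show how the parameters in this topic fit together, the following sketch assembles them into a raw Solr query string with the Python standard library. It is an illustration only: the filter value buyable:1, the shortened pf and fl lists, and the choice to build the string by hand are assumptions, not the way the product constructs its queries.

```python
from urllib.parse import urlencode, parse_qsl

# Parameter names are standard Solr parameters; values are illustrative.
params = [
    ("q", "red dress"),
    ("defType", "edismax"),
    ("mm", "2<80% 6<50%"),                     # minimum match
    ("pf", "name^10.0 shortDescription^5.0"),  # phrase fields with boosts
    ("ps", "1"),                               # phrase slop
    ("tie", "0.1"),                            # tie breaker
    ("fq", "buyable:1"),                       # hypothetical filter query
    ("sort", "score desc, price asc"),
    ("fl", "catentry_id,name,shortDescription"),
]

# urlencode percent-encodes special characters such as < and %.
query_string = urlencode(params)
print(query_string)

# Decoding the string recovers the original parameter values.
decoded = parse_qsl(query_string)
```

Because the minimum match value contains < and %, it must be URL-encoded when it is passed in a URL, which urlencode handles here.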