Expanding synonyms and Search Term Associations at query time
Search Term Association (STA) and synonym expansion are performed by the Query service before the query is passed to the Elasticsearch engine.
Search term associations suggest additional, different, or replacement products in search results. They can also link search terms to a selected landing page in the store. Search term associations are used as a product recommendation strategy to increase store sales when customers search for products, as the search submission is modified to increase or target search results. For a detailed explanation of what they are and how they work, see Search term associations.
How expansion is performed
- No spell check is performed on any search term defined in an STA.
- Lemmatization is used to match input search terms with search term associations. Stemming will be performed on the original STA set for the primary search.
- All synonym-expanded terms are contained in a SHOULD clause. Natural Language Processing (NLP) analysis is not performed on expanded STAs, except replacement terms with a one-to-one relationship. Replace terms will go through the NLP processing. For more information on the expected behavior from this design decision, see Default behaviors after STA expansion.
- Processing is only performed in a single direction (from top to bottom):
  - The first pass manages all replacement terms.
  - The second pass is for synonym expansion.
  Note that the overall combined search scope may be increased slightly when one or more of the original search terms are identified to be adjectives. This could result in more search hits for the remaining expanded search terms.
- Place the most significant terms at the beginning of the synonym expansion list. The Query service parser can then perform the appropriate boosting against those significant terms at runtime. The result is a more relevant result set that can be presented to the end user. This applies when long tail is enabled; otherwise, the input search term will be used for boosting.
- Use only the singular form in the synonym expansion list so that more accurate stemming can be applied at query time.
- List all related synonym terms on one line; do not repeat them across multiple lines.
- All the synonyms must be spelled correctly. Misspelled keywords can be added as replacement terms.
- Replacement terms should not span multiple lines. For example, the following is invalid:
    vision => vision, eyes
    eyes => eye
  The correct syntax is:
    vision => vision, eyes, eye
- Synonym expansion is only performed in the forward direction.
- Avoid duplicate entries for search term associations and synonyms added through the Management Center or the /configuration endpoint.
- If no synonym expansion has been done, then the Query service does dependency parsing to get the root keyword out of the search term.
- NLP processing will not be performed on the synonyms or replacement terms, except for replacement with a one-to-one relationship, for example, chair => sofa.
- In the case of one-to-one replacement, replace terms can be expanded with synonyms if applicable.
General guidelines for composing synonyms and replacement terms
- Exercise care when adding synonyms and replacement terms.
- Replacement terms are designed to replace the search terms that a shopper enters in the storefront with terms that match the product data and return the desired results. Synonyms are intended for merchandisers, to aggregate and return similar products that have different names or terms in the product data together in the search results. The search service can handle typical spelling mistakes and most inflected forms of terms (-ing, -ed, plurals, and so on). Use singular terms and do not add plural or inflected forms unless you find that the search service is not returning matched results. If you are not sure whether the search service will properly match inflected terms for your top searches, you can check a term's stem at https://snowballstem.org/demo.html
- Synonyms may be used for inflected terms used in product data that are not being “stemmed” down to a matching root form of the term. (Example: “conditioner, conditioning”, “shelf, shelves”)
- Replacements should be used where the shopper’s incoming inflected search term is not matching with non-inflected terms in the product data. (Example: “welder => weld”)
- Synonyms are "global", and adding a synonym to improve one search use case may impact relevancy for other searches. To minimize potential search relevancy issues, improve product data and add keywords instead of using synonyms. Improvements to product data isolate the change to specific products and may also provide Search Engine Optimization (SEO) benefits (Google cannot crawl or index synonym data) and Key Performance Indicator (KPI) benefits.
- Use only nouns for terms and avoid duplicate or overlapping entries.
- Do not add misspelled forms of terms as synonyms. If you find commonly misspelled search terms in search analytics that the search service is having difficulty matching, use a replacement term entry.
- Use replacement terms for abbreviations unless abbreviations are used in the product data. Synonym entries can be used to aggregate results for similar products where the abbreviations are inconsistent. (Example: “recip, reciprocate”).
- Other guidelines when creating synonyms:
- Use either the Management Center Search Term Association (STA) tool or the query configuration REST API. Note that synonyms added using the synonyms API will not be visible in the Management Center STA tool.
- Replacement terms are processed before synonyms and synonym expansion is not performed on the replaced keywords.
- Keep synonym entries as short and as simple as possible. Use single-word synonyms whenever possible to simplify similar multi-word terms. For example, consider the following multi-word synonyms used for describing a hoist for lifting a vehicle.
  Original:
    vehicle lift, vehicle hoist, car lift, car hoist, automobile lift
    vehicle ramp, car ramp, automobile ramp
  Simplified:
    automobile, car, vehicle
    hoist, lift
- Use consistent term usage in the product data, synonyms, and replacement terms (for example: bandsaw versus band saw, e-track versus etrack, and units of measure such as in, ", and inch in product names).
- Place the most significant terms at the beginning of the synonyms entry when long tail search is enabled.
- Ensure that the number of synonyms in a given synonym entry does not exceed the SynonymExpansionThreshold (default of 20 terms) configuration setting.
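Several of the guidelines above can be checked mechanically before entries are submitted. The following is a minimal sketch in Python, assuming synonym entries are available as plain comma-separated strings; `lint_synonym_entries` is a hypothetical helper and not part of the Query service or Management Center, and the threshold constant simply mirrors the documented default of 20.

```python
from collections import Counter

# Assumed default, mirroring the SynonymExpansionThreshold guideline (20 terms).
SYNONYM_EXPANSION_THRESHOLD = 20


def lint_synonym_entries(entries, threshold=SYNONYM_EXPANSION_THRESHOLD):
    """Return warnings for a list of comma-separated synonym entries.

    Hypothetical helper for illustration only.
    """
    warnings = []
    seen = Counter()
    for line_no, entry in enumerate(entries, start=1):
        terms = [t.strip().lower() for t in entry.split(",") if t.strip()]
        if len(terms) > threshold:
            warnings.append(
                f"entry {line_no}: {len(terms)} terms exceeds threshold {threshold}"
            )
        multi_word = [t for t in terms if " " in t]
        if multi_word:
            warnings.append(
                f"entry {line_no}: multi-word terms {multi_word}; consider single words"
            )
        # Count each term once per entry so repeats across entries can be reported.
        for term in set(terms):
            seen[term] += 1
    for term, count in sorted(seen.items()):
        if count > 1:
            warnings.append(
                f"term '{term}' appears in {count} entries; list related synonyms on one line"
            )
    return warnings
```

For example, `lint_synonym_entries(["automobile, car, vehicle", "hoist, lift"])` returns an empty list, while adding a further entry such as `"car, auto"` produces a warning about the repeated term.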
Default behaviors after STA expansion
- If you have added the synonyms "style home chair, sofa, sofa set" and a customer searches for 'sofa set', the expanded query is not processed through NLP parsing. The system performs a text search of all of these words against the catalog, using a query formed with the following grouping:
    query : ("style" AND "home" AND "chair") OR ("sofa") OR ("sofa" AND "set")
    fields : [sta query fields]
- If you have added a replacement "style home chair => sofa set" with the replacement type set to "Also Search For" (that is, also search for:), the expanded query is not processed through NLP parsing. Instead, a text search of all of these words is performed against the catalog. When a customer searches for 'style home chair', the query is formed with the following grouping:
    query : ("style" AND "home" AND "chair") OR ("sofa" AND "set")
    fields : [sta query fields]
- If you have added the replacement "style home chair => sofa, sofa set" with the replacement type set to "Instead Search For" (that is, instead search for:), the expanded query is not processed through NLP parsing. Instead, a text search of all of these words is performed against the catalog. When a customer searches for 'style home chair', the query is formed with the following grouping:
    query : ("sofa") OR ("sofa" AND "set")
    fields : [sta query fields]
- If you have added a replacement "style home chair => sofa set" with the replacement type set to "Instead Search For" (that is, instead search for:), the expanded query is processed through NLP parsing, because the replacement is a one-to-one relationship. When a customer searches for 'style home chair', the replacement term is parsed through the NLP process and the search is performed based on the resulting NLP classification:
    query : ("sofa" "set")
    fields : [nlp classification query fields]
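The grouping used in the non-NLP cases above can be reproduced with a short Python sketch: the words of each phrase are joined with AND, and the phrases are joined with OR. The function names are hypothetical and only illustrate the shape of the expression; they do not represent how the Query service builds its Elasticsearch request, and the one-to-one replacement case (which goes through NLP) is not modelled.

```python
def group_terms(phrases):
    """Join the words of each phrase with AND, and the phrases with OR."""
    groups = []
    for phrase in phrases:
        quoted = [f'"{word}"' for word in phrase.split()]
        groups.append("(" + " AND ".join(quoted) + ")")
    return " OR ".join(groups)


def also_search_for(search_phrase, replacements):
    # "Also Search For": the original phrase is kept alongside the replacements.
    return group_terms([search_phrase] + list(replacements))


def instead_search_for(replacements):
    # "Instead Search For": the original phrase is dropped.
    return group_terms(replacements)


synonym_query = group_terms(["style home chair", "sofa", "sofa set"])
also_query = also_search_for("style home chair", ["sofa set"])
instead_query = instead_search_for(["sofa", "sofa set"])
```

Here `synonym_query`, `also_query`, and `instead_query` correspond to the first three query expressions listed above.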
Sample Use Case A – difference between keywords and synonyms
Suppose there is a hoist category with chain hoist products, and another gardening category with garden hose products. A shopper searches for chain hoists, and 486 products are found. Later there is a requirement to associate the term link to chain. Because over 400 products are involved, the merchandiser chooses to add a simple synonym: link, chain.
Next, another requirement arises: associating strap to link for a small number of tie-down products. The merchandiser is not aware that several products in the gardening category have "strap" in their descriptions, specifically garden hoses with a storage strap. When the merchandiser extends the above synonym to link, chain, strap, a shopper searching for the same chain hoist receives chain hoists in the response, but also garden hoses.
The suggested way of addressing the strap to link requirement is to add strap and link to the keyword field of all the tie-downs instead of using synonyms.
Synonyms are generally global, while keywords are only specific to the assigned products or items.
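The difference can be illustrated with a toy in-memory model in Python. Everything here is an assumption made for illustration: the `Product` class, the `search` function, and the three sample products are hypothetical, and matching is reduced to exact word overlap rather than the analysis the search service actually performs.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    description: str
    keywords: set = field(default_factory=set)


def search(catalog, term, synonym_groups):
    """Return the names of products matching the term or any of its synonyms."""
    expanded = {term}
    for group in synonym_groups:  # synonyms are global: they apply to every search
        if term in group:
            expanded |= group
    hits = []
    for product in catalog:
        words = set(product.name.split()) | set(product.description.split())
        if expanded & (words | product.keywords):
            hits.append(product.name)
    return hits


hoist = Product("chain hoist", "manual hoist with load chain")
hose = Product("garden hose", "hose with storage strap")
tie_down = Product("ratchet tie-down", "cargo tie-down")
catalog = [hoist, hose, tie_down]

# Option 1: extend the global synonym with "strap".
with_synonym = search(catalog, "chain", [{"link", "chain", "strap"}])

# Option 2: keep the synonym small and add keywords to the tie-downs only.
tie_down.keywords = {"strap", "link"}
with_keywords = search(catalog, "chain", [{"link", "chain"}])
```

With the extended synonym, `with_synonym` contains the garden hose because its description mentions a strap. With keywords, `with_keywords` no longer contains it; in this toy model the tie-down does match, through its link keyword and the remaining link, chain synonym.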
Sample Use Case B – size of result set before and after synonym expansion
The following example describes how the size of the search result set after synonym expansion can differ from the combined total of the result sets for each individual term in the synonym list. Consider the following search results:
When a customer searches for drill, 1723 results are returned.
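When a customer searches for non-existing drill, the same 1723 results are returned. The search phrase is adjusted to drop non-existing, because non-existing drill produces a null result:
  "metaData": {
    "price": "1",
    "searchPhrase": {
      "original": "non-existing drill",
      "adjusted": "drill"
    },
    "spellcheck": []
  },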
Consider another similar search with electric drill and cordless drill, producing 130 hits and 278 hits respectively.
Combine all these terms into one single synonym list: non-existing drill, electric drill, cordless drill. Now when searching for cordless drill, the returned result set size is 342.
Search Term | Synonyms | Size of Search Result Set |
---|---|---|
drill | none | 1723 |
non-existing drill | none | 1723 |
electric drill | none | 130 |
cordless drill | none | 278 |
electric drill | non-existing drill, electric drill, cordless drill | 342 |
One would expect the size of the synonym-expanded result set to be at least as large as the largest result set of any single synonym in the list, that is, 1723. Instead, only 342 results are returned. The expanded search expression is:
  “non-existing drill” OR “electric drill” OR “cordless drill”
Even though non-existing drill does not return any search hits, the Query service does not auto-correct that first condition, because the remaining conditions can still generate results. So, when all three synonyms are combined, only the last two are actually used. The final result is 342, which is greater than the maximum of [130, 278].
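The numbers are consistent with a plain union of result sets, where products matching both electric drill and cordless drill are counted once. The Python sketch below uses synthetic product IDs chosen only to reproduce the sizes in the table; the overlap of 66 products is inferred from 130 + 278 - 342 under the assumption that the expanded result is exactly that union, and is not a figure stated by the source.

```python
# Synthetic product IDs, chosen only to reproduce the result set sizes above.
non_existing_drill = set()             # the phrase matches nothing, and is not auto-corrected
electric_drill = set(range(0, 130))    # 130 hits
cordless_drill = set(range(64, 342))   # 278 hits

# OR-combining the three conditions is a set union.
expanded = non_existing_drill | electric_drill | cordless_drill
overlap = electric_drill & cordless_drill

print(len(expanded), len(overlap))
```

Because the first condition contributes nothing, the combined size is bounded by the two remaining terms: at least 278 (the larger of the two) and at most 408 (their sum).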
Sample Use Case C – difference between replacement term and synonym expansion
“drum hoist => drum lifter” : “r”
This is a replacement term. When a shopper searches for drum hoist, the Query service automatically replaces this input phrase with drum lifter and uses it for an NLP-enabled term search. The result is exactly the same as if the shopper had entered drum lifter.
“drum, barrel” : “s”
This is a synonym entry. When drum or barrel is detected in the search phrase, the term is expanded in its original place in the search phrase with drum OR barrel. When a shopper searches for drum lifter, the final search expression will look like the following:
“( drum OR barrel ) lifter”
When both of these entries are defined:
“drum hoist => drum lifter” : “r”
“drum, barrel” : “s”
the resulting search expressions are:
“drum hoist” – expression will be “( drum OR barrel ) lifter”
“drum lifter” – expression will be “( drum OR barrel ) lifter”
When using replacement terms with one-to-one relationships, NLP processing will be performed on the replaced term as well.
Suppose a third STA is added:
“barrel, tub” : “s”
When searching for either drum hoist or drum lifter, the expression remains the same: “( drum OR barrel ) lifter”. This is because the term drum has already been expanded once into drum OR barrel, and the next synonym from the third STA barrel, tub is ignored even though there is a matching term barrel in the final search expression.
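The behavior in this use case can be summarized as two passes, sketched below in Python. The `expand` function and its data structures are hypothetical and cover only what the examples above show: whole-phrase replacement first, then in-place synonym expansion of each word, with no second lookup on the words introduced by expansion. NLP processing is not modelled.

```python
def expand(search_phrase, replacements, synonym_groups):
    """Apply replacements first, then expand each word with at most one synonym group."""
    # Pass 1: replace the whole phrase if it matches a replacement term.
    phrase = replacements.get(search_phrase, search_phrase)

    # Pass 2: expand each word in place, using the first group that contains it.
    parts = []
    for word in phrase.split():
        group = next((g for g in synonym_groups if word in g), None)
        if group is None:
            parts.append(word)
        else:
            # Expanded words are not looked up again, so a later group such as
            # ["barrel", "tub"] does not chain onto the expanded "barrel".
            parts.append("( " + " OR ".join(group) + " )")
    return " ".join(parts)


replacements = {"drum hoist": "drum lifter"}
synonym_groups = [["drum", "barrel"], ["barrel", "tub"]]

from_hoist = expand("drum hoist", replacements, synonym_groups)
from_lifter = expand("drum lifter", replacements, synonym_groups)
```

Both `from_hoist` and `from_lifter` evaluate to `( drum OR barrel ) lifter`, matching the expressions above, and tub never appears because the third STA is not applied to the already expanded term.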