Considerations when using Search Term Associations
When you use Elasticsearch as your search solution, stop words are not used, spell checking is not performed on terms defined for Search Term Association (STA), and processing is performed in one direction (top to bottom). General guidelines are presented for constructing STAs, and examples are given.
General guidelines for composing synonyms and replacement terms
- Exercise care when adding synonyms and replacement terms.
- Replacement terms are designed to replace the search terms that the
shopper enters on the storefront with terms that match the product data
and return the desired results. Synonyms are intended for merchandisers,
to aggregate similar products that have different names or terms in the
product data so that they are returned together in the search results.
The search service can handle typical spelling mistakes and most
inflected forms of terms (-ing, -ed, plurals, etc.). Use singular terms
and do not add plural or inflected forms unless you find that the search
service is not returning matched results. If you are not sure whether
the search service will properly match inflected terms for your top
searches, you can check a term's “stem”: https://snowballstem.org/demo.html
- Synonyms may be used for inflected terms used in product data that are not being “stemmed” down to a matching root form of the term. (Example: “conditioner, conditioning”, “shelf, shelves”)
- Replacements should be used where the shopper’s incoming inflected search term is not matching with non-inflected terms in the product data. (Example: “welder => weld”)
- Synonyms are "global", so adding a synonym to improve one search use case may affect relevancy for other searches. To minimize potential search relevancy issues, improve product data and add keywords instead of using synonyms. Improving product data isolates the change to specific products and may also provide Search Engine Optimization (SEO) benefits (Google cannot crawl or index synonym data) and Key Performance Indicator (KPI) benefits.
- Use only nouns for terms and avoid duplicate or overlapping entries.
- Do not add misspelled forms of terms as synonyms. If you find commonly misspelled search terms in search analytics that the search service is having difficulty matching, use a replacement term entry.
- Use replacement terms for abbreviations unless abbreviations are used in the product data. Synonym entries can be used to aggregate results for similar products where the abbreviations are inconsistent. (Example: “recip, reciprocate”).
- Other guidelines when creating synonyms:
- Use either the Management Center Search Term Association (STA) tool or the query configuration REST API. Note that synonyms added using the synonyms API will not be visible in the Management Center STA tool.
- Replacement terms are processed before synonyms and synonym expansion is not performed on the replaced keywords.
- Keep synonym entries as short and as simple as possible. Use single-word
synonyms whenever possible to simplify similar multi-word terms. For
example, consider the following multi-word synonyms used for describing
a hoist for lifting a vehicle.
  Original:
  vehicle lift, vehicle hoist, car lift, car hoist, automobile lift
  vehicle ramp, car ramp, automobile ramp
  Simplified:
  automobile, car, vehicle
  hoist, lift
- Use terms consistently across the product data, synonyms, and replacement terms (for example: bandsaw versus band saw, e-track versus etrack, and units of measure such as in, “, and inch in product names).
- Place the most significant terms at the beginning of the synonyms entry when long tail search is enabled.
- Ensure that the number of synonyms in a given synonym entry does not exceed the SynonymExpansionThreshold (default of 20 terms) configuration setting.
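Several of the guidelines above (entry size, single-word terms, no overlapping entries) can be checked mechanically before entries are loaded. The following is a minimal sketch, not part of the product: the function name `check_synonym_entries` and the warning wording are hypothetical, and the threshold of 20 is simply the documented default of SynonymExpansionThreshold.

```python
DEFAULT_THRESHOLD = 20  # documented default of SynonymExpansionThreshold


def check_synonym_entries(entries, threshold=DEFAULT_THRESHOLD):
    """Return a list of warnings for a list of synonym entries.

    Each entry is a list of terms, for example ["drum", "barrel"].
    This helper is hypothetical and only mirrors the guidelines above.
    """
    warnings = []
    seen = {}  # term -> index of the first entry that used it
    for index, entry in enumerate(entries):
        terms = [term.strip().lower() for term in entry]
        if len(terms) > threshold:
            warnings.append(
                f"entry {index}: {len(terms)} terms exceeds threshold {threshold}"
            )
        for term in terms:
            if " " in term:
                warnings.append(f"entry {index}: multi-word term '{term}'")
            if term in seen and seen[term] != index:
                warnings.append(
                    f"entry {index}: '{term}' also appears in entry {seen[term]}"
                )
            seen.setdefault(term, index)
    return warnings


if __name__ == "__main__":
    entries = [
        ["automobile", "car", "vehicle"],
        ["hoist", "lift"],
        ["car lift", "hoist"],  # multi-word term and an overlap with entry 1
    ]
    for warning in check_synonym_entries(entries):
        print(warning)
```

A multi-word term is reported as a warning rather than an error, because the guideline asks for single-word synonyms "whenever possible" and does not forbid multi-word ones.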
Sample Use Case A – difference between keywords and synonyms
Suppose there is a hoist category with chain hoist products, and a gardening category with garden hose products. A shopper searches for chain hoists, and 486 products are found. Later, a requirement arises to associate the term link with chain. Because over 400 products are involved, the merchandiser chooses to add a simple synonym: link, chain.
Next, another requirement arises: associating strap with link for a small number of tie-down products. The merchandiser is not aware that several products in the gardening category have "strap" in their descriptions, specifically garden hoses with a storage strap. When the merchandiser extends the above synonym to link, chain, strap, a shopper searching for the same chain hoist receives chain hoists in the response, but also garden hoses.
The suggested way of addressing the strap-to-link requirement is to add strap and link to the keyword field of all the tie-downs instead of using synonyms. Synonyms are generally global, while keywords are specific only to the assigned products or items.
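The difference in scope can be illustrated with a toy in-memory catalog. This is a sketch under simplifying assumptions, not the search service's matching logic: the three products, the `search` function, and single-term matching are all hypothetical, and multi-word phrases, stemming, and relevancy ranking are ignored.

```python
CATALOG = [
    {"id": "chain-hoist", "text": "chain hoist", "keywords": []},
    {"id": "tie-down", "text": "tie-down link", "keywords": []},
    {"id": "garden-hose", "text": "garden hose with storage strap", "keywords": []},
]


def tokens(product):
    """All searchable tokens of a product: description words plus keywords."""
    return set(product["text"].split()) | set(product["keywords"])


def search(term, products, synonyms=()):
    """Return ids of products matching the term or any of its global synonyms."""
    alternatives = {term}
    for entry in synonyms:
        if term in entry:
            alternatives.update(entry)
    return [p["id"] for p in products if alternatives & tokens(p)]


# Option 1: extend the global synonym. "chain" now also matches "strap",
# so the garden hose leaks into the results.
print(search("chain", CATALOG, [("link", "chain", "strap")]))

# Option 2: keep the original synonym and add keywords to the tie-down only.
KEYWORDED = [
    dict(p, keywords=["strap", "link"]) if p["id"] == "tie-down" else p
    for p in CATALOG
]
print(search("chain", KEYWORDED, [("link", "chain")]))
```

In this toy model the first search returns all three products, while the second returns only the chain hoist and the tie-down, because the keyword change touches a single product instead of every search that contains the term.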
Sample Use Case B – size of result set before and after synonym expansion
The following example shows how the size of the search result set after synonym expansion can differ from what the result sizes of the individual terms in the synonym list would suggest. Consider the following search results:
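When a customer searches for drill, 1723 results are returned.
A search for non-existing drill also returns 1723 results. The Query service drops non-existing from the search phrase because non-existing drill produces a null result, as the response metadata shows:
    "metaData": {
      "price": "1",
      "searchPhrase": {
        "original": "non-existing drill",
        "adjusted": "drill"
      },
      "spellcheck": []
    },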
Consider similar searches for electric drill and cordless drill, which produce 130 hits and 278 hits respectively.
Combine all these terms into one single synonym list: non-existing drill, electric drill, cordless drill. Now when searching for cordless drill, the returned result set size is 342.
Search Term | Synonyms | Size of Search Result Set |
---|---|---|
drill | none | 1723 |
non-existing drill | none | 1723 |
electric drill | none | 130 |
cordless drill | none | 278 |
electric drill | non-existing drill, electric drill, cordless drill | 342 |
One would expect the size of the synonym-expanded result to be at least the largest result size produced by any single synonym in the list, that is, 1723. Instead, the size returned is only 342. The expanded search expression is:
“non-existing drill” OR “electric drill” OR “cordless drill”
Even though non-existing drill does not return any search hits, the Query service will not auto-correct the first condition, because the remaining conditions can still generate results. So, when all three synonyms are combined, only the last two are effectively used. The final result is therefore 342 (greater than the maximum of [130, 278]).
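The arithmetic can be reproduced with plain sets. This is only an illustration: the product ids are synthetic, the OR is assumed to behave as a simple set union, and the overlap of 66 products between the two drill searches is not reported above but is inferred from the quoted counts (130 + 278 - 342 = 66).

```python
# Synthetic result sets sized to match the counts quoted in the table.
electric = set(range(0, 130))    # 130 hits for "electric drill"
cordless = set(range(64, 342))   # 278 hits for "cordless drill", 66 shared with electric
non_existing = set()             # "non-existing drill" matches nothing as a phrase


def or_query(*result_sets):
    """Combine result sets the way an OR of independent conditions would."""
    combined = set()
    for result_set in result_sets:
        combined |= result_set
    return combined


expanded = or_query(non_existing, electric, cordless)
print(len(electric), len(cordless), len(expanded))
```

Because the empty condition contributes nothing and is not auto-corrected to drill, the combined size falls between the larger individual result (278) and the sum of the two (408), and never reaches the 1723 hits of the adjusted search.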
Sample Use Case C – difference between replacement term and synonym expansion
“drum hoist => drum lifter” : “r”
This is a replacement term. When a shopper searches for drum hoist, the Query service automatically replaces this input phrase with drum lifter and uses it for an NLP-enabled term search. The result is exactly the same as if the shopper had entered drum lifter.
“drum, barrel” : “s”
This is a synonym. When drum or barrel is detected in the search phrase, the Query service expands the term, in its original place in the search phrase, to drum OR barrel. When a shopper searches for drum lifter, the final search expression will look like the following:
“( drum OR barrel ) lifter”
When both of these STAs are defined:
“drum hoist => drum lifter” : “r”
“drum, barrel” : “s”
the following searches produce the same expression:
“drum hoist” – expression will be “( drum OR barrel ) lifter”
“drum lifter” – expression will be “( drum OR barrel ) lifter”
When using replacement terms with one-to-one relationships, NLP processing will be performed on the replaced term as well.
Now suppose that a third STA is added:
“barrel, tub”: “s”
When searching for either drum hoist or drum lifter, the expression remains the same: “( drum OR barrel ) lifter”. This is because the term drum has already been expanded once into drum OR barrel, so the synonym from the third STA (barrel, tub) is ignored even though there is a matching term barrel in the final search expression.
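The behavior in this use case can be summarized as a two-step, single-pass procedure: apply the replacement first, then expand each remaining term at most once. The sketch below is a hypothetical model of that behavior, not the Query service's implementation. The function `apply_stas` and its data structures are invented for illustration, replacements are assumed to match the whole input phrase, and a term is assumed to use the first synonym entry that contains it.

```python
def apply_stas(phrase, replacements, synonyms):
    """Build a search expression from a phrase and a set of STAs.

    replacements: dict mapping an input phrase to its replacement phrase.
    synonyms: list of synonym entries, each a tuple of terms.
    """
    # Step 1: replacement terms are processed before synonyms.
    phrase = replacements.get(phrase, phrase)

    # Step 2: expand each term once. The expanded output is never scanned
    # again, so "barrel" produced by an expansion is not expanded further.
    parts = []
    for word in phrase.split():
        entry = next((e for e in synonyms if word in e), None)
        if entry:
            parts.append("( " + " OR ".join(entry) + " )")
        else:
            parts.append(word)
    return " ".join(parts)


REPLACEMENTS = {"drum hoist": "drum lifter"}
SYNONYMS = [("drum", "barrel"), ("barrel", "tub")]

print(apply_stas("drum hoist", REPLACEMENTS, SYNONYMS))   # ( drum OR barrel ) lifter
print(apply_stas("drum lifter", REPLACEMENTS, SYNONYMS))  # ( drum OR barrel ) lifter
```

Both searches yield the same expression, and tub never appears, which matches the one-direction processing described in this use case.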
For a more detailed discussion of these examples as well as the programming logic underlying STA expansion and processing, see Expanding synonyms and Search Term Associations at query time.