Hyphenated Search Term Associations (STA) Support
The HyphenatedSTA provider parses the search term and applies replacements based on Natural Language Processing (NLP). This lets search handle hyphenated words consistently, whether or not the user types the hyphens, which improves search accuracy and user experience.
Implementation Overview
- Configuration Parameter: The configuration parameter nlp.enable.hyphenated.ner controls the activation of hyphenated STA support. By default, this parameter is set to true.
- Custom Named Entity Recognition (NER): The custom Hyphenated NER classification identifies hyphenated words along with their variations. It encompasses both those with and without hyphens within the Name/Description fields in the index. These NER tags are generated during the custom_ner file creation process.
- Query Time Processing: During query processing, search terms are checked against the Custom NER to identify hyphenated words. Hyphenated terms are recognized and processed similarly to the way in which other entities such as BRAND and MANUFACTURER are handled.
- Hyphenated Term Replacement: For search terms containing hyphenated words without hyphens, appropriate replacements are added based on NER classification. This ensures consistency in search queries, regardless of the presence or absence of hyphens in the input.
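The query-time lookup and replacement steps above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: the dictionary HYPHENATED_NER is a hypothetical stand-in for the Custom NER, and the function name, the enabled argument (mirroring nlp.enable.hyphenated.ner), and the max_words limit are all assumptions made for the example.

```python
# Hypothetical stand-in for the Custom NER: maps a hyphen-free variant
# to its hyphenated form. The real tags come from the custom_ner file.
HYPHENATED_NER = {
    "energy efficient": "energy-efficient",
    "dual tube": "dual-tube",
    "handoperated": "hand-operated",
}


def find_replacements(search_term, ner_lookup, enabled=True, max_words=3):
    """Return (matched text, hyphenated replacement) pairs for a search term.

    `enabled` plays the role of the nlp.enable.hyphenated.ner parameter;
    `max_words` is an assumed cap on how many words one variant may span.
    """
    if not enabled:
        return []
    tokens = search_term.lower().split()
    replacements = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so multi-word variants win.
        for size in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in ner_lookup:
                replacements.append((candidate, ner_lookup[candidate]))
                i += size
                break
        else:
            # No variant starts at this token; move on.
            i += 1
    return replacements


print(find_replacements("Energy efficient dual tube bulb", HYPHENATED_NER))
# [('energy efficient', 'energy-efficient'), ('dual tube', 'dual-tube')]
```

Matching the longest candidate first is a design choice of this sketch; the source does not say how overlapping matches are resolved.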
Consider the following example to illustrate the implementation:
GET http://hostname:port/search/resources/api/v2/products?storeId=2&searchTerm=Energy efficient dual tube bulb
Response Metadata:
"metaData": {
    ...
    "searchExecution": [
        {
            "searchTerm": "Energy efficient dual tube bulb",
            "sta": "bulb : [bulb OR led]",
            "hyphenatedSta": "energy efficient --> (energy-efficient) | dual tube --> (dual-tube)",
            "searchRule": {},
            "nlp": {
                "pos": "NOUN --> [energy-efficient, dual-tube, bulb] (boosted by 500.0)"
            },
            "customFields": {}
        }
    ]
}
In this example:
- The search term "Energy efficient dual tube bulb" contains hyphenated words written without their hyphens.
- The hyphenatedSta field lists the hyphenated replacement for each detected term: "energy efficient" becomes "energy-efficient" and "dual tube" becomes "dual-tube".
- The NLP analysis identifies the relevant nouns, including the hyphenated replacements, and boosts them (by 500.0 in this response).
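If a client needs to read the replacements back out of the response, the hyphenatedSta string can be split into pairs. The sketch below assumes the format seen in the single sample above (entries separated by "|", each written as "original --> (replacement)"); the format is not formally specified here, and the function name is hypothetical.

```python
def parse_hyphenated_sta(value):
    """Split a hyphenatedSta string into {original: replacement} pairs.

    Assumes the layout of the sample response:
    "a b --> (a-b) | c d --> (c-d)". Malformed entries are skipped.
    """
    pairs = {}
    for entry in value.split("|"):
        original, arrow, replacement = entry.partition("-->")
        if not arrow:
            continue  # no "-->" in this entry
        pairs[original.strip()] = replacement.strip().strip("()")
    return pairs


# Trimmed copy of the sample response metadata shown above.
meta_data = {
    "searchExecution": [
        {
            "searchTerm": "Energy efficient dual tube bulb",
            "hyphenatedSta": "energy efficient --> (energy-efficient) | dual tube --> (dual-tube)",
        }
    ]
}

for execution in meta_data["searchExecution"]:
    print(parse_hyphenated_sta(execution["hyphenatedSta"]))
```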
Supported Hyphenated Terms
The implementation recognizes hyphenated terms entered in two hyphen-free formats:
- With a space in place of the hyphen (e.g., "fixed height").
- With the parts joined together, without spaces (e.g., "fixedheight").
All variations of a hyphenated term are therefore supported (e.g., "hand-operated", "hand operated", "handoperated").
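To make the three variations concrete, the following sketch derives them from a hyphenated term. It only illustrates the naming of the forms; how the custom_ner file actually generates variations is not described here, and the function name and dictionary keys are hypothetical.

```python
def hyphen_variants(term):
    """Return the hyphenated, spaced, and joined forms of a hyphenated term."""
    parts = [part for part in term.lower().split("-") if part]
    return {
        "hyphenated": "-".join(parts),  # e.g. hand-operated
        "spaced": " ".join(parts),      # e.g. hand operated
        "joined": "".join(parts),       # e.g. handoperated
    }


print(hyphen_variants("hand-operated"))
```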
Limitations
- The current solution targets only the Name and Description fields and does not extend to other fields that contain hyphens.
- The current implementation supports only the ESite Indexing model; the CAS model is not supported.