Managing synonyms, stop words, and Search Term Associations
ZooKeeper manages several kinds of customizable term lists that the search function uses. Stop words are removed from the query before it is processed. Synonyms and Search Term Associations (STAs) implement catalog substitutions for query terms in slightly different ways. Each list is accessible in ZooKeeper, and you can change stop words and synonyms there directly.
Introduction
- The stop words list contains the words that are filtered out of the search query before Natural Language Processing is performed on it. This list usually contains the most common words in a language (such as "the" for English).
- Synonyms and Search Term Associations widen the scope of search results by adding search terms to the submitted query. The results include matches for the submitted search term plus matches for the defined synonyms or STAs. Although both are processed in the same way, they are separate lists that are generated by different mechanisms. The STA mechanism provides a backwards-compatible approach in which the synonyms are loaded directly from the database. The synonyms mechanism uses ZooKeeper and is intended to provide more options for managing associations in the Elasticsearch environment.
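To make the two roles concrete, here is a minimal Python sketch of the idea: stop words are dropped first, then each remaining term is widened with its associated terms. This is an illustration only, not the search engine's actual processing; the function name `preprocess` and the sample lists are hypothetical.

```python
def preprocess(query, stop_words, synonym_groups):
    """Drop stop words, then add the associated terms of each remaining term."""
    # Step 1: filter out stop words (case-insensitive, whitespace tokenization).
    terms = [t for t in query.lower().split() if t not in stop_words]

    # Step 2: widen each term with every group it belongs to, keeping order
    # and skipping duplicates.
    expanded = []
    for term in terms:
        candidates = [term]
        for group in synonym_groups:
            if term in group:
                candidates.extend(group)
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Hypothetical sample data in the spirit of the lists described above.
stop_words = {"the", "and"}
synonym_groups = [["couch", "sofa"], ["driveway", "road", "street"]]

print(preprocess("the couch and the road", stop_words, synonym_groups))
```

In the real system this work is done by the search engine during query analysis; the sketch only shows why stop words shrink a query while synonyms and STAs broaden it.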
These lists are stored in JSON format in ZooKeeper, in language-specific dictionaries. The following sections describe the structure of these dictionaries, and how you can interact with them in ZooKeeper using the REST API.
The Stop Words dictionary
You interact with the Stop Words dictionary using REST calls. The permitted calls are GET, POST, and PATCH. For example, in the case of a GET call, the response body contains a JSON-formatted set of the terms that you requested. There is no explicit DELETE call; however, you can do a POST with empty content to delete an item.
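The GET, POST, and PATCH calls can be sketched with Python's standard library as follows. The URL pattern and port are the ones documented in this section. The host name, the store ID, the `application/json` content type, and the assumption that a POST or PATCH body has the same shape as a GET response are illustrative and not confirmed by this document. Authentication is not covered.

```python
import json
import urllib.parse
import urllib.request

# URL pattern taken from this section; {host} is a placeholder.
BASE = "http://{host}:30920/search/resources/api/v2/configuration"


def config_url(host, environment_type, store_id, list_name, locale="en_US"):
    """Build the configuration URL, for example for list_name='stopwords'."""
    node_name = f"{environment_type}_{store_id}_product_{list_name}"
    query = urllib.parse.urlencode({"nodeName": node_name, "locale": locale})
    return BASE.format(host=host) + "?" + query


def build_request(url, method="GET", terms=None):
    """Prepare a request without sending it.

    Assumption: POST/PATCH bodies are JSON dictionaries such as {"the": ""}.
    A POST with terms=None carries an empty body.
    """
    data = None
    headers = {}
    if method in ("POST", "PATCH"):
        data = json.dumps(terms).encode("utf-8") if terms is not None else b""
        headers["Content-Type"] = "application/json"  # assumed content type
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a prepared request and decode a JSON response body, if any."""
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else {}
```

For example, `send(build_request(config_url("search.example.com", "auth", "1", "stopwords")))` would fetch the stop words for a hypothetical store 1 in the authoring environment, and `build_request(url, "POST")` prepares the empty-content POST mentioned above.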
The endpoint for the en_US stop words dictionary is:
http://data_environment_hostname:30920/search/resources/api/v2/configuration?nodeName=environmentType_storeID_product_stopwords&locale=en_US
where environmentType is either auth or live. A GET call to this endpoint returns a dictionary such as the following:
{
"the": "",
"and": ""
}
The general form of the endpoint is:
http://search_server:30920/search/resources/api/v2/configuration?nodeName=nodeName&locale=en_US
where nodeName is the flattened namespace for the index.
Updating synonyms and replacement terms
You can also update synonyms and replacement terms using the query service configuration REST API. The permitted calls are GET, POST, and PATCH. For example, in the case of a GET call, the response body contains a JSON-formatted set of the terms that you requested. There is no explicit DELETE call; however, you can do a POST with empty content to delete an item.
http://data_environment_hostname:30920/search/resources/api/v2/configuration?nodeName=environmentType_storeID_product_sta&locale=en_US
The reply contains a set of synonyms, as in the following example:
{
"couch, sofa": "" ,
"coff => coffee": "" ,
"driveway, road, street": "" ,
...
}
Synonym processing uses the following settings:
- expand = true
- lenient = false
For information on how to tune synonym processing, see Synonym-related configurations.
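The synonym entries shown above use two rule forms: a comma-separated list of equivalent terms ("couch, sofa") and an explicit mapping with an arrow ("coff => coffee"). The following Python sketch shows how such rules are conventionally interpreted, and what the expand setting changes, assuming the dictionary follows the usual Solr and Elasticsearch synonym rule convention. The function names are hypothetical, and the lenient setting (which concerns how unparsable rules are handled) is not modeled.

```python
def parse_rule(rule, expand=True):
    """Return {input_term: [output_terms]} for one synonym rule string."""
    if "=>" in rule:
        # Explicit mapping: terms on the left are replaced by terms on the right.
        left, right = rule.split("=>", 1)
        inputs = [t.strip() for t in left.split(",") if t.strip()]
        outputs = [t.strip() for t in right.split(",") if t.strip()]
    else:
        # Equivalent terms: with expand=True each term maps to the whole group;
        # with expand=False every term is reduced to the first one.
        inputs = [t.strip() for t in rule.split(",") if t.strip()]
        outputs = inputs if expand else inputs[:1]
    return {term: list(outputs) for term in inputs}


def build_mapping(dictionary, expand=True):
    """Combine all rules (the dictionary keys) into one lookup table."""
    mapping = {}
    for rule in dictionary:
        mapping.update(parse_rule(rule, expand))
    return mapping


# Sample dictionary in the same shape as the example response above.
dictionary = {
    "couch, sofa": "",
    "coff => coffee": "",
    "driveway, road, street": "",
}
mapping = build_mapping(dictionary)
print(mapping["couch"])
print(mapping["coff"])
```

With expand = true, a search for "couch" also covers "sofa" and the reverse, while "coff" is replaced by "coffee" in one direction only.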
Working with STAs
Search Term Associations are functionally the same as synonyms and are stored in the same format in ZooKeeper. Elasticsearch treats STAs the same way that Solr does; for more information, see Search term associations. Elasticsearch STAs are generated using the same mechanism as in the Solr search engine, rather than by the REST call mechanism used for synonyms and stop words. When an STA is saved in Management Center for HCL Commerce, a near-real-time update is triggered that overwrites the existing STA list in ZooKeeper with a new list from the database.
You can issue a GET call to verify the changes. The response resembles the following example:
{
"couch, sofa": ""
...
}
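As a small aid for the verification step, the following Python sketch checks whether a given association appears in the body returned by the GET call. It assumes the response is a JSON dictionary whose keys are the rules, as in the example above; the helper names are hypothetical, and whitespace around commas and arrows is normalized so that "couch,sofa" and "couch, sofa" compare equal.

```python
import json


def normalize(rule):
    """Normalize spacing around commas and '=>' so rules compare reliably."""
    sides = [
        ", ".join(term.strip() for term in side.split(","))
        for side in rule.split("=>")
    ]
    return " => ".join(sides)


def sta_present(response_body, expected_rule):
    """Return True if expected_rule is one of the keys in the GET response."""
    stored = json.loads(response_body)
    return normalize(expected_rule) in {normalize(key) for key in stored}


# Sample response body in the documented shape (hypothetical content).
body = '{"couch, sofa": "", "coff => coffee": ""}'
print(sta_present(body, "couch,sofa"))
print(sta_present(body, "chair, seat"))
```

Because the STA list in ZooKeeper is overwritten from the database, a check like this is only meaningful after the near-real-time update has completed.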