Manually adding languages when using Basic NLP
About this task
After an upgrade to HCL Commerce Search Version 9.1.14.1, verify the integrity of your Ingest node data using the /configuration REST endpoint:
GET http://dataQueryHost:dataQueryPort/search/resources/api/v2/configuration?nodeName=ingest&envType=auth
The Ingest node must include configuration properties and all default values for the following stemmer and stopword languages.
{
  "name": "stemmer.language",
  "value": "{\"ar_EG\": \"Arabic\", \"da_DK\": \"Danish\", \"de_DE\": \"German\", \"en_US\": \"English\", \"es_ES\": \"Spanish\", \"fi_FI\": \"Finnish\", \"fr_FR\": \"French\", \"hu_HU\": \"Hungarian\", \"it_IT\": \"Italian\", \"nb_no\": \"Norwegian\", \"nl_NL\": \"Dutch\", \"pt_BR\": \"Portuguese\", \"ro_RO\": \"Romanian\", \"ru_RU\": \"Russian\", \"sv_SE\": \"Swedish\", \"tr_TR\": \"Turkish\"}"
},
{
  "name": "stopword.language",
  "value": "{ \"ar\": \"_arabic_\", \"da\": \"_danish_\", \"de\": \"_german_\", \"el\": \"_greek_\", \"en\": \"_english_\", \"es\": \"_spanish_\", \"fi\": \"_finnish_\", \"fr\": \"_french_\", \"hu\": \"_hungarian_\", \"it\": \"_italian_\", \"ja\": \"_cjk_\", \"ko\": \"_cjk_\", \"nb\": \"_norwegian_\", \"nl\": \"_dutch_\", \"pt\": \"_portuguese_\", \"ro\": \"_romanian_\", \"ru\": \"_russian_\", \"sv\": \"_swedish_\", \"tr\": \"_turkish_\", \"zh\": \"_cjk_\"}"
}
If the stemmer.language or stopword.language properties are not present in the Ingest node, add them using the /configuration REST endpoint, as described in the following procedure.
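The check described in this section can be sketched in Python. This is a minimal sketch, not an HCL-provided tool: it assumes that the GET response is JSON in which each property appears as an object with "name" and "value" keys, as in the fragment shown earlier. Because the exact nesting of the response is not described here, the helper searches the whole structure recursively. The function names find_properties and missing_properties, and the sample data, are illustrative only.

```python
import json

# The two properties that the Ingest node is expected to define.
REQUIRED = ("stemmer.language", "stopword.language")


def find_properties(node, wanted=REQUIRED):
    """Recursively collect {"name": ..., "value": ...} entries whose name is in wanted."""
    found = {}
    if isinstance(node, dict):
        if node.get("name") in wanted and "value" in node:
            found[node["name"]] = node["value"]
        for child in node.values():
            found.update(find_properties(child, wanted))
    elif isinstance(node, list):
        for child in node:
            found.update(find_properties(child, wanted))
    return found


def missing_properties(config, wanted=REQUIRED):
    """Return the required property names that do not appear anywhere in config."""
    found = find_properties(config, wanted)
    return [name for name in wanted if name not in found]


# Hand-written stand-in for a parsed response (assumed layout, not real output).
sample = {"global": {"connector": [{"name": "attribute", "property": [
    {"name": "stemmer.language", "value": json.dumps({"en_US": "English"})}
]}]}}

print(missing_properties(sample))  # prints ['stopword.language']
```

In practice, the parsed body of the GET request would be passed in place of sample. An empty result means that both properties are present; it does not confirm that every default language is listed in their values.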
Procedure
- If you need to add any existing or new languages to the two properties, append them to the existing values in valid JSON format. Use the PATCH method on the following REST endpoint:
PATCH http://dataQueryHost:dataQueryPort/search/resources/api/v2/configuration?nodeName=ingest&envType=auth
To restore the full set of eighteen languages supported in Basic NLP, use the following body for the request:
{
  "global": {
    "connector": [
      {
        "name": "attribute",
        "property": [
          {
            "name": "stemmer.language",
            "value": " {\"ar_EG\": \"Arabic\", \"da_DK\": \"Danish\", \"de_DE\": \"German\", \"en_US\": \"English\", \"es_ES\": \"Spanish\", \"fi_FI\": \"Finnish\", \"fr_FR\": \"French\", \"hu_HU\": \"Hungarian\", \"it_IT\": \"Italian\", \"nb_no\": \"Norwegian\", \"nl_NL\": \"Dutch\", \"pt_BR\": \"Portuguese\", \"ro_RO\": \"Romanian\", \"ru_RU\": \"Russian\", \"sv_SE\": \"Swedish\", \"tr_TR\": \"Turkish\"} "
          },
          {
            "name": "stopword.language",
            "value": " { \"ar\": \"_arabic_\", \"da\": \"_danish_\", \"de\": \"_german_\", \"el\": \"_greek_\", \"en\": \"_english_\", \"es\": \"_spanish_\", \"fi\": \"_finnish_\", \"fr\": \"_french_\", \"hu\": \"_hungarian_\", \"it\": \"_italian_\", \"ja\": \"_cjk_\", \"ko\": \"_cjk_\", \"nb\": \"_norwegian_\", \"nl\": \"_dutch_\", \"pt\": \"_portuguese_\", \"ro\": \"_romanian_\", \"ru\": \"_russian_\", \"sv\": \"_swedish_\", \"tr\": \"_turkish_\", \"zh\": \"_cjk_\"} "
          }
        ]
      }
    ]
  }
}
- Verify the results using the query described in About this task. If you add a stemmer to the property list for a language that Elasticsearch does not support, Elasticsearch generates an exception at ingest time. The following screenshot provides an example.
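The append step at the start of this procedure is easy to get wrong by hand, because each property value is itself a JSON document stored as an escaped string. The following Python sketch shows one way to append an entry and build the request body. It assumes the body layout shown in the procedure; the helper names add_language and build_patch_body are hypothetical, and the short starting value is sample data rather than a real configuration. The sketch only builds the body and does not send it.

```python
import json


def add_language(current_value, code, name):
    """Return the property value string with one more code -> name entry."""
    # The stored value is a JSON object serialized as a string.
    mapping = json.loads(current_value) if current_value.strip() else {}
    mapping[code] = name
    return json.dumps(mapping)


def build_patch_body(properties):
    """Wrap name -> value-string pairs in the body layout used by the procedure."""
    return {"global": {"connector": [{"name": "attribute", "property": [
        {"name": prop_name, "value": value} for prop_name, value in properties.items()
    ]}]}}


# Sample starting value (illustrative); French is restored from the default list.
current = '{"en_US": "English", "de_DE": "German"}'
updated = add_language(current, "fr_FR", "French")

# json.dumps escapes the inner quotes, so the nested value stays a valid string.
body = json.dumps(build_patch_body({"stemmer.language": updated}), indent=2)
print(body)
```

The resulting string can be sent as the PATCH request body with any HTTP client. Authentication and request headers are deployment-specific and are not covered by this sketch.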