analyzer index parameter

When you create a bts index, you can include the analyzer index parameter to set the default analyzer and to assign specific analyzers to individual fields.

The analyzer index parameter

analyzer="[field:]analyzer"
analyzer="([field:]analyzer[,[field:]analyzer]...)"
analyzer="file:directory/filename"
analyzer="table:table.column"

Table 1. Options for the analyzer index parameter
Element Description
analyzer The name of the analyzer. Possible values:
  • standard: Default. Processes alphabetic characters, special characters, and numbers with stopwords.
  • alnum: Processes strings of numbers and characters into tokens.
  • alnum+characters: Includes the specified characters in tokens. List characters without spaces. The maximum length of the character list is 128 bytes.
  • cjk: Processes Chinese, Japanese, and Korean text. Ignores surrogates.
  • cjk.ws: Processes Chinese, Japanese, and Korean text. Processes surrogates.
  • esoundex: Processes text into pronunciation codes.
  • keyword: Processes input text as a single token and adds trailing white spaces as necessary for fixed-length data type columns.
  • keyword.rt: Processes input text as a single token and removes trailing white spaces.
  • simple: Processes alphabetic characters only. Ignores stopword lists.
  • snowball: Processes text into stem words.
  • snowball.language: Processes text into stem words in the specified language.
  • soundex: Processes text into pronunciation codes.
  • stopword: Processes alphabetic characters only, except stopwords.
  • udr.function_name: Creates tokens according to the specified user-defined analyzer.
  • whitespace: Creates tokens that are based on white space only.
column The name of the column that contains analyzer assignments.
directory The path for the analyzer assignments file.
field The XML tag, path, or the column name that is indexed.
filename The name of the file that contains analyzer assignments.
table The name of the table that contains analyzer assignments.
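
As an illustration of the alnum+characters form listed in Table 1, the following sketch creates a bts index whose tokens keep the underscore and at-sign characters in addition to letters and numbers. The index name, the sbspace name, and the choice of characters are hypothetical and are meant only to show where the character list goes:

```sql
-- Hypothetical example: alphanumeric tokens that also include _ and @.
-- The characters follow the plus sign with no spaces between them.
CREATE INDEX brands_alnum_idx ON products (brands bts_char_ops)
USING bts (analyzer="alnum+_@") IN sbsp1;
```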

Usage

To use the same analyzer for all fields or columns that are indexed when you create the bts index, include the analyzer name without a field name. To use more than one analyzer, enclose multiple analyzer and field pairs in parentheses. To use one analyzer for most fields but other analyzers for specific fields, list the first analyzer without a field and the other analyzers with fields. The first analyzer is used for all fields except the ones that are explicitly listed with analyzer assignments.
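
The default-plus-override form might look like the following sketch. It assumes a boats table with an xml_data column that contains skipper and boatname tags; the index name and sbspace name are hypothetical. The whitespace analyzer is listed first without a field, so it applies to every field that has no explicit assignment (here, boatname), while skipper uses the soundex analyzer:

```sql
-- Hypothetical example: whitespace is the default analyzer;
-- only the skipper field is assigned a different analyzer.
CREATE INDEX boats_default_bts
ON boats(xml_data bts_lvarchar_ops)
USING bts
(
xmltags="(skipper,boatname)",
analyzer="(whitespace,skipper:soundex)"
)
IN bts_sbspace;
```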

You can specify the list of analyzers by field in a table column or in a file. The file or table must be readable by the user who creates the index. Separate the field name and analyzer pairs in the file or table by commas, white spaces, new lines, or a combination of those separators. The file or table becomes read-only when the index is created. If you want to add or change analyzer assignments, you must drop and re-create the index.
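
A file-based assignment could be sketched as follows. The file path, file contents, index name, and sbspace name are all hypothetical; the only parts taken from the syntax are the file: prefix and the directory/filename form. The file must exist and be readable before the index is created:

```sql
-- Hypothetical file /tmp/boat_analyzers.txt with the contents:
--   skipper:soundex, boatname:snowball
-- The pairs can be separated by commas, white spaces, or new lines.
CREATE INDEX boats_file_bts
ON boats(xml_data bts_lvarchar_ops)
USING bts
(
xmltags="(skipper,boatname)",
analyzer="file:/tmp/boat_analyzers.txt"
)
IN bts_sbspace;
```

To change an assignment later, edit the file and then drop and re-create the index, because the assignments are fixed when the index is created.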

Examples

The following example creates a bts index on one column and uses the CJK analyzer:

CREATE INDEX desc_idx ON products (brands bts_char_ops)
USING bts (analyzer="cjk") IN sbsp1;

The following example creates a bts index on two XML fields and uses a different analyzer for each field:

CREATE INDEX boats_bts
ON boats(xml_data bts_lvarchar_ops)
USING bts
(
xmltags="(skipper,boatname)",
analyzer="(skipper:soundex,boatname:snowball)"
)
IN bts_sbspace;