Whitespace analyzer
The Whitespace analyzer splits text into tokens at whitespace. All characters between whitespace characters are indexed without alteration.
The Whitespace analyzer processes text characters in the following ways:
- Ignores stopword lists. All words are indexed.
- Does not change letter case.
- Indexes numbers and special characters.
Because the Whitespace analyzer does not support stopwords, omit the word TO from range queries.
Examples
In the following examples, the input text is shown on the first line and the resulting indexed tokens, which are surrounded by square brackets, are shown on the second line.
In the following example, all words are indexed exactly as they are:
The Quick Brown Fox Jumped Over The Lazy Dog
[The] [Quick] [Brown] [Fox] [Jumped] [Over] [The] [Lazy] [Dog]
In the following example, all numbers and special characters are indexed:
-12 -.345 -898.2 -56. -
[-12] [-.345] [-898.2] [-56.] [-]
In the following example, the e-mail address is indexed as one token:
xyz@example.com
[xyz@example.com]
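The behavior shown in the examples above can be approximated in a few lines of Python. This is only an illustrative sketch, not the analyzer's actual implementation: the function names whitespace_tokenize and show are hypothetical, and Python's definition of whitespace may differ from the one the analyzer uses.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Split text at runs of whitespace and return the tokens unchanged.

    Hypothetical helper for illustration only. No stopword removal,
    no case folding, and no stripping of digits or punctuation.
    """
    # str.split() with no arguments splits on any run of whitespace and
    # discards leading and trailing whitespace, so no empty tokens appear.
    return text.split()


def show(tokens: list[str]) -> str:
    """Render tokens in the square-bracket notation used in the examples."""
    return " ".join(f"[{token}]" for token in tokens)


if __name__ == "__main__":
    samples = (
        "The Quick Brown Fox Jumped Over The Lazy Dog",
        "-12 -.345 -898.2 -56. -",
        "xyz@example.com",
    )
    for sample in samples:
        print(sample)
        print(show(whitespace_tokenize(sample)))
```

Because the sketch never inspects the content of a token, words that other analyzers might drop as stopwords (such as The) are kept, letter case is preserved, and tokens made up of digits or punctuation pass through untouched.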