Document scoring for fuzzy searches
When you perform a fuzzy search, some of the rows that the text search engine returns might satisfy search criteria better than others. To determine the degree of similarity between your clue and each of the rows returned, you can instruct the search engine to assign a value, called a document score, to each of the rows.
- It always scores exact matches slightly higher than pattern matches.
- It scores all pattern matches equally, even if some matches appear to approximate the clue better than others.
Document scores vary from 0 to 100, with 0 indicating no match and 100 indicating an exact match. Values 0 - 100 indicate approximate matches; the higher the value, the closer the match. A null value indicates that the row was not ranked. This could happen if the row was returned because of a non-etx_contains() operator in a WHERE clause that contains an OR predicate.
To access document score information, use a statement local variable (SLV) as the optional third parameter to the etx_contains() operator. The data type of the SLV is etx_ReturnType, a row type derived by HCL OneDB™ that consists of two fields. The scoring information is contained in the score field.
The score field contains a numeric value that indicates the relevance of a returned document to the search criteria, compared to that of other indexed records. The higher the document score value, the more closely the document matches the criteria.
SELECT rc.score, id, description FROM videos
WHERE etx_contains(description,
Row('multimedia' ,
'PATTERN_TRANS & PATTERN_SUBS'), rc # etx_ReturnType)
ORDER BY 1;
The SLV has a scope that is limited to the statement in which you use it. It is a way for the Excalibur text search engine to send back information about the search it just performed to the etx_contains() operator that called it.
Although the data type of the SLV is always etx_ReturnType, you must still explicitly specify its type when you use it in the etx_contains() operator, as shown in the example. The example also shows how you can use the ORDER BY clause to instruct the database server to rank the rows it returns by the score field of the SLV.
For more information about the etx_ReturnType data type, its two fields, and how to use the information contained in the fields, see The etx_ReturnType data type. For more general information about SLVs, see the HCL OneDB Guide to SQL: Syntax.