When does the query optimizer use an R-tree index?
The query optimizer can choose to use an R-tree index
when it evaluates a query if the following statements are true:
- A strategy function of the operator class is used in the WHERE clause of the query.
- One or more arguments of the strategy function are table columns with R-tree indexes associated with the operator class.
- The data type of the arguments of the strategy function specified in the WHERE clause of the query are compatible with the signature of the strategy function. The query optimizer might cast one or both arguments to other data types in an effort to make the arguments match the signature of the strategy function.
For example, the following query can use an R-tree index:
SELECT * FROM circle_tab
WHERE Contains ( circles, 'circle(-5,-10, 20)'::MyCircle );
The
query optimizer can use the R-tree index in the preceding example
for the following reasons
- The Contains function, specified in the WHERE clause of the query, is a strategy function of the MyShape_ops operator class.
- The circles column, specified in the Contains function in the WHERE clause of the query, is of data type MyCircle and has an R-tree index built on it.
- When the cast from a string data type to the MyCircle data type is applied to the second argument, the cast from MyCircle to MyShape can be internally applied to both arguments. The result of these casts matches the signature of the Contains strategy function.
The query optimizer might sometimes decide not to use
an R-tree index, even when it could be used. Consider the following
query:
SELECT * FROM circle_tab
WHERE Contains (circles, 'circle(-5,-10, 20)'::MyCircle)
AND id = 99;
If a B-tree index is on the id column, the query optimizer might use the B-tree index instead of the R-tree index. It might even decide to perform a sequential scan for a small table to avoid the overhead of using either index. The optimizer chooses which index to use, or whether to use an index at all, by comparing the cost of each option. Cost is an estimate of the number of pages that need to be accessed. The cost of using an R-tree index is calculated by using the selectivity and per-row cost functions provided by the DataBlade® module or extension. See Selectivity and cost functions for information about how to include selectivity and per-row cost functions in a DataBlade® module.
The
following query retrieves cities with names that start with San that
are located within the specified polygon. The optimizer can choose
either a B-tree index (on name), an R-tree index (on obj)
or a sequential table scan:
SELECT location FROM cities WHERE
name LIKE San% AND
Intersect(obj, 'GeoPolygon((((-49,45), (34, 48),
(3, -45), (0, -48))), ANY, ANY)');
To determine
which index was actually used, use SET EXPLAIN ON.
Important: The
query optimizer also uses statistical data on the indexed column to
decide whether to use an R-tree index. This statistical data must
be kept up-to-date and correct for the query optimizer to make a good
decision. Use the UPDATE STATISTICS command to update the statistics
for the indexed column. For more information on statistics, see Manage databases that use the R-tree secondary access method.