Filtered columns in heavy tables
A table is heavy if it occupies too many pages to fit entirely in the buffer cache. Operations on light tables are generally faster than operations on heavy tables, because light tables can be retained entirely in memory. If a column is often used to filter the rows of a heavy table, consider placing an index on it. The optimizer can use the index to select only the wanted rows and avoid a sequential scan of the entire table.
Suppose you have a table that contains a large mailing list. If you find that a postal-code column is often used to filter a subset of rows, consider putting an index on that column.
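For example, the index and a typical filtered query might look like the following sketch. The table, column, and index names (mail_list, zip_code, ix_mail_zip) are illustrative assumptions, not names from this document:

```sql
-- Hypothetical mailing-list table; index the postal-code column
-- that is frequently used in filters.
CREATE INDEX ix_mail_zip ON mail_list (zip_code);

-- A query that filters on the indexed column; the optimizer can use
-- ix_mail_zip to fetch only the matching rows instead of scanning
-- the whole table.
SELECT name, street, city
  FROM mail_list
 WHERE zip_code = '94062';
```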
This strategy yields a net savings of time only when the selectivity of the column is high; that is, when only a small fraction of rows holds any one indexed value. Nonsequential access through an index requires several more disk I/O operations than sequential access does, so if a filter expression on the column passes more than a fourth of the rows, the database server might as well read the table sequentially. In general, indexing a filter column saves time in the following cases (a sketch for checking them against your data follows the list):
- The column is used in filter expressions in many queries or in slow queries.
- The column contains at least 100 unique values.
- Most column values appear in fewer than 10 percent of the rows.
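You can check these criteria directly against the data before creating the index. Assuming the same hypothetical mail_list table and zip_code column as above, queries such as the following sketch report the distinct-value count and the per-value row counts needed to judge selectivity:

```sql
-- How many distinct values does the candidate column hold,
-- and how many rows are there in total?
SELECT COUNT(DISTINCT zip_code) AS distinct_values,
       COUNT(*)                 AS total_rows
  FROM mail_list;

-- Row count per value, most frequent first; each count should stay
-- well under 10 percent of total_rows for the index to pay off.
SELECT zip_code, COUNT(*) AS occurrences
  FROM mail_list
 GROUP BY zip_code
 ORDER BY 2 DESC;
```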