Design the user-defined statistics

Before you begin to code a statcollect() function for a particular user-defined data type, you need to decide what it means to collect statistics on this data type.

For example, consider the following issues:
  • Do the values of the user-defined type have some ordering?

    To group the values into bins of related values, the data must have some kind of implied sequence. A common use of statistics information is within a selectivity function for a query filter such as “less than” or “greater than”. If the values of the user-defined data type have no ordering, they cannot logically be used in such filters. For more information, see Query selectivity.

  • How does the distribution handle SQL NULL values?

    For example, the distribution can ignore NULL values or it can aggregate them. Whichever approach you choose, the handling of NULL values must make sense for the user-defined data type.
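
The two design questions above can be made concrete with a small sketch in plain C. It does not use the DataBlade API and is not the statcollect() interface. The names sample_t, distribution_t, MAX_BINS, cmp_double(), and build_distribution() are hypothetical and chosen only for illustration. The sketch assumes that the type can be ordered by a comparison function, that NULL values are counted separately rather than placed in a bin, and that bins hold roughly equal numbers of values. Your own type might call for different choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BINS 16  /* hypothetical limit, chosen for this sketch */

/* Hypothetical stand-in for one value of a user-defined type. */
typedef struct {
    int    is_null;  /* nonzero when the row holds SQL NULL */
    double value;    /* meaningful only when is_null is 0 */
} sample_t;

/* Hypothetical distribution: NULLs are counted, non-NULLs are binned. */
typedef struct {
    size_t null_count;       /* NULL values seen (aggregated, not binned) */
    size_t value_count;      /* non-NULL values placed in bins */
    size_t nbins;            /* bins actually filled */
    double upper[MAX_BINS];  /* largest value in each bin */
    size_t count[MAX_BINS];  /* number of values in each bin */
} distribution_t;

/* The ordering of the type: this is what makes binning possible. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Returns 0 on success, -1 if memory cannot be allocated. */
int build_distribution(const sample_t *rows, size_t nrows,
                       size_t want_bins, distribution_t *out)
{
    assert(want_bins > 0 && want_bins <= MAX_BINS);

    double *vals = malloc((nrows ? nrows : 1) * sizeof *vals);
    if (vals == NULL)
        return -1;

    /* Separate NULL values from the values that can be ordered. */
    size_t n = 0;
    out->null_count = 0;
    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].is_null)
            out->null_count++;
        else
            vals[n++] = rows[i].value;
    }

    /* Sort by the type's ordering, then cut into equal-sized bins. */
    qsort(vals, n, sizeof *vals, cmp_double);
    out->value_count = n;
    out->nbins = (n < want_bins) ? n : want_bins;
    for (size_t b = 0; b < out->nbins; b++) {
        size_t lo = b * n / out->nbins;        /* first index in bin b */
        size_t hi = (b + 1) * n / out->nbins;  /* one past the last index */
        out->count[b] = hi - lo;
        out->upper[b] = vals[hi - 1];
    }

    free(vals);
    return 0;
}
```

A selectivity function for a “less than” filter could then estimate the fraction of qualifying rows by summing the counts of the bins whose upper bounds fall below the filter constant. If the type had no comparison function such as cmp_double(), neither the sort nor the bin boundaries would be meaningful.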