Example of a join with column filters

The presence of a column filter can change the query plan. A column filter is a WHERE expression that reduces the number of rows that a table contributes to a join.

The following example shows the query described in Example of query-plan execution with a filter added:

SELECT C.customer_num, O.order_num
   FROM customer C, orders O, items I
   WHERE C.customer_num = O.customer_num
      AND O.order_num = I.order_num
      AND O.paid_date IS NULL

The expression O.paid_date IS NULL filters out some rows, reducing the number of rows that are used from the orders table. Consider a plan that starts by reading from orders. Query plan that uses a column filter displays this sample plan in pseudocode.

Figure 1: Query plan that uses a column filter

for each row in the orders table do:
   read the row into O
   if O.paid_date is null then
      for each row in the customer table do:
         read the row into C
         if O.customer_num = C.customer_num then
            for each row in the items table do:
               read the row into I
               if I.order_num = O.order_num then
                  accept row and return to user
               end if
            end for
         end if
      end for
   end if
end for

Let pdnull represent the number of rows in orders that pass the filter. It is the value of COUNT(*) that results from the following query:

SELECT COUNT(*) FROM orders WHERE paid_date IS NULL

If one customer exists for every order, the plan in Query plan that uses a column filter reads the following rows:

All rows of the orders table once
All rows of the customer table, pdnull times
All rows of the items table, pdnull times

The alternative query plan in pseudocode shows an alternative execution plan that reads from the customer table first.

Figure 2: The alternative query plan in pseudocode

for each row in the customer table do:
   read the row into C
   for each row in the orders table do:
      read the row into O
      if O.paid_date is null and
           O.customer_num = C.customer_num then
         for each row in the items table do:
            read the row into I
            if I.order_num = O.order_num then
               accept row and return to user
            end if
         end for
      end if
   end for

Because the filter is not applied in the first step that The alternative query plan in pseudocode shows, this plan reads the following rows:

All rows of the customer table once
All rows of the orders table once for every row of customer
All rows of the items table, pdnull times

The query plans in Query plan that uses a column filter and The alternative query plan in pseudocode produce the same output in a different sequence. They differ in that one reads a table pdnull times, and the other reads a table SELECT COUNT(*) FROM customer times. By choosing the appropriate plan, the optimizer can save thousands of disk accesses in a real application.