Self-joins

A join does not always have to involve two different tables. You can join a table to itself, creating a self-join. Joining a table to itself can be useful when you want to compare values in a column to other values in the same column.

To create a self-join, list a table twice in the FROM clause, and assign it a different alias each time. Use the aliases to refer to the table in the Projection and WHERE clauses as if it were two separate tables. (Aliases in SELECT statements are discussed in Aliases and in the HCL OneDB™ Guide to SQL: Syntax.)

Just as in joins between tables, you can use arithmetic expressions in self-joins. You can test for null values, and you can use an ORDER BY clause to sort the values in a specified column in ascending or descending order.

The following query finds pairs of orders where the ship_weight differs by a factor of five or more and the ship_date is not null. The query then orders the data by ship_date.
Figure 1: Query
SELECT x.order_num, x.ship_weight, x.ship_date,
       y.order_num, y.ship_weight, y.ship_date 
   FROM orders x, orders y
   WHERE x.ship_weight >= 5 * y.ship_weight
      AND x.ship_date IS NOT NULL
      AND y.ship_date IS NOT NULL
   ORDER BY x.ship_date;
Table 1. Query result
order_num ship_weight ship_date order_num ship_weight ship_date
1004 95.80 05/30/1998 1011 10.40 07/03/1998
1004 95.80 05/30/1998 1020 14.00 07/16/1998
1004 95.80 05/30/1998 1022 15.00 07/30/1998
1007 125.90 06/05/1998 1015 20.60 07/16/1998
1007 125.90 06/05/1998 1020 14.00 07/16/1998

If you want to store the results of a self-join into a temporary table, append an INTO TEMP clause to the SELECT statement and assign display labels to at least one set of columns to rename them. Otherwise, the duplicate column names cause an error and the temporary table is not created.

The following query, which is similar to Query, labels all columns selected from the orders table and puts them in a temporary table called shipping.
Figure 2: Query
SELECT x.order_num orders1, x.po_num purch1, 
       x.ship_date ship1, y.order_num orders2, 
       y.po_num purch2, y.ship_date ship2
   FROM orders x, orders y
   WHERE x.ship_weight >= 5 * y.ship_weight
      AND x.ship_date IS NOT NULL
      AND y.ship_date IS NOT NULL
   ORDER BY orders1, orders2
   INTO TEMP shipping;
If you query with SELECT * from table shipping, you see the following rows.
Figure 3: Query result
    orders1 purch1     ship1          orders2 purch2    ship2

       1004 8006       05/30/1998        1011 B77897    07/03/1998
       1004 8006       05/30/1998        1020 W2286     07/16/1998
       1004 8006       05/30/1998        1022 W9925     07/30/1998
       1005 2865       06/09/1998        1011 B77897    07/03/1998
       ;
       1019 Z55709     07/16/1998        1020 W2286     07/16/1998
       1019 Z55709     07/16/1998        1022 W9925     07/30/1998
       1023 KF2961     07/30/1998        1011 B77897    07/03/1998

You can join a table to itself more than once. The maximum number of self-joins depends on the resources available to you.

The self-join in the following query creates a list of those items in the stock table that are supplied by three manufacturers. The self-join includes the last two conditions in the WHERE clause to eliminate duplicate manufacturer codes in rows that are retrieved.
Figure 4: Query
SELECT s1.manu_code, s2.manu_code, s3.manu_code, 
       s1.stock_num, s1.description
   FROM stock s1, stock s2, stock s3
   WHERE s1.stock_num = s2.stock_num
      AND s2.stock_num = s3.stock_num
      AND s1.manu_code < s2.manu_code
      AND s2.manu_code < s3.manu_code
   ORDER BY stock_num;
Figure 5: Query result
manu_code manu_code manu_code stock_num description

HRO       HSK       SMT               1 baseball gloves
ANZ       NRG       SMT               5 tennis racquet 
ANZ       HRO       HSK             110 helmet
ANZ       HRO       PRC             110 helmet
ANZ       HRO       SHM             110 helmet
ANZ       HSK       PRC             110 helmet
ANZ       HSK       SHM             110 helmet
ANZ       PRC       SHM             110 helmet
HRO       HSK       PRC             110 helmet
HRO       HSK       SHM             110 helmet
HRO       PRC       SHM             110 helmet
;
KAR       NKL       PRC             301 running shoes
KAR       NKL       SHM             301 running shoes
KAR       PRC       SHM             301 running shoes
NKL       PRC       SHM             301 running shoes
If you want to select rows from a payroll table to determine which employees earn more than their manager, you might construct the self-join as the following SELECT statement shows:
SELECT emp.employee_num, emp.gross_pay, emp.level, 
       emp.dept_num, mgr.employee_num, mgr.gross_pay, 
       mgr.dept_num, mgr.level
   FROM payroll emp, payroll mgr
   WHERE emp.gross_pay > mgr.gross_pay 
      AND emp.level < mgr.level
      AND emp.dept_num = mgr.dept_num
   ORDER BY 4;
The following query uses a correlated subquery to retrieve and list the 10 highest-priced items ordered.
Figure 6: Query
SELECT order_num, total_price 
   FROM items a
   WHERE 10 >
      (SELECT COUNT (*) 
         FROM items b
         WHERE b.total_price < a.total_price)
   ORDER BY total_price;
The query returns the 10 rows.
Figure 7: Query result
    order_num    total_price

       1018      $15.00
       1013      $19.80
       1003      $20.00
       1005      $36.00
       1006      $36.00
       1013      $36.00
       1010      $36.00
       1013      $40.00
       1022      $40.00
       1023      $40.00

You can create a similar query to find and list the 10 employees in the company who have the most seniority.

For more information about correlated subqueries, refer to Subqueries in SELECT statements.