Conditions in the CONNECT BY Clause

Besides expressions and operators that are valid in Boolean conditions and in general SQL expressions, the condition that is specified in CONNECT BY clause supports two more syntax constructs, the PRIOR operator and the LEVEL pseudocolumn that are valid only in SELECT statements that include the Hierarchical clause.

The PRIOR Operator

The PRIOR unary operator can be included in the CONNECT BY clause with a column name as its operand. PRIOR can be used to distinguish column references to the result of the most recent previous recursive step of the CONNECT BY clause from column references to the current result set. The column name immediately follows this right-associative operator, as in the following syntax fragment:
	CONNECT BY mgrid = PRIOR empid
Here the CONNECT BY condition is satisfied by those rows in which the manager specified in the mgrid, column matches the employee value was in the empid column in the previous iteration.
The PRIOR operator can be applied to expressions more complex than column names. The following condition uses an arithmetic expression as the operand of PRIOR:
	CONNECT BY PRIOR (salary - 10000) = salary

The PRIOR operator can be included more than once in the same CONNECT BY condition. See also the topic Hierarchical Clause, which provides an example of a hierarchical query that uses the PRIOR operator in a condition of the CONNECT BY clause.

The LEVEL Pseudocolumn

A pseudocolumn is a keyword of SQL that shares the same namespace as column names, and that is valid in some contexts where a column expression is valid.

LEVEL is a pseudocolumn that returns the ordinal number of the recursive step in the Hierarchic clause that returned the row. For all the rows returned by the START WITH clause, LEVEL return the value 1. Rows that are returned by applying the first iteration of the CONNECT BY clause return 2. Rows that are returned by successive iterations of the CONNECT BY have LEVEL values incremented by 1, so that LEVEL = (N + 1) indicates a row that the Nth CONNECT BY iteration returned. The data type of the LEVEL column is INTEGER.

The following example of a hierarchical query specifies LEVEL in the select list of the Projection clause:
SELECT name, LEVEL FROM employee START WITH name = 'Goyal' 
   CONNECT BY PRIOR empid = mgrid;
The query returns these results:

name          level

Goyal             1
Zander            2
McKeough          3
Barnes            3
Henry             2
O'Neil            3
Smith             3
Shoeman           3
Scott             2

9 row(s) retrieved.

LEVEL can be included in the Projection clause of SELECT statements that include the Hierarchical clause, and in the condition of the CONNECT BY clause.

The LEVEL pseudocolumn is not valid, however, in the following contexts:
  • A SELECT statement that has no CONNECT BY clause
  • The START WITH condition of the Hierarchical clause
  • An operand of the CONNECT_BY_ROOT operator
  • An argument to the SYS_CONNECT_BY_PATH function.

Additional Syntax Valid Only in Hierarchical Queries

The following syntax tokens support hierarchical queries, and are valid only in hierarchical queries. Unlike the PRIOR operator and the LEVEL pseudocolumn, however, they are not valid in the Hierarchical clause:
  • The CONNECT_BY_ISCYCLE pseudocolumn
  • The CONNECT_BY_ISLEAF pseudocolumn
  • The CONNECT_BY_ROOT unary operator
  • The SYS_CONNECT_BY_PATH( ) function of SQL.

The CONNECT_BY_ISCYCLE Pseudocolumn

CONNECT_BY_ISCYCLE is a pseudocolumn that returns a 1 if the row would cycle at the next level in the hierarchy. That is, the row has an immediate child that is also an ancestor given the search-condition that is specified in the CONNECT BY clause. If the row does not directly cause a cycle, the column returns 0. A value other than 0 is only possible when NOCYCLE is specified in the CONNECT BY clause. The data type of this column is INTEGER.

The following UPDATE statement creates a loop in the data hierarchy of the employee table:

UPDATE employee SET mgrid = 5 WHERE empid = 17;

The following hierarchical query includes the CONNECT_BY_ISCYCLE pseudocolumn in the Projection clause, but the CONNECT BY clause throws an error in the step where it encounters the loop that the UPDATE statement created.

SELECT empid, 
       name, 
       mgrid, 
       CONNECT_BY_ISLEAF leaf, 
       CONNECT_BY_ISCYCLE cycle
FROM  employee
START WITH name = 'Goyal'
CONNECT BY  PRIOR empid = mgrid;

  665: Internal error on semantics - 
  CONNECT_BY_ISCYCLE is used without NOCYCLE parameter..
Error in line 1
Near character position 72

This query avoids the -655 error by specifying the NOCYCLE in the CONNECT BY clause that throws an error in the step where it encounters the loop that the UPDATE statement created.

SELECT empid, name, mgrid, 
   CONNECT_BY_ISLEAF leaf, CONNECT_BY_ISCYCLE cycle
FROM employee
   START WITH name = 'Goyal'
   CONNECT BY NOCYCLE PRIOR empid = mgrid;

For the results of this query, see the example in the description of the NOCYCLE keyword in the topic CONNECT BY Clause.

The CONNECT_BY_ISCYCLE pseudocolumn is not valid in the following contexts:
  • A SELECT statement that has no CONNECT BY clause
  • The START WITH or CONNECT BY clause
  • An operand of the CONNECT_BY_ROOT operator
  • An argument to the SYS_CONNECT_BY_PATH function

The CONNECT_BY_ISLEAF Pseudocolumn

CONNECT_BY_ISLEAF is a pseudocolumn that returns a 1 if the row is a leaf in the hierarchy as defined by the CONNECT BY clause. A node is a leaf node if it has no children in the query result hierarchy (not in the actual data hierarchy). If the row is not a leaf the column returns 0. The data type of the column is INTEGER.

The following hierarchical query specifies the CONNECT_BY_ISLEAF pseudocolumn in the Projection clause, and declares leaf as an alias for that column, which shows in the DB-Access display of the result set:

SELECT empid, name, mgrid, CONNECT_BY_ISLEAF leaf
FROM emp1oyee
START WITH name = 'Goyal'
CONNECT BY PRIOR empid = mgrid;

      empid name             mgrid   leaf 

         16 Goyal               17      0
         14 Scott               16      1
         12 Henry               16      0
          9 Shoeman             12      1
          8 Smith               12      1
          7 O'Neil              12      1
         11 Zander              16      0
          6 Barnes              11      1
          5 McKeough            11      1

9 row(s) retrieved.

The CONNECT_BY_ROOT Operator

For every row in the hierarchy, the CONNECT_BY_ROOT unary operator accepts as its operand an expression that evaluates to a row that is a node of the hierarchy. CONNECT_BY_ROOT returns the expression for the root ancestor of its operand. This syntax fragment is part of the Hierarchical Clause.

CONNECT_BY_ROOT Operator

CONNECT_BY_ROOT expression
The expression operand can be any SQL expression, but it must not contain any hierarchical query token, including the following tokens:
  • The CONNECT_BY_ROOT or PRIOR unary operators
  • The CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF, or LEVEL pseudocolumns
  • The SYS_CONNECT_BY_PATH function.
The return data type of this right-associative operator is the data type of the specified expression.

The hierarchical query in the following example returns rows from the employee table that include both the identifying number of the manager to whom the employee reports directly, and also the name of the manager at the root of the hierarchy for this query.

SELECT empid, name, mgrid, CONNECT_BY_ROOT name AS topboss
FROM employee
START WITH name = 'Goyal'
CONNECT BY PRIOR empid = mgrid;

      empid name             mgrid topboss    

         16 Goyal               17 Goyal     
         14 Scott               16 Goyal     
         12 Henry               16 Goyal     
          9 Shoeman             12 Goyal     
          8 Smith               12 Goyal     
          7 O'Neil              12 Goyal     
         11 Zander              16 Goyal     
          6 Barnes              11 Goyal     
          5 McKeough            11 Goyal     

9 row(s) retrieved.
The CONNECT_BY_ROOT operator is not valid in the following contexts:
  • A SELECT statement that has no CONNECT BY clause
  • The START WITH or CONNECT BY clause
  • An argument to the SYS_CONNECT_BY_PATH function

The SYS_CONNECT_BY_PATH Function

Calls to SYS_CONNECT_BY_PATH ( ) function are valid in SELECT statements that include the Hierarchical clause, but this function cannot be called from the Hierarchical clause. Informix® returns an error if you attempt to run this function within the condition of the START WITH or CONNECT BY clauses.

The SYS_CONNECT_BY_PATH function can be used in hierarchical queries to build a string that represents a path from the row corresponding to the root node to the current row.

This is the calling syntax for SYS_CONNECT_BY_PATH to return a string for a specified row at LEVEL N. This syntax fragment is part of the Hierarchical Clause.

SYS_CONNECT_BY_PATH Function

SYS_CONNECT_BY_PATH ( string_expression , ' format_string ' )
Element Description Restrictions Syntax
format_string Typically a constant string that serves as a separator None Quoted String
string_expression An expression that identifies a row. Cannot include hierarchical query tokens Expression
SYS_CONNECT_BY_PATH builds the string representation of the path from the root to a specified row at LEVEL N of the hierarchy by recursively concatenating the successive returned values:
  • path1 := string_expression1||format_string represents the path to the root row from the first intermediate result set,
  • path2 := path1||string_expression2||format_string evaluates to the path from the root to a row in the second intermediate result set,
  • . . .
  • pathN := path(N-1)||string_expressionN||format_string evaluates to the path from the root to the Nth intermediate result set.
The expressions in arguments to SYS_CONNECT_BY_PATH must not include any hierarchical query construct, including the following constructs:
  • The CONNECT_BY_ROOT or PRIOR unary operators
  • The CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF, or LEVEL pseudocolumns
  • The SYS_CONNECT_BY_PATH function.
Also not valid in the argument list are aggregate expressions.

The return value from SYS_CONNECT_BY_PATH ( ) is of type LVARCHAR(4000).

The hierarchical query in the following example calls the SYS_CONNECT_BY_PATH function in the Projection list with the employee.name column and the slash ( / ) character as its arguments.

SELECT empid, name, mgrid, SYS_CONNECT_BY_PATH( name,'/') as hierarchy
   FROM employee
   START WITH name = 'Henry'
   CONNECT BY PRIOR empid = mgrid;

The query returns the rows within the subset of the data hierarchy in which Henry is specified as the root in the START WITH clause, showing the name and empid number of each employee and of the employee's manager, and the path within the hierarchy to Henry. The CONNECT BY clause uses the equality predicate PRIOR empid = mgrid to return the employees who report to the managers (in this case, onlyHenry ) whose empid was returned by the previous step. The result set of the query is:

empid      12
name       Henry
mgrid      16
hierarchy  /Henry 

empid      9
name       Shoeman
mgrid      12
hierarchy  /Henry/Shoeman 

empid      8
name       Smith
mgrid      12
hierarchy  /Henry/Smith 

empid      7
name       O'Neil
mgrid      12
hierarchy  /Henry/O'Neil 

4 row(s) retrieved.
These rows are listed in the order in which they were retrieved:
  • The START WITH clause returned the Henry row at the root of this hierarchy.
  • The first step of the CONNECT BY clause returned three rows, corresponding to the three employees who report to Henry.
  • The next CONNECT BY step returned no rows, because the Shoeman, Smith, and O'Neil rows that returned by the previous step are all leaf nodes within this hierarchy, for which the PRIOR empid = mgrid condition evaluates to false.

Query execution ended, returning the four rows that are shown, where hierarchy is an alias for the path to Henry that SYS_CONNECT_BY_PATH( name,'/') returned for each row. (In the first returned row, the string /Henry shows the root status of Henry.)