Specifying NLSCASE case sensitivity

You can explicitly create a case-sensitive or case insensitive database.

By default, in databases where the locale classifies disjunct subsets of the code set as uppercase letters and as lowercase letters, Informix® databases are created as case-sensitive. The database locale is defined by setting the DB_LOCALE environment variable. An example of a locale whose code set recognizes letter case is the default US English locale, where lowercase letters precede uppercase letters in the ascending order of collation. In the default locale, the following statement creates a case-sensitive database:
CREATE DATABASE employees IN dbspaceYee WITH BUFFERED LOG; 
To explicitly create a case-sensitive database, include the NLSCASE SENSITIVE keywords as the last specification of the CREATE DATABASE statement, as in the following example:
CREATE DATABASE stores IN dbsp1 WITH LOG NLSCASE SENSITIVE; 
Because case sensitivity is enabled by default, the following statement has the same effect:
CREATE DATABASE stores IN dbsp1 WITH LOG; 

In a database that is case sensitive, for a table in which column col3 is of type NCHAR or NVARCHAR, for example, the Boolean condition col3 MATCHES 'SAM' evaluates as false for rows where 'Sam' is the value in col3. In contrast, in a different database created with the NLSCASE INSENSITIVE keyword option, the same Boolean condition col3 MATCHES 'SAM' evaluates as true for the same rows of the same table. As this example implies, for tables that include NCHAR or NVARCHAR columns in which some row values differ only in letter case, query results depend on the NLSCASE setting of the database.

All informix databases are case sensitive for operations on character strings of the built-in CHAR, LVARCHAR, and VARCHAR data types. If you create a case-sensitive database, whether by default, or explicitly with the NLSCASE SENSITIVE keywords, that database also treats strings of the National Language Support data types NCHAR and NVARCHAR as case-sensitive, if the database locale supports letter case.

Creating a database that is not case sensitive

In some applications, the letter case of character strings can be disregarded. Data entry procedures, for example, might accept the strings 'M' and 'm' as logically equivalent within a record. For large data sets, applying conditional logic to convert both case variants to a single value can result in slower performance than storing the records in an NCHAR or NVARCHAR column of a case-insensitive database, in which the strings 'M' and 'm' encode the same case-insensitive value. Here the condition 'M' MATCHES 'm' evaluates as true for NCHAR or NVARCHAR columns.

Every database created with the NLSCASE INSENSITIVE property stores uppercase and lowercase NCHAR and NVARCHAR letters exactly as they are loaded into their tables; any unmodified record that a query returns has its original lettercase. In all operations on NCHAR and NVARCHAR values, however, such as sorting, grouping, or identifying duplicate rows, the database server ignores any letter case variants, and treats, for example, the strings 'Mi' and 'mI' as the same value. Information about the case of letters not discarded, but it is also not used when the database server processes NLS data types.

When you include the NLSCASE INSENSITIVE keywords as the last specification of the CREATE DATABASE statement, the database server creates a database that always processes the following types of character strings without regard to letter case:

  • Strings stored in columns of the NLS data types NCHAR and NVARCHAR
  • Strings stored as a DISTINCT data type whose base type is NCHAR or NVARCHAR
  • Strings stored as elements of these data types within a collection data type
  • Strings stored in fields of the above data types in a named or unnamed ROW data type
  • Strings stored as SPL variables of these data types
  • Strings implicitly or explicitly cast to these data types
  • Strings returned by functions as output parameters of these data types.
Here "these data types" refers to the character data types that are identified in the same list.
The following statement creates a database with the NLSCASE property set to INSENSITIVE:
CREATE DATABASE stores IN dbsp2 WITH BUFFERED LOG NLSCASE INSENSITIVE;
Important: A database created as NLSCASE INSENSITIVE treats all other built-in character data types (CHAR, LVARCHAR, and VARCHAR) as case sensitive. That is, a case-insensitive database can also perform case-sensitive processing of string values, if their data types are not among the NLS character data types in the list above.

To perform case-sensitive operations on a string of the NCHAR or NVARCHAR data type in a case-insensitive database, you must first explicitly cast the string to a CHAR, LVARCHAR, or VARCHAR data type, and then perform the case-sensitive operation. (See, however, the topic Return Types from CONCAT and String Functions, which identifies contexts in which the database server automatically casts the result of built-in string-manipulation functions and string operators to NCHAR or NVARCHAR data types.)

Examples of NLSCASE INSENSITIVE queries

In a case-insensitive database, when a query calls an aggregate function or includes the GROUP BY clause for an NCHAR or NVARCHAR column, the database server treats letter-case variants in the data as duplicate column values, as in the following program fragment.

  CREATE DATABASE casedb WITH LOG NLSCASE INSENSITIVE;
  CREATE TABLE foo (cc CHAR(5), nc NCHAR(5));
  INSERT INTO foo VALUES ('IBM', 'iBM');
  INSERT INTO foo VALUES ('ibm', 'ibM');
  INSERT INTO foo VALUES ('ibm', 'ibM');
  INSERT INTO foo VALUES ('Ibm', 'Ibm');
	 
  SELECT COUNT(nc) FROM foo 
     GROUP BY nc;
  SELECT COUNT(nc) FROM foo 
     WHERE nc = 'ibm' GROUP BY nc;

In both of the queries above, the COUNT aggregate function returns 4, the total number of rows that the INSERT statements loaded into foo. Because column nc is an NLS data type, all of the rows satisfy the nc = 'ibm' condition in the WHERE clause, despite variations in letter case among the nc values.

For the following query on the same tables,
SELECT nc FROM foo GROUP BY nc;
the output can be any of the string values from the INSERT statements (namely 'IBM', 'iBM', 'ibm', 'ibM', or 'Ibm'), depending on the order in which the server processes or scans the rows.
The next query on the same table excludes duplicate rows from the result set by including the DISTINCT keyword in the projection clause:
SELECT DISTINCT nc FROM foo;
This returns a single row, because from the NLSCASE INSENSITIVE perspective, all of the rows have the same value, despite the variations in letter case. As in the previous example, the first row retrieved from among the inserted rows will be returned by the query.
The next example includes the DISTINCT keyword as one of the arguments to the COUNT aggregate function:
SELECT COUNT(DISTINCT nc) FROM foo;
This returns a count of 1, again because in this case-insensitive database, all of the rows in table foo evaluate as duplicates.

Restrictions on NLSCASE INSENSITIVE databases

The following restrictions apply to databases that are created with the NLSCASE INSENSITIVE property:
  • They support distributed cross-database and cross-server queries only with databases that also have the NLSCASE INSENSITIVE property.
  • Case-sensitive databases cannot connect to NLSCASE INSENSITIVE databases. Attempts to do so fail with this error:
    -26801 Cannot reference an external database that is not case sensitive.
  • NLSCASE INSENSITIVE databases cannot connect to case-sensitive databases. Attempts to do so fail with this error:
    -26802 Cannot reference an external database that is case sensitive.
    
    The only exception is that the NLSCASE setting does not prevent connections to the case-sensitive system databases, such as sysmaster, sysadmin, sysutils, sysusers, and syscdr, of the same Informix® database server instance. The results of operations that access a system database depends on the NLSCASE setting of the other database.
  • The onload and onunload utilities do not support databases that have the NLSCASE INSENSITIVE property.
  • In an Enterprise Replication cluster, no error or warning is issued if you specify a replication pair in whose databases differ in their NLSCASE property. To reduce the risk of data inconsistencies, replicate case-sensitive databases only with case-sensitive databases. and replicate NLSCASE INSENSITIVE databases only with NLSCASE INSENSITIVE databases.