Supported vendor database character sets
Each HCL Compass data code page has a corresponding character set for each supported vendor database: Oracle, DB2®, and SQL Server.
To help you choose an appropriate character set for your vendor database, Table 1 lists the supported Compass data code page values and their corresponding vendor database character set values.
HCL Compass supports the UTF-8 (8-bit Unicode Transformation Format) code page 65001. This support is limited to creating new database sets. Compass does not support converting an existing database set to the UTF-8 code page.
For instructions about how to set the character set for your vendor database, see the vendor database documentation.
HCL Compass data code page | Oracle character set | DB2 code set | SQL Server collation |
---|---|---|---|
932 (Japanese) | JA16SJISTILDE - See Code page 932 (Japanese) on Oracle | IBM-943 (943) - See Code page 932 (Japanese) on DB2 | Japanese_* |
936 (Simplified Chinese) | ZHS16GBK - Limited support. See Code page 936 (Simplified Chinese) on Oracle | GBK (1386) | Chinese_PRC_* |
949 (Korean) | KO16MSWIN949 | 1363 | Korean_Wansung_* |
950 (Traditional Chinese) | ZHT16MSWIN950 | big5 (950) | Chinese_Taiwan_Bopomofo_* |
1250 (Eastern Europe) | EE8MSWIN1250 | 1250 | Romanian_* |
1251 (Cyrillic) | CL8MSWIN1251 | 1251 | Cyrillic_General_* |
1252 (Western Europe) | WE8MSWIN1252 | 1252 | Latin1_General_* |
1253 (Greek) | EL8MSWIN1253 | 1253 | Greek_* |
1254 (Turkish) | TR8MSWIN1254 | 1254 | Turkish_* |
1255 (Hebrew) | IW8MSWIN1255 | 1255 | Hebrew_* |
1257 (Baltic) | BLT8MSWIN1257 | 1257 | Estonian_* |
20127 (ASCII) | Any | Any | Any |
60932 (Safe Shift-JIS) | JA16EUC | eucJP (954) | N/A |
65001 (UTF-8) | AL32UTF8 - See Code page 65001 (UTF-8) on Oracle and DB2 | UTF-8 (1208) - See Code page 65001 (UTF-8) on Oracle and DB2 | N/A |
Note: For Microsoft™ Access databases, you do not need to set the vendor database code page.
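As a rough illustration of how Table 1 can be consulted programmatically, the following sketch transcribes three of its rows into a lookup. The names `CHARSET_TABLE` and `vendor_charset` are hypothetical and are not part of HCL Compass; the values are copied from the table, and `None` stands for an N/A entry.

```python
# Hypothetical helper, not part of HCL Compass.
# Subset of Table 1, transcribed for illustration; extend it with the rows you need.
# None means the table lists N/A for that vendor database.
CHARSET_TABLE = {
    932: {"oracle": "JA16SJISTILDE", "db2": "IBM-943", "sqlserver": "Japanese_*"},
    1252: {"oracle": "WE8MSWIN1252", "db2": "1252", "sqlserver": "Latin1_General_*"},
    65001: {"oracle": "AL32UTF8", "db2": "UTF-8", "sqlserver": None},
}


def vendor_charset(code_page, vendor):
    """Return the Table 1 entry for a data code page and vendor database."""
    row = CHARSET_TABLE.get(code_page)
    if row is None:
        raise KeyError(f"code page {code_page} is not in this subset of Table 1")
    value = row[vendor.lower()]
    if value is None:
        raise ValueError(f"code page {code_page} is listed as N/A for {vendor}")
    return value


print(vendor_charset(932, "Oracle"))  # prints JA16SJISTILDE
```

Raising an error for N/A entries, rather than returning an empty value, makes an unsupported combination such as code page 65001 on SQL Server fail early.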
Code page 932 (Japanese) on Oracle
JA16SJISTILDE is the recommended vendor database character set for code page 932 (Japanese SJIS data) on Oracle. This is a change from the recommendation of JA16SJIS for versions of HCL Compass earlier than 7.0. The character sets JA16SJIS and JA16SJISTILDE are identical except for the way that the wave dash and the tilde are mapped to and from Unicode. Because HCL Compass versions 7.0 and later use Unicode to communicate with the database, you must use the JA16SJISTILDE character set. For information about how to convert an existing Oracle database from JA16SJIS to JA16SJISTILDE, see the Oracle documentation.
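The mapping difference can be seen outside Oracle with a short sketch. It assumes that Python's `shift_jis` codec is a reasonable stand-in for the JIS-style mapping and that its `cp932` codec is a stand-in for the Microsoft-style mapping; these are analogues chosen for illustration, not the Oracle character sets themselves.

```python
# Assumption: Python codecs are used only as analogues of the two mappings.
# "shift_jis" follows the JIS mapping; "cp932" follows the Microsoft mapping.
SJIS_WAVE_DASH_BYTES = b"\x81\x60"  # the same two bytes in both encodings

jis_char = SJIS_WAVE_DASH_BYTES.decode("shift_jis")
ms_char = SJIS_WAVE_DASH_BYTES.decode("cp932")

print(f"shift_jis: U+{ord(jis_char):04X}")  # prints shift_jis: U+301C (wave dash)
print(f"cp932:     U+{ord(ms_char):04X}")   # prints cp932:     U+FF5E (fullwidth tilde)
```

The same stored bytes become two different Unicode characters, which is why a client that exchanges Unicode with the database needs the database character set to agree with the code page 932 mapping.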
Code page 932 (Japanese) on DB2
IBM-943 is the recommended code set for Japanese SJIS data on DB2. You must configure the database management system to use the conversion table that is compatible with the Microsoft definition of code page 932. If this alternate conversion table is not used, you cannot set the Compass data code page to 932 for new schemas. Also, if you do not convert an existing DB2 database set to use the alternate conversion table, some characters in the 932 character set will be corrupted. See Alternative Unicode conversion table for CCSID 943.
Code page 936 (Simplified Chinese) on Oracle
- If you are using the installutil setdbcodepage command, then you must use the allowconversion option. This option lets you set the Compass data code page value to 936 even though the validation for the euro character fails.
- You cannot use the euro character in your data. If you use this character, it is stored as a replacement character in the database, which effectively corrupts the data.
- If your deployment uses HCL Compass MultiSite, use Oracle databases that are configured identically with ZHS16GBK for every database in the clan. If you mix vendor databases throughout the clan and a euro character is entered, data divergence occurs because databases other than Oracle can store the euro, while Oracle databases store the euro as a replacement character.
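One way to guard against the restriction above is to screen text for characters that a target encoding cannot represent before the text is submitted. The sketch below is a generic check with a hypothetical name, `unencodable_chars`; it is not an HCL Compass API. Python's codecs are not guaranteed to match Oracle's ZHS16GBK, so the codec you pass in is your own approximation of the database character set.

```python
# Hypothetical pre-check, not part of HCL Compass.
def unencodable_chars(text, encoding):
    """Return the distinct characters in text that the codec cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# latin-1 is used here only because it is certain to lack the euro sign.
print(unencodable_chars("Total: 100 \u20ac", "latin-1"))  # prints ['€']
```

An empty result means every character can be encoded with the chosen codec; any returned character would need to be replaced or removed before the data is saved.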
Code page 65001 (UTF-8) on Oracle and DB2
- Code page 65001 is not supported for the SQL Server database because SQL Server does not provide support for UTF-8 character encoding.
- The maximum character-string length is reduced for many multibyte character set (MBCS) code pages. With code page 65001 (UTF-8), a string might hold as few as one third of the characters that an ASCII character string of the same size can hold. The reduction depends on the mixture of one-byte, two-byte, and three-byte characters that are stored in the string. (Compass also supports double-byte character sets [DBCS]. With DBCS code pages, a reduction of up to half is possible as compared with an ASCII character string.)
- Compass does not support converting an existing Compass database set to use the new 65001 code page.
- Compass does not support bidirectional or complex script languages for use with the 65001 code page.
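The length reduction described in the list above can be estimated with a small sketch. It assumes that the string limit is measured in bytes, and it uses Python's `utf-8` and `cp932` codecs as examples of a three-byte and a two-byte encoding of Japanese text. The function name `max_chars_that_fit` is hypothetical.

```python
# Hypothetical estimator; assumes the field limit is counted in bytes.
def max_chars_that_fit(text, byte_limit, encoding="utf-8"):
    """Count how many leading characters of text fit within byte_limit bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > byte_limit:
            break
        used += size
        count += 1
    return count


ascii_text = "a" * 30
kana_text = "\u3042" * 30  # HIRAGANA LETTER A, three bytes in UTF-8

print(max_chars_that_fit(ascii_text, 30))          # prints 30
print(max_chars_that_fit(kana_text, 30))           # prints 10
print(max_chars_that_fit(kana_text, 30, "cp932"))  # prints 15
```

With a 30-byte limit, the ASCII string fits in full, the same number of three-byte characters is cut to one third, and a double-byte encoding is cut to half.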