Perform code-set conversion
In a client/server environment, character data might need to be converted from one code set to another if the client and server computers use different code sets to represent the same characters. The conversion of character data from one code set (the source code set) to another (the target code set) is called code-set conversion.
When the two computers use different code sets and no code-set conversion takes place, neither computer can correctly process or display character data that originates on the other.
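The effect can be illustrated with a minimal Python sketch. It is not OneDB code: it uses Python's built-in codecs, and the choice of ISO 8859-1 as the source code set and UTF-8 as the target code set is an assumption made only for this example, as is the helper name `convert`.

```python
# The same character is stored as different byte values in different code sets.
text = "é"

latin1_bytes = text.encode("iso-8859-1")  # assumed source code set
utf8_bytes = text.encode("utf-8")         # assumed target code set

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'

# Without conversion: UTF-8 bytes that are read as ISO 8859-1 show the
# wrong characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # Ã©


def convert(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from the source code set to the target code set."""
    return data.decode(source).encode(target)


# With conversion: the character keeps its meaning in the target code set.
converted = convert(latin1_bytes, "iso-8859-1", "utf-8")
print(converted == utf8_bytes)  # True
```

The bytes change during conversion, but the character that they represent does not.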
HCL OneDB™ products use GLS locales to perform code-set conversion. Both the HCL® OneDB client application and the database server might perform code-set conversion. For details, see Database server code-set conversion and Client application code-set conversion.
At run time, the client application and the database server use code sets as follows:
- The client application uses the client code set, which the CLIENT_LOCALE environment variable specifies, to write all files on the client computer and to interact with all client I/O devices.
- The database server uses the database code set, which the DB_LOCALE environment variable specifies, to transfer data to and from the database.
- The database server uses the server code set, which the SERVER_LOCALE environment variable specifies, to write files (such as debug and warning files).
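As a rough sketch of how these three settings relate, the following Python fragment extracts the code-set part of each locale name and compares the client and database code sets. It assumes that locale names follow a `language_territory.codeset[@modifier]` pattern such as `en_us.8859-1`. The helper names are hypothetical, and the plain string comparison does not recognize two different spellings of the same code set, so treat the result as an approximation rather than as the product's actual rule.

```python
import os

LOCALE_VARS = ("CLIENT_LOCALE", "DB_LOCALE", "SERVER_LOCALE")


def code_set_of(locale_name):
    """Return the code-set part of a locale name, or None if it has none."""
    base = locale_name.split("@", 1)[0]     # drop an optional @modifier
    _, sep, code_set = base.partition(".")  # text after the first '.'
    return code_set.lower() if sep and code_set else None


def code_sets_in_use(env):
    """Map each locale variable that is set to its code-set name."""
    return {var: code_set_of(env[var]) for var in LOCALE_VARS if env.get(var)}


def conversion_needed(env):
    """Assumption: conversion matters when client and database code sets differ."""
    sets = code_sets_in_use(env)
    client, database = sets.get("CLIENT_LOCALE"), sets.get("DB_LOCALE")
    return client is not None and database is not None and client != database


# Hypothetical settings, used only for illustration.
example_env = {"CLIENT_LOCALE": "en_us.utf8", "DB_LOCALE": "en_us.8859-1"}
print(code_sets_in_use(example_env))
print(conversion_needed(example_env))  # True

# The same check against the real environment of the current process.
print(conversion_needed(os.environ))
```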
Code-set conversion has the following limitations:
- Code-set conversion is not a semantic translation.
  It does not convert between words in different languages. For example, it does not convert the English word yes to the French word oui. It only ensures that each character retains its meaning when it is processed or written, regardless of how it is encoded.
- Code-set conversion does not create a character in the target code set if it exists only in the source code set.
  For example, if the character â is passed to a target computer whose code set does not contain that character, the target computer cannot process or print the character exactly.
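The second limitation can be checked ahead of time. The following Python sketch, which uses Python's built-in codecs rather than GLS conversion files, lists the characters of a string that have no equivalent in an assumed target code set. The function name `missing_characters` is hypothetical.

```python
def missing_characters(text, target):
    """Return the distinct characters of text that the target code set lacks."""
    missing = []
    for ch in text:
        try:
            ch.encode(target)
        except UnicodeEncodeError:
            # The target code set has no representation for this character.
            if ch not in missing:
                missing.append(ch)
    return missing


print(missing_characters("château", "ascii"))       # ['â']
print(missing_characters("château", "iso-8859-1"))  # []
```

An empty result means that every character has a counterpart in the target code set.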
Ideally, each character in the source code set has a corresponding character in the target code set. If the source code set contains characters that are not in the target code set, the conversion must define how to map these mismatched characters to the target code set. (The absence of a mapping between a character in the source code set and a character in the target code set is often called a lossy error.) If all characters in the source code set exist in the target code set, mismatch handling does not apply. A conversion can handle mismatched characters in any of the following ways:
- Round-trip conversion
  This method maps each mismatched character to a unique character in the target code set so that the return mapping maps the original character back to itself. This method guarantees that a two-way conversion results in no loss of information; however, data that is converted just one way might prevent correct processing or printing on the target computer.
- Substitution conversion
  This method maps all mismatched characters to one character in the target code set that highlights mismatched characters. This method guarantees that a one-way conversion clearly shows the mismatched characters; however, a two-way conversion results in loss of information if mismatched characters are present.
- Graphical-replacement conversion
  This method maps each mismatched character to a character in the target code set that looks like the source character. This method includes the mapping of one-character ligatures to their two-character equivalents and vice versa. The mapping makes printing of mismatched data more accurate on the target computer, but it most likely confuses the processing of this data on the target computer.
- A hybrid of two or three of the preceding conversion methods
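The trade-offs among these methods can be sketched in Python. The mapping tables below are hypothetical and cover only a few characters, the target code set is assumed to be ASCII, and the function names are invented for this example. Real conversions are defined by the product's own conversion tables, not by this code.

```python
TARGET = "ascii"  # assumed target code set
SUBSTITUTE = "?"  # single character that highlights mismatches

# Hypothetical tables, for illustration only.
GRAPHICAL = {"â": "a", "é": "e", "ﬁ": "fi", "œ": "oe"}
ROUND_TRIP = {"â": "\x01", "é": "\x02"}  # unique, otherwise unused characters
ROUND_TRIP_BACK = {v: k for k, v in ROUND_TRIP.items()}


def is_mismatched(ch):
    """True if the character has no equivalent in the target code set."""
    try:
        ch.encode(TARGET)
        return False
    except UnicodeEncodeError:
        return True


def substitution(text):
    """Map every mismatched character to the same substitute character."""
    return "".join(SUBSTITUTE if is_mismatched(c) else c for c in text)


def graphical(text):
    """Map mismatched characters to look-alikes; unknown ones are substituted,
    which makes this function a small hybrid of two methods."""
    return "".join(
        GRAPHICAL.get(c, SUBSTITUTE) if is_mismatched(c) else c for c in text
    )


def round_trip(text):
    """Map each mismatched character to a unique target character.
    Raises KeyError for characters that the sample table does not cover."""
    return "".join(ROUND_TRIP[c] if is_mismatched(c) else c for c in text)


def round_trip_back(text):
    """Reverse the round-trip mapping."""
    return "".join(ROUND_TRIP_BACK.get(c, c) for c in text)


source = "château café"
print(substitution(source))  # ch?teau caf?
print(graphical(source))     # chateau cafe
print(round_trip_back(round_trip(source)) == source)  # True
```

In this sketch, substitution makes the loss visible but cannot be undone, graphical replacement reads well but changes the data, and the round-trip form is reversible but is not meaningful to display on the target computer.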