| Prev | Next | Start of Chapter | End of Chapter | Contents | Glossary | Index | Comments | (7 out of 10)

Encoding Japanese Characters

The Gensym character set includes the characters defined in the Japanese Industrial Standard (or JIS) X 0208-1990 character set. G2 uses the correct ideograms to display symbol and text values that include these characters, regardless of whether the Japanese language facility is in use.

For information about the Japanese language facility, see Using the Japanese Language Facilities.

Each JIS character has a distinct representation in the Gensym character set. A JIS character is represented as a kanji code, which is a positive integer that can be represented in two bytes. The first byte contains the most significant eight bits of the kanji code's value.
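
As a small illustration of the two-byte layout, the following Python sketch splits a kanji code into its most and least significant bytes. The function name `split_kanji_code` is hypothetical and is not part of G2; it only mirrors the description above.

```python
def split_kanji_code(code: int) -> tuple[int, int]:
    """Return (high_byte, low_byte) for a two-byte kanji code.

    Hypothetical helper for illustration only; not a G2 function.
    """
    if not 0 < code <= 0xFFFF:
        raise ValueError("a kanji code must be a positive integer that fits in two bytes")
    high_byte = code >> 8     # most significant eight bits (the first byte)
    low_byte = code & 0xFF    # least significant eight bits (the second byte)
    return high_byte, low_byte


high, low = split_kanji_code(0x2138)
print(hex(high), hex(low))  # 0x21 0x38
```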

Encoding for ASCII and Special Characters
[Table with the column pair Character and Encoding repeated three times across; the table entries are not available here.]

To express a JIS character that is part of the Gensym character set:

The bit pattern of the two or three ASCII characters represents the value of the JIS character in the Gensym character set. Either the prefix ASCII character or the first ASCII character represents the most significant eight bits of the character in the Gensym character set.

To determine a kanji code's representation in the Gensym character set, you perform the following algorithm (expressed in pseudocode):
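
A minimal Python sketch of these steps, reconstructed from the two worked examples in this section rather than taken from any G2 source. The function name `kanji_to_ascii` is hypothetical. The sketch returns only the derived prefix, first, and second ASCII characters; any escape character or other framing that G2 places around them is outside its scope.

```python
def kanji_to_ascii(code: int) -> str:
    """Derive the two or three ASCII characters for a two-byte kanji code.

    Hypothetical helper; the constants 13, 0x1FFF, 95, 40, and 32 are the
    ones used in the worked examples of this section.
    """
    if not 0 < code <= 0xFFFF:
        raise ValueError("a kanji code must be a positive integer that fits in two bytes")

    chars = []

    high = code >> 13                    # the bits above the low 13 bits
    if high != 1:
        chars.append(chr(high + 32))     # prefix ASCII character, only when needed

    low = code & 0x1FFF                  # the low 13 bits
    chars.append(chr(low // 95 + 40))    # first ASCII character
    chars.append(chr(low % 95 + 32))     # second ASCII character

    return "".join(chars)


print(kanji_to_ascii(0x2138))  # +;
print(kanji_to_ascii(0x434C))  # "0t
```

The integer division `//` corresponds to the "rounded down" step in the worked examples.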

For example, to represent the kanji code 8504 (or 0x2138 hexadecimal), follow this sequence of steps:

  1. (0x2138 >> 13) is 1.

  2. The condition (1 is not equal to 1) is false, so derive no prefix ASCII character.

  3. (0x2138 & 0x1fff) is 312.

  4. (312 / 95) is 3.28, rounded down to 3.

  5. (3 + 40) is 43, or 0x2b hexadecimal, so the first ASCII character is the + (plus sign) character.

  6. (312 % 95) is 27.

  7. (27 + 32) is 59, or 0x3b hexadecimal, so the second ASCII character is the ; (semicolon) character.

Thus, the kanji code 8504 is encoded in the Gensym character set with the two derived ASCII characters: + followed by ;

For example, to represent the kanji code 17228 (or 0x434c hexadecimal), follow this sequence of steps:

  1. (0x434c >> 13) is 2.

  2. The condition (2 is not equal to 1) is true, so derive a prefix ASCII character.

  3. (2 + 32) is 34, or 0x22 hexadecimal, so the prefix ASCII character is the " (double quotation mark) character.

  4. (0x434c & 0x1fff) is 844.

  5. (844 / 95) is 8.88, rounded down to 8.

  6. (8 + 40) is 48, or 0x30 hexadecimal, so the first ASCII character is the 0 (zero digit) character.

  7. (844 % 95) is 84.

  8. (84 + 32) is 116, or 0x74 hexadecimal, so the second ASCII character is the t (lowercase T) character.

Thus, the kanji code 17228 is encoded in the Gensym character set with the three derived ASCII characters: the prefix " followed by 0 and t
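
Reading the steps of the two worked examples in reverse suggests how the derived ASCII characters map back to a kanji code. The Python sketch below is an inference from those steps, not a documented G2 routine, and the name `ascii_to_kanji` is hypothetical. It assumes the input holds only the two or three derived characters, with no surrounding escape characters.

```python
def ascii_to_kanji(chars: str) -> int:
    """Recover a kanji code from its two or three derived ASCII characters.

    Hypothetical helper; it inverts the arithmetic of the worked examples.
    """
    if len(chars) == 3:
        high = ord(chars[0]) - 32        # a prefix character is present
        first, second = chars[1], chars[2]
    elif len(chars) == 2:
        high = 1                         # no prefix means the high bits equal 1
        first, second = chars[0], chars[1]
    else:
        raise ValueError("expected two or three ASCII characters")

    # Undo the division and remainder by 95 to rebuild the low 13 bits.
    low = (ord(first) - 40) * 95 + (ord(second) - 32)
    return (high << 13) + low


print(ascii_to_kanji("+;"))    # 8504
print(ascii_to_kanji('"0t'))   # 17228
```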


Copyright © 1997 Gensym Corporation, Inc.