ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)
ISO-8859-11 is not a primary registered IANA charset name, even though it follows the usual naming pattern for IANA charsets based on the ISO 8859 series. It is, however, defined as an alias[1] of the closely equivalent TIS-620. Because the only addition in ISO/IEC 8859-11, the no-break space, occupies a code that TIS-620 leaves unallocated, the TIS-620 label can be used for ISO/IEC 8859-11 data without problems. Microsoft has assigned code page 28601 (also known as Windows-28601) to ISO-8859-11 in Windows.[2] A draft of the standard placed the Thai letters at different code positions.[3]
As with all parts of ISO/IEC 8859, the lower 128 codes are equivalent to ASCII. The additional characters, apart from the no-break space, appear in Unicode in the same order at a constant offset of 0x0D60: 0xA1 maps to U+0E01, 0xA2 to U+0E02, and so on.
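As a rough illustration of the fixed shift between code values and Unicode code points, the following Python sketch decodes ISO/IEC 8859-11 bytes by arithmetic alone. The names `decode_latin_thai` and `THAI_OFFSET` are hypothetical and exist only for this example. Rejecting 0x80–0x9F as unassigned is an assumption of the sketch, and the gaps 0xDB–0xDE and 0xFC–0xFF follow the code chart in this article.

```python
# Offset between an ISO/IEC 8859-11 byte and its Unicode code point:
# 0xA1 maps to U+0E01, so the offset is 0x0E01 - 0xA1 = 0x0D60.
THAI_OFFSET = 0x0E01 - 0xA1


def decode_latin_thai(data: bytes) -> str:
    """Decode ISO/IEC 8859-11 bytes using the fixed offset (illustrative only)."""
    chars = []
    for byte in data:
        if byte < 0x80:
            chars.append(chr(byte))      # lower half is plain ASCII
        elif byte == 0xA0:
            chars.append("\u00a0")       # no-break space, the one exception
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            chars.append(chr(byte + THAI_OFFSET))
        else:
            # Assumption: 0x80-0x9F, 0xDB-0xDE and 0xFC-0xFF are treated
            # as unassigned and rejected.
            raise ValueError(f"unassigned byte 0x{byte:02X}")
    return "".join(chars)


# Six bytes that spell a Thai greeting.
print(decode_latin_thai(bytes.fromhex("CAC7D1CAB4D5")))
```

The sample bytes CA C7 D1 CA B4 D5 correspond to U+0E2A, U+0E27, U+0E31, U+0E2A, U+0E14 and U+0E35. In practice a library codec (for example Python's `iso8859_11`) would normally be used instead of hand-written arithmetic.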
Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.
Character set
ISO/IEC 8859-11[4] | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | | | | | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ๎ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ๚ | ๛ |
Code values 0xD1, 0xD4–0xDA, and 0xE7–0xEE are combining characters.
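The list of combining positions can be cross-checked against the Unicode character database. The sketch below is one possible check, assuming that Python's built-in `iso8859_11` codec follows the chart above and that the Unicode general category `Mn` (nonspacing mark) is an acceptable stand-in for "combining character"; the variable name `combining_bytes` is hypothetical.

```python
import unicodedata

# Collect every upper-half byte whose decoded character is a nonspacing mark.
combining_bytes = []
for byte in range(0xA1, 0x100):
    try:
        char = bytes([byte]).decode("iso8859_11")
    except UnicodeDecodeError:
        continue  # unassigned position in the code chart
    if unicodedata.category(char) == "Mn":
        combining_bytes.append(byte)

print(" ".join(f"{b:02X}" for b in combining_bytes))
```

The canonical combining class (`unicodedata.combining`) is deliberately not used here, because several Thai vowel signs have class 0 even though they are nonspacing marks.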
Vendor extensions
Code page 874 (IBM) / 9066
IBM code page 874 (CP874, IBM-874, x-IBM874), also known as code page 9066 (IBM-9066),[5] differs from ISO/IEC 8859-11 at only nine code positions (0xA0, 0xDB–0xDE, and 0xFC–0xFF), as shown in the following table:[6][7][8]
IBM code page 874/9066 (differences from ISO-8859-11)[9][10][11] | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
Ax | ่ | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | ้ | ๊ | ๋ | ์ | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ๎ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ๚ | ๛ | ¢ | ¬ | ¦ | NBSP |
Code page 1161
Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (decimal 222).[12][13]
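Because both IBM variants differ from ISO/IEC 8859-11 at only a handful of positions, they can be modelled as small override tables layered on an ISO/IEC 8859-11 decoder. The following Python sketch does that, using the nine positions from the IBM code page 874 table above and the euro sign at 0xDE for code page 1161. The names `IBM_874_OVERRIDES`, `IBM_1161_OVERRIDES` and `decode_with_overrides` are hypothetical, and the sketch models only the 0xA0–0xFF differences; it makes no claim about how IBM treats 0x80–0x9F.

```python
# Positions where IBM code page 874 differs from ISO/IEC 8859-11,
# transcribed from the table above.
IBM_874_OVERRIDES = {
    0xA0: "\u0e48",  # THAI CHARACTER MAI EK (duplicate of 0xE8)
    0xDB: "\u0e49",  # THAI CHARACTER MAI THO (duplicate of 0xE9)
    0xDC: "\u0e4a",  # THAI CHARACTER MAI TRI (duplicate of 0xEA)
    0xDD: "\u0e4b",  # THAI CHARACTER MAI CHATTAWA (duplicate of 0xEB)
    0xDE: "\u0e4c",  # THAI CHARACTER THANTHAKHAT (duplicate of 0xEC)
    0xFC: "\u00a2",  # CENT SIGN
    0xFD: "\u00ac",  # NOT SIGN
    0xFE: "\u00a6",  # BROKEN BAR
    0xFF: "\u00a0",  # NO-BREAK SPACE
}

# Code page 1161 is the same table with the euro sign at 0xDE.
IBM_1161_OVERRIDES = {**IBM_874_OVERRIDES, 0xDE: "\u20ac"}


def decode_with_overrides(data: bytes, overrides: dict) -> str:
    """Decode as ISO/IEC 8859-11, except where an override applies."""
    return "".join(
        overrides[b] if b in overrides else bytes([b]).decode("iso8859_11")
        for b in data
    )


print(repr(decode_with_overrides(b"\xde", IBM_874_OVERRIDES)))
print(repr(decode_with_overrides(b"\xde", IBM_1161_OVERRIDES)))
```

Keeping the differences in a separate dictionary makes it easy to see, and to test, exactly where one variant departs from another.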
Code page 874 (Microsoft) / 1162
Windows code page 874 (windows-874, MS874, x-windows-874), known as Code page 1162 (CP1162, IBM-1162) by IBM,[14][15] is used by Microsoft Windows. It differs from ISO/IEC 8859-11 only by adding the nine symbols shown in the following table:
Code page 1162 (IBM) / 874 (Microsoft): difference from ISO-8859-11[16][17][18][19] | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
8x | € | | | | | … | | | | | | | | | | |
9x | | ‘ | ’ | “ | ” | • | – | — | | | | | | | | |
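The nine added symbols can be listed programmatically. The sketch below assumes that Python's built-in `cp874` codec implements the Microsoft variant described here and that it raises an error for unassigned positions; the variable name `added` is hypothetical.

```python
# Probe the 0x80-0x9F range, which ISO/IEC 8859-11 leaves without
# graphic characters, and keep whatever the cp874 codec assigns there.
added = {}
for byte in range(0x80, 0xA0):
    try:
        added[byte] = bytes([byte]).decode("cp874")
    except UnicodeDecodeError:
        pass  # position left unassigned by the codec

for byte, char in added.items():
    print(f"0x{byte:02X} -> U+{ord(char):04X} {char}")
```

If the codec matches the table above, the loop reports the euro sign at 0x80, the horizontal ellipsis at 0x85, and the quotation marks, bullet and dashes at 0x91–0x97.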
Mac OS Thai
Mac OS Thai (MacThai) is the variant used on the classic Mac OS.
Mac OS Thai[20] | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
8x | « | » | … | ่ | ้ | ๊ | ๋ | ์ | ่ | ้ | ๊ | ๋ | ์ | “ | ” | ํ |
9x | | • | ั | ็ | ิ | ี | ึ | ื | ่ | ้ | ๊ | ๋ | ์ | ‘ | ’ | |
Ax | NBSP | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | WJ | ZWSP | – | — | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ™ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ® | © |
References
- [1] "IANA Character Sets".
- [2] "js-codepage, Getting codepages". GitHub. 12 October 2021.
- [3] Everson, Michael. "Proposed ISO 8859-11".
- [4] Whistler, Ken (2002-10-07), ISO/IEC 8859-11:2001 to Unicode, Unicode Consortium
- [5] IBM; Unicode Consortium. "convrtrs.txt". International Components for Unicode. v. 59180.0.1. Quote: "Yes ibm-874 == ibm-9066. ibm-1161 has the euro update."
- [6] "Code page 874 information document". Archived from the original on 2017-01-16.
- [7] "CCSID 874 information document". Archived from the original on 2016-03-27.
- [8] "CCSID 9066 information document". Archived from the original on 2016-03-27.
- [9] IBM. "Code Page CPGID 00874" (PDF). REGISTRY: Graphic Character Sets and Code Pages.
- [10] Code Page CPGID 00874 (txt), IBM
- [11] "Converter Explorer: ibm-874_P100-1995". International Components for Unicode. Unicode Consortium.
- [12] "Code Page 01161" (PDF).
- [13] "CCSID 1161 information document". Archived from the original on 2016-03-27.
- [14] "Code page 1162 information document". Archived from the original on 2016-03-17.
- [15] "CCSID 1162 information document". Archived from the original on 2016-03-27.
- [16] "Code Page 01162" (PDF).
- [17] Steele, Shawn (1998-02-28). "cp874 to Unicode table". Unicode Consortium, Microsoft.
- [18] Code Page CPGID 01162 (txt), IBM
- [19] International Components for Unicode (ICU), ibm-1162_P100-1999.ucm, 2002-12-03
- [20] Apple (2005-04-05). "Map (external version) from Mac OS Thai character set to Unicode 3.2 and later". Unicode Consortium.
External links
- ISO/IEC 8859-11:2001
- ISO/IEC 8859-11:1999 - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published December 15, 2001)
- Windows code page 874
- ISO-IR 166 Thai character set (July 13, 1992, from Thai Standard TIS 620-2533 (1990))
- Standardization and Implementations of Thai Language (PDF, 175 KB)