MIME / IANA | IBM852 |
---|---|
Alias(es) | cp852, 852, csPCp852[1] |
Language(s) | Gaj's Latin alphabet (Bosnian, Croatian, Serbian), Slovene, Czech, Slovak, Polish, Romanian, Hungarian |
Classification | OEM code page, extended ASCII |
Based on | OEM 850 (DOS-Latin 1), OEM 437 (OEM-US) |
Transforms / Encodes | ISO/IEC 8859-2 (reordered) |
Code page 852 (CCSID 852; also known as CP 852, IBM 00852, OEM 852 (Latin II),[2][3] and MS-DOS Latin 2[4]) is a code page used under DOS to write Central European languages that use the Latin script (such as Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak or Slovene).[5]
CCSID 9044 is the euro currency update of code page/CCSID 852.[6] In that update, byte AA encodes € instead of ¬.[7][8]
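As a rough illustration of this single-byte difference, the following Python sketch decodes CCSID 9044 data by reusing Python's built-in `cp852` codec and then substituting the euro sign. The helper name `decode_ccsid_9044` is made up for this example, and the sketch assumes that no codec for CCSID 9044 is available under its own name.

```python
def decode_ccsid_9044(data: bytes) -> str:
    """Decode bytes as code page 852, then apply the euro update at byte AA.

    Hypothetical helper for illustration only.
    """
    # In cp852 the NOT SIGN (U+00AC) is produced only by byte 0xAA, so
    # replacing the character after decoding is equivalent to remapping
    # that one byte to the EURO SIGN (U+20AC).
    return data.decode("cp852").replace("\u00ac", "\u20ac")


print(b"\xaa".decode("cp852"))        # plain code page 852
print(decode_ccsid_9044(b"100 \xaa"))  # euro update
```

Every other byte decodes exactly as it does in code page 852.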
Code page 852 (DOS Latin 2) differs substantially from ISO/IEC 8859-2 (ISO Latin-2), although both are informally referred to as "Latin-2" in different language regions.[9] It nevertheless includes every printable character of ISO 8859-2, in a different arrangement. That arrangement preserves a subset of the box-drawing characters of the original DOS code page 437 and sacrifices the others (those combining single and double lines) to make room for more letters with diacritics. Code page 850, the equivalent for ISO 8859-1, takes the same approach.
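The relationship between the two repertoires can be checked with a short Python sketch, assuming the standard library codecs `cp852` and `iso8859_2` match the tables referred to here. The helper `upper_half` is a name introduced only for this example.

```python
def upper_half(codec: str, start: int = 0x80) -> set[str]:
    """Characters a single-byte codec assigns to byte values start..0xFF."""
    return {bytes([b]).decode(codec) for b in range(start, 0x100)}


# Printable ISO 8859-2 characters occupy 0xA0..0xFF; 0x80..0x9F are controls.
latin2_printable = upper_half("iso8859_2", start=0xA0)
cp852_upper = upper_half("cp852")

# Characters of ISO 8859-2 with no counterpart in code page 852.
missing = latin2_printable - cp852_upper
print(sorted(missing))  # expected to be empty if the claim holds

# The same letter sits at a different byte value in each encoding.
z_caron = "\u017e"  # ž
print(z_caron.encode("iso8859_2"), z_caron.encode("cp852"))
```

Because the repertoire is shared but the byte values are not, text has to be transcoded rather than reinterpreted when it moves between the two encodings.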
This reduced box-drawing support caused display glitches in DOS applications that used box-drawing characters to render a GUI-like interface in text mode (e.g. Norton Commander). Several local, more language-specific encodings were devised to avoid the problem, for example the Kamenický encoding for Czech and Slovak[10] or the Mazovia encoding for Polish.
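To see which frame characters such applications lost, the sketch below compares the box-drawing characters of code pages 437 and 852 using Python's standard codecs. It assumes those codecs reflect the code pages described here; `box_drawing` is a helper name invented for the example.

```python
import unicodedata


def box_drawing(codec: str) -> set[str]:
    """Characters from the Unicode Box Drawing block (U+2500..U+257F)
    that a single-byte codec assigns to bytes 0x80..0xFF."""
    chars = {bytes([b]).decode(codec) for b in range(0x80, 0x100)}
    return {c for c in chars if 0x2500 <= ord(c) <= 0x257F}


# Box-drawing characters present in code page 437 but absent from 852.
dropped = box_drawing("cp437") - box_drawing("cp852")
for ch in sorted(dropped):
    print(ch, unicodedata.name(ch))

# A corner that mixes single and double lines cannot be encoded at all;
# with errors="replace" it degrades to a question mark.
print("╒".encode("cp852", errors="replace"))
```

The Unicode names printed for the dropped characters each mention both a single and a double line, which matches the description of what was sacrificed.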
Character set
The following table shows code page 852.[2][11] Each row gives the high hexadecimal digit of the byte value and each column the low digit. Only the upper half of the table (128–255) is shown; the lower half (0–127) is the same as in code page 437.
Code page 852[4][7][8][12]

 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8x | Ç | ü | é | â | ä | ů | ć | ç | ł | ë | Ő | ő | î | Ź | Ä | Ć |
9x | É | Ĺ | ĺ | ô | ö | Ľ | ľ | Ś | ś | Ö | Ü | Ť | ť | Ł | × | č |
Ax | á | í | ó | ú | Ą | ą | Ž | ž | Ę | ę | ¬ | ź | Č | ş | « | » |
Bx | ░ | ▒ | ▓ | │ | ┤ | Á | Â | Ě | Ş | ╣ | ║ | ╗ | ╝ | Ż | ż | ┐ |
Cx | └ | ┴ | ┬ | ├ | ─ | ┼ | Ă | ă | ╚ | ╔ | ╩ | ╦ | ╠ | ═ | ╬ | ¤ |
Dx | đ | Đ | Ď | Ë | ď | Ň | Í | Î | ě | ┘ | ┌ | █ | ▄ | Ţ | Ů | ▀ |
Ex | Ó | ß | Ô | Ń | ń | ň | Š | š | Ŕ | Ú | ŕ | Ű | ý | Ý | ţ | ´ |
Fx | SHY | ˝ | ˛ | ˇ | ˘ | § | ÷ | ¸ | ° | ¨ | ˙ | ű | Ř | ř | ■ | NBSP |
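The rows above can be regenerated from a codec, which is a convenient way to cross-check a transcription of the table. The following Python sketch assumes the standard library `cp852` codec agrees with the sources cited for the table; `LABELS` and `cp852_rows` are names introduced for this example.

```python
# Invisible characters are shown by name, as in the table above.
LABELS = {"\u00ad": "SHY", "\u00a0": "NBSP"}


def cp852_rows() -> list[str]:
    """Build one text row per high nibble 8..F of code page 852."""
    rows = []
    for high in range(0x8, 0x10):
        cells = []
        for low in range(0x10):
            ch = bytes([(high << 4) | low]).decode("cp852")
            cells.append(LABELS.get(ch, ch))
        rows.append(f"{high:X}x | " + " | ".join(cells) + " |")
    return rows


for row in cp852_rows():
    print(row)
```

Comparing the printed rows with the table line by line would reveal any cell that was mistyped or misplaced.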
References
1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
2. "OEM 852". Go Global Developer Center. Microsoft. Retrieved 11 Nov 2011.
3. "Code Pages Supported by Windows: OEM Code Pages". Go Global Developer Center. Microsoft. Archived from the original on 2 November 2011. Retrieved 11 Oct 2011.
4. "Code Page 852 DOS Latin 2". Developing International Software. Microsoft. Retrieved 11 Nov 2011.
5. "CCSID 852 information document". Archived from the original on 2016-03-27.
6. "CCSID 9044 information document". Archived from the original on 2016-03-27.
7. Code Page CPGID 00852 (PDF), IBM
8. Code Page CPGID 00852 (TXT), IBM
9. "The Czech and Slovak Character Encoding Mess Explained". luki.sdf-eu.org. Retrieved 2022-02-27.
10. The Czech and Slovak Character Encoding Mess Explained / Kamenicky
11. "cp852_DOSLatin2 to Unicode table" (TXT). The Unicode Consortium. Retrieved 11 Nov 2011.
12. International Components for Unicode (ICU), ibm-852_P100-1995.ucm, 2002-12-03