Code page 936 (Microsoft Windows)

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by HarJIT (talk | contribs) at 16:58, 20 December 2017. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Windows code page 936 (CP936, Windows-936) is Microsoft's character encoding for Simplified Chinese, one of the four double-byte character sets (DBCSs) for East Asian languages. Originally, Windows-936 was identical to GB 2312, but with the release of Windows 95 it was expanded to cover most of GBK; IBM calls this expanded version Code page 1386 (CP1386).

It was superseded by code page 54936 (GB 18030), but was still in widespread use as of 2014. The Windows command prompt uses CP936 as the default code page on Simplified Chinese installations, even though support for part of GB 18030 has been made mandatory for all software products sold in China. In 2002, the IANA Internet name GBK was registered with CP1386's mapping,[1] making it the de facto GBK definition on the Internet. CP1386 is a combination of Code page 1114 and Code page 1385.

The terms "CP936" (in the sense of "CP1386"), "GBK"[a] and "GB2312" are sometimes confused in software products. Code page 1386 is not identical to GBK, because a code page encodes characters while GBK only defines code points. In addition, the Euro sign (€), encoded as 0x80 in CP1386, is not defined in GBK. Conversely, 95 characters defined in GBK were initially not encoded in CP1386.
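The 0x80 difference can be illustrated with a minimal Python sketch. It assumes Python's built-in `gbk` codec as the base GBK mapping and layers the CP1386 Euro sign on top of it; the helper name `decode_cp936_like` is hypothetical and is not part of any library or of Windows itself. Because 0x80 is also a legal trail byte in GBK double-byte sequences, the sketch walks the input byte by byte rather than doing a blanket replacement.

```python
def decode_cp936_like(data: bytes) -> str:
    """Decode bytes as GBK, additionally treating a lone 0x80 as the Euro sign.

    Hypothetical helper for illustration only; it relies on Python's
    built-in "gbk" codec for every double-byte sequence.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte ASCII range.
            out.append(chr(b))
            i += 1
        elif b == 0x80:
            # CP1386 extension: 0x80 in lead position is the Euro sign.
            out.append("\u20ac")
            i += 1
        else:
            # Lead byte of a double-byte sequence; the codec raises
            # UnicodeDecodeError on invalid or truncated input.
            out.append(data[i:i + 2].decode("gbk"))
            i += 2
    return "".join(out)


if __name__ == "__main__":
    print(decode_cp936_like(b"\x80100"))        # prints "€100"
    print(decode_cp936_like(b"\xd6\xd0\x80"))   # prints "中€"
```

A byte 0x80 that follows a lead byte is consumed as part of the double-byte character, so only a 0x80 in lead position is read as the Euro sign.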

This has been partly resolved in later versions of Windows: as of Windows 7, all GBK characters outside the Private Use Area of the Unicode BMP can be displayed using code page 1386, but encoding the 95 characters was still not supported as of 2014. Nevertheless, "CP936" and "GBK" are often used interchangeably, because Microsoft products were popular on the Chinese market when GBK was published. Since GBK superseded GB2312 long ago, those two terms have also become virtually equivalent for many users, so "CP1386", "GBK" and "GB2312" are widely taken to mean the same thing even though they differ significantly. When most modern software products offer "GB2312" as a character encoding option, they mean partial support for GBK through CP1386 rather than strict GB2312. This can be observed in products such as Microsoft Internet Explorer and Notepad++.
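As one illustration of how these labels blur in practice, the following sketch inspects Python's standard codec registry. It shows only how Python treats the names, not how Windows or the products named above behave. The helpers `canonical_name` and `encodable` are hypothetical names introduced here; the traditional character 國 is used as an example of a character that GBK covers and GB 2312 does not.

```python
import codecs


def canonical_name(label: str) -> str:
    """Return the codec name that Python resolves an encoding label to."""
    return codecs.lookup(label).name


def encodable(text: str, encoding: str) -> bool:
    """Report whether every character of text can be encoded."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # Python registers "cp936" as an alias of its "gbk" codec.
    print(canonical_name("cp936"))       # prints "gbk"
    # "gb2312" is a separate, narrower codec.
    print(canonical_name("gb2312"))      # prints "gb2312"

    # A character in GB 2312 encodes identically under both codecs.
    print("中".encode("gb2312") == "中".encode("gbk"))   # prints "True"

    # A character outside GB 2312 only encodes under the wider codec.
    print(encodable("國", "gb2312"))     # prints "False"
    print(encodable("國", "gbk"))        # prints "True"
```

Text labelled "GB2312" that contains such characters can therefore only be decoded by software that quietly applies the wider mapping, which is the behaviour described above.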

Notes

  a. ^ GBK 1.0

References

  1. ^ "Character Sets". Retrieved 3 October 2016.