Legacy encoding

In computing, a legacy encoding is a character encoding that cannot represent all of Unicode but is still used for compatibility or other reasons. Legacy encodings include national, international, and vendor encoding standards. ^[1]
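As a rough illustration of what "cannot represent all of Unicode" means in practice, the following Python sketch checks whether a string survives encoding. The helper name `can_encode` is hypothetical; the codec names are the standard ones shipped with Python.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # At least one character has no code point in this encoding.
        return False


# ISO-8859-1 covers Western European letters but not Japanese text;
# UTF-8, a Unicode encoding, covers both.
latin_ok = can_encode("café", "iso8859_1")
japanese_in_latin = can_encode("日本語", "iso8859_1")
japanese_in_utf8 = can_encode("日本語", "utf-8")
```

With strict error handling (Python's default), the failure is explicit. Other error modes such as `errors="replace"` would silently substitute a placeholder instead.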

Many legacy encodings predate Unicode. Others are slight modifications of older encodings, made to support important new characters such as the euro sign (€) or to satisfy countries that felt their language had significant omissions. The best-known such encoding is probably ISO-8859-15.
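The euro sign case can be sketched in Python using the built-in `iso8859_15` and `iso8859_1` codecs. ISO-8859-15 reuses byte 0xA4, which ISO-8859-1 assigns to the generic currency sign (¤), so the same byte reads differently depending on which encoding the reader assumes. The variable names are illustrative only.

```python
EURO = "\u20ac"  # the euro sign

# ISO-8859-15 assigns the euro sign to a single byte.
iso15_bytes = EURO.encode("iso8859_15")

# The older ISO-8859-1 has no euro sign at all.
try:
    EURO.encode("iso8859_1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

# Reading the ISO-8859-15 byte as ISO-8859-1 yields a different character.
misread = iso15_bytes.decode("iso8859_1")
```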

Legacy encodings are numerous, and include the following major groups:

The ISO-8859-n group of single-byte encodings.
The IBM/DOS/Windows OEM series of single-byte code pages (437, 850, and others).
The single-byte Windows "ANSI" code pages (125x).
The Windows multibyte code pages, which Windows uses as both the ANSI and OEM code pages for CJK languages.
Various other multibyte CJK encodings, such as ISO-2022 and EUC.
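To make the groups above concrete, the sketch below encodes the same text with one or more representatives of each group and records the resulting bytes. The selection of codecs and the helper name `encode_everywhere` are assumptions made for illustration; `shift_jis` stands in for the Windows CJK code pages, which for Japanese are based on Shift JIS.

```python
# One or more Python codec names per group listed above (illustrative picks).
ENCODINGS = [
    "iso8859_1",   # ISO-8859-n
    "cp437",       # OEM code page
    "cp850",       # OEM code page
    "cp1252",      # Windows "ANSI" code page
    "shift_jis",   # basis of the Japanese Windows multibyte code page
    "euc_jp",      # EUC
    "iso2022_jp",  # ISO-2022
]


def encode_everywhere(text: str) -> dict:
    """Map each codec name to the encoded bytes, or None if not representable."""
    results = {}
    for name in ENCODINGS:
        try:
            results[name] = text.encode(name)
        except UnicodeEncodeError:
            results[name] = None
    return results


accented = encode_everywhere("é")
kanji = encode_everywhere("日")
```

The single-byte encodings disagree on which byte represents "é" (0xE9 in ISO-8859-1 and Windows-1252, 0x82 in code pages 437 and 850), and none of them can represent "日", which the CJK encodings store as multiple bytes.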

References

^ "Processing database information using Unicode, a case study", IBM developerWorks, 1 September 1999
