Module talk:BaseConvert
Fringe feature
About possible hexadecimal input: Unicode specifies that the regular characters "0...9A...F" can be recognised as hexadecimal digits. No separate characters are defined for hex notation. All fine. However, Unicode marks more characters as hexadecimal digits: the so-called Fullwidth form characters.
Characters in Unicode marked Hex_Digit=Yes [a]

| Characters | Block and case | Note |
|---|---|---|
| 0123456789ABCDEF | Basic Latin, capitals | Also ASCII_Hex_Digit=Yes |
| 0123456789abcdef | Basic Latin, small letters | Also ASCII_Hex_Digit=Yes |
| ０１２３４５６７８９ＡＢＣＤＥＦ | Fullwidth forms, capitals | |
| ０１２３４５６７８９ａｂｃｄｅｆ | Fullwidth forms, small letters | |

a. ^ "Unicode 16.0 UCD: PropList.txt". 2024-05-31. Retrieved 2024-09-13.
Their code points are U+FF10..U+FF19 (digits 0-9), U+FF21..U+FF26 (A-F), and U+FF41..U+FF46 (a-f). So, generally speaking, hex input could arrive in this form as regular Unicode input. We could decide that this input should be recognised (and so converted in this template).
When or where does that Fullwidth input occur? The characters are pre-Unicode glyphs. If I understand correctly, they are used in East Asian (CJK) texts, possibly for Western quotes (including Western numbers). So including them would make BaseConvert a more complete, generic template, especially when the module is exported from enwiki. -DePiep (talk) 21:43, 24 February 2013 (UTC)
- OK, it makes sense that this could be useful if this module is exported to other language wikis. I've added normalization of full-width chars. Toohool (talk) 00:11, 25 February 2013 (UTC)
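
For illustration only, the normalization discussed above could be sketched as follows. This is not the module's actual code (the module is written in Lua); it is a minimal Python sketch, and the names `normalize_fullwidth_hex` and `parse_hex` are hypothetical. It assumes only the three code point ranges listed in this thread and relies on the fact that each Fullwidth hex digit sits exactly 0xFEE0 above its Basic Latin counterpart.

```python
# Hypothetical sketch; not the actual BaseConvert implementation.
# Each Fullwidth hex digit is its ASCII counterpart shifted by 0xFEE0
# (for example U+FF10 - U+0030 = 0xFEE0).
FULLWIDTH_OFFSET = 0xFEE0


def _is_fullwidth_hex(ch):
    """Return True for the Fullwidth characters with Hex_Digit=Yes."""
    cp = ord(ch)
    return (
        0xFF10 <= cp <= 0xFF19      # Fullwidth digits 0-9
        or 0xFF21 <= cp <= 0xFF26   # Fullwidth capitals A-F
        or 0xFF41 <= cp <= 0xFF46   # Fullwidth small letters a-f
    )


def normalize_fullwidth_hex(text):
    """Map Fullwidth hex digits to Basic Latin; leave other characters alone."""
    return "".join(
        chr(ord(ch) - FULLWIDTH_OFFSET) if _is_fullwidth_hex(ch) else ch
        for ch in text
    )


def parse_hex(text):
    """Normalize Fullwidth input, then parse it as a base-16 integer."""
    return int(normalize_fullwidth_hex(text), 16)
```

With this sketch, `parse_hex("\uFF26\uFF26")` (Fullwidth "FF") would be handled the same way as `parse_hex("FF")`. Characters outside the three ranges are passed through unchanged, so other Fullwidth letters such as Ｇ are not silently converted.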