Template:Bidi Class (Unicode)

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by DePiep (talk | contribs) at 00:21, 13 January 2011 (table in template. (also known as "bidi character type")). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Bidirectional character type (Unicode character property Bidi_Class)[1]

| Type[2] | Description | Strong/Weak effect | General scope | Bidi_Control character[3] |
|---|---|---|---|---|
| L | Left-to-Right | Strong | Most alphabetic and syllabic characters, Han ideographs, non-European or non-Arabic digits, LRM character, ... | U+200E LEFT-TO-RIGHT MARK (LRM) |
| LRE | Left-to-Right Embedding | Strong | LRE character only | U+202A LEFT-TO-RIGHT EMBEDDING (LRE)[2] |
| LRO | Left-to-Right Override | Strong | LRO character only | U+202D LEFT-TO-RIGHT OVERRIDE (LRO) |
| R | Right-to-Left | Strong | Hebrew alphabet and related punctuation, RLM character | U+200F RIGHT-TO-LEFT MARK (RLM) |
| AL | Right-to-Left Arabic | Strong | Arabic, Thaana and Syriac alphabets, and most punctuation specific to those scripts | |
| RLE | Right-to-Left Embedding | Strong | RLE character only | U+202B RIGHT-TO-LEFT EMBEDDING (RLE) |
| RLO | Right-to-Left Override | Strong | RLO character only | U+202E RIGHT-TO-LEFT OVERRIDE (RLO) |
| PDF | Pop Directional Format | Weak | PDF character only | U+202C POP DIRECTIONAL FORMATTING (PDF) |
| EN | European Number | Weak | European digits, Eastern Arabic-Indic digits, ... | |
| ES | European Separator | Weak | plus sign, minus sign, ... | |
| ET | European Number Terminator | Weak | degree sign, currency symbols, ... | |
| AN | Arabic Number | Weak | Arabic-Indic digits, Arabic decimal and thousands separators, ... | |
| CS | Common Number Separator | Weak | colon, comma, full stop, no-break space, ... | |
| NSM | Nonspacing Mark | Weak | Characters in General Categories Mark, nonspacing and Mark, enclosing (Mn, Me) | |
| BN | Boundary Neutral | Weak | Default ignorables, non-characters, control characters other than those explicitly given other types | |
| B | Paragraph Separator | Neutral | paragraph separator, appropriate Newline Functions, higher-level protocol paragraph determination | |
| S | Segment Separator | Neutral | Tab | |
| WS | Whitespace | Neutral | space, figure space, line separator, form feed, General Punctuation block spaces | |
| ON | Other Neutrals | Neutral | All other characters, including object replacement character | |
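
As an illustration of how the Bidi_Class values in the table can be inspected, the following is a minimal Python sketch using the standard library function `unicodedata.bidirectional`. The helper names `bidi_class` and `strength` and the `STRENGTH` mapping are hypothetical, introduced only for this example; the grouping is copied from the table. Note that `unicodedata` reflects the Unicode version bundled with the Python interpreter, which may define types that did not exist in version 6.0.0; the sketch reports those as "Unknown".

```python
import unicodedata

# Strong/Weak/Neutral grouping copied from the table above (Unicode 6.0.0 types).
# This mapping is an illustrative assumption, not part of any library API.
STRENGTH = {
    "Strong": {"L", "LRE", "LRO", "R", "AL", "RLE", "RLO"},
    "Weak": {"PDF", "EN", "ES", "ET", "AN", "CS", "NSM", "BN"},
    "Neutral": {"B", "S", "WS", "ON"},
}


def bidi_class(ch):
    """Return the Bidi_Class abbreviation of a single character, e.g. 'L' or 'AN'."""
    return unicodedata.bidirectional(ch)


def strength(ch):
    """Return 'Strong', 'Weak' or 'Neutral' for a character, per the table above."""
    cls = bidi_class(ch)
    for name, members in STRENGTH.items():
        if cls in members:
            return name
    # Unassigned code points (empty string) or types added after Unicode 6.0.0.
    return "Unknown"


if __name__ == "__main__":
    samples = ["A", "\u05D0", "\u0627", "1", "\u0661", ",", "\t", " ", "!"]
    for ch in samples:
        print(f"U+{ord(ch):04X}", bidi_class(ch), strength(ch))
```

Running the script prints one line per sample character with its code point, Bidi_Class abbreviation and strength group.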
Notes
1. ^ Unicode Bidirectional Algorithm (UAX#9), as of version 6.0.0
2. ^ Possible bidirectional character types, that is, values of the character property Bidi_Class (also called 'type')
3. ^ Bidi_Control characters: Seven Bidi_Control formatting characters are defined. They are invisible and have no effect apart from directionality. Five of them have a unique, overruling Bidi type that is used by the algorithm; their type is also their acronym (e.g. character 'LRE' has Bidi type 'LRE').
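
The statement in note 3 can be checked with a short Python sketch, again assuming the standard library's `unicodedata` module. The dictionary `BIDI_CONTROLS` and the function `own_type_controls` are hypothetical names for this example; the code points are the seven listed in the table. Later Unicode versions define further Bidi_Control characters, which are deliberately left out here.

```python
import unicodedata

# The seven Bidi_Control characters listed in the table (Unicode 6.0.0),
# keyed by acronym. Later Unicode versions add more; they are omitted here.
BIDI_CONTROLS = {
    "LRM": "\u200E",
    "RLM": "\u200F",
    "LRE": "\u202A",
    "RLE": "\u202B",
    "PDF": "\u202C",
    "LRO": "\u202D",
    "RLO": "\u202E",
}


def own_type_controls():
    """Return the acronyms of controls whose Bidi_Class equals their acronym."""
    return sorted(
        acronym
        for acronym, ch in BIDI_CONTROLS.items()
        if unicodedata.bidirectional(ch) == acronym
    )


if __name__ == "__main__":
    for acronym, ch in BIDI_CONTROLS.items():
        print(acronym, f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))
    print("Own-type controls:", own_type_controls())
```

The two remaining characters, LRM and RLM, carry the ordinary strong types L and R, consistent with their rows in the table.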