Unicode character property
Unicode assigns character properties to each codepoint.[1] These properties are used when processing text, for example for line breaking, determining script direction (such as right-to-left), or identifying the script. Somewhat inconsistently, some character properties are also defined for codepoints that have no character assigned and for codepoints designated as noncharacters ("not a character").
Each property has a level of forcefulness: normative, informative, contributory, or provisional. Technically, a property may be assigned by specifying a range of codepoints.
The character properties are grouped into these topics:[1]
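As an illustration of assignment by range, the following minimal sketch parses lines written in the "first..last ; value # comment" style used by Unicode Character Database data files. The sample lines are illustrative, and the helper names parse_range_line and lookup are hypothetical, not part of any standard library.

```python
def parse_range_line(line):
    """Parse 'XXXX..YYYY ; Value # comment' into (first, last, value).

    Returns None for blank or comment-only lines.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    codepoints, value = (field.strip() for field in data.split(";", 1))
    if ".." in codepoints:
        first, last = (int(cp, 16) for cp in codepoints.split("..", 1))
    else:
        first = last = int(codepoints, 16)  # a single codepoint, not a range
    return first, last, value


def lookup(cp, ranges, default=None):
    """Return the property value of codepoint cp, or default if no range matches."""
    for first, last, value in ranges:
        if first <= cp <= last:
            return value
    return default


# Illustrative sample lines (assumed here, not read from a real data file).
sample = [
    "# General Category, sample excerpt",
    "0020          ; Zs #       SPACE",
    "0041..005A    ; Lu #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
    "0061..007A    ; Ll #  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z",
]
ranges = [r for r in map(parse_range_line, sample) if r is not None]
```

With these sample ranges, lookup(ord("Q"), ranges) yields "Lu". A linear scan is enough for a sketch; a real implementation would more likely sort the ranges and use binary search.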
- Name
- General Category
- Other important general characteristics
- Display-related properties (bidirectional class, shaping, mirroring, width, and so on)
- Casing (upper, lower, title, folding—both simple and full)
- Numeric values and types
- Script and Block
- Normalization properties (decompositions, decomposition type, canonical combining class, composition exclusions, and so on)
- Age (version of the standard in which the code point was first designated)
- Boundaries (grapheme cluster, word, line, and sentence)
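Several of these properties can be queried programmatically. The sketch below uses Python's standard unicodedata module to collect a handful of them for one character. The function name describe and the dictionary keys are chosen here for illustration. The module covers only part of the list above: it has no functions for Script, Block, Age, or the boundary properties.

```python
import unicodedata


def describe(ch):
    """Collect a few Unicode properties of a single character."""
    return {
        "name": unicodedata.name(ch, None),              # None if unnamed
        "general_category": unicodedata.category(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "mirrored": bool(unicodedata.mirrored(ch)),
        "east_asian_width": unicodedata.east_asian_width(ch),
        "numeric_value": unicodedata.numeric(ch, None),  # None if not numeric
        "combining_class": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),  # "" if none
    }


for ch in ("A", "(", "\u00bd", "\u00e9", "\u05d0"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

The values depend on the version of the Unicode data bundled with the Python interpreter, which is available as unicodedata.unidata_version.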
General Category
Each codepoint is assigned a value for its General Category. This is one of the character properties that are also defined for unassigned codepoints and for codepoints designated as noncharacters ("not a character").
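A short sketch, again assuming Python's standard unicodedata module, shows that a General Category value is returned for every codepoint, whether or not it holds an assigned character. The helper name general_category is hypothetical.

```python
import unicodedata


def general_category(codepoint):
    """Return the two-letter General Category of an integer codepoint."""
    return unicodedata.category(chr(codepoint))


examples = {
    0x0041: "assigned letter",
    0xE000: "private use",
    0xD800: "surrogate",
    0xFDD0: "noncharacter",
    0xFFFF: "noncharacter",
    0x10FFFF: "noncharacter (last codepoint)",
}

for cp, label in examples.items():
    print(f"U+{cp:04X} {label}: {general_category(cp)}")
```

Noncharacters report the same value, Cn, as unassigned (reserved) codepoints, so General Category alone does not distinguish the two.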