Character entity reference

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by UU (talk | contribs) at 19:28, 17 January 2008 (add decimal escape). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the markup languages SGML, HTML, XHTML and XML, a character entity reference is a reference to a particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition (DTD). The replacement text of the entity consists of a single character from the Universal Character Set (Unicode). The purpose of a character entity reference is to provide a way to refer to a character that cannot be written directly in every character encoding a document might use.
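As an illustration, the following minimal sketch uses Python's standard xml.etree.ElementTree module to parse a document whose internal DTD subset declares a named entity with a single-character replacement text. The element name t and the sample text are chosen only for this example; the entity name nbsp mirrors the HTML entity of the same name but is declared here explicitly.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring a named entity whose replacement text
# is the single character U+00A0 (no-break space).
DOC = '<!DOCTYPE t [<!ENTITY nbsp "&#160;">]><t>a&nbsp;b</t>'

# The parser substitutes the replacement text for the reference &nbsp;.
text = ET.fromstring(DOC).text
```

After parsing, text holds the letters a and b separated by the single character U+00A0, even though the document itself contains only ASCII.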

Strictly speaking, at least in XML, the term "character entity reference" is a misnomer. XML has two relevant concepts: a "predefined entity reference" is a reference to one of the five predefined entities, written &lt;, &gt;, &amp;, &quot;, or &apos;, which stand for the special characters <, >, &, ", and '; while a "character reference" (or "numeric character reference") is a construct such as &#160; or &#xA0; that refers to a character by means of its numeric Unicode code point. Although in popular usage character references are often called "entity references" or even "entities", this usage is inaccurate: a character reference is a reference to a character, not to an entity.
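The distinction can be observed with an XML parser. The sketch below, again using Python's standard xml.etree.ElementTree, resolves the five predefined entity references and two numeric character references, and checks that a named entity that has not been declared is rejected. The helper undeclared_entity_fails and the element name t are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Predefined entity references: XML predefines exactly these five names.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>").text

# Numeric character references: decimal and hexadecimal forms of U+00A0.
decimal_ref = ET.fromstring("<t>&#160;</t>").text
hex_ref = ET.fromstring("<t>&#xA0;</t>").text


def undeclared_entity_fails(name: str) -> bool:
    """Return True if parsing a reference to an undeclared entity fails."""
    try:
        ET.fromstring(f"<t>&{name};</t>")
    except ET.ParseError:
        return True
    return False
```

Both numeric forms yield the same character, because they name a code point directly and need no declaration. A named reference such as &nbsp;, by contrast, is an error in XML unless a DTD declares the entity.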

See also