Character entity reference

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 217.169.121.9 (talk) at 13:38, 25 February 2008 (Typographical updates). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the markup languages SGML, HTML, XHTML and XML, a character entity reference is a reference to a particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition (DTD). The replacement text of the entity consists of a single character from the Universal Character Set (Unicode). The purpose of a character entity reference is to provide a way to refer to a character that cannot be represented directly in every character encoding.
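As a rough illustration of an entity declared in a DTD whose replacement text is a single character, the following Python sketch parses a small XML document using the standard library's xml.etree.ElementTree. The document, its element name "word" and the variable names are made up for this example; only the entity name "eacute" (U+00E9) is borrowed from HTML. The sketch assumes the parser expands entities declared in the internal DTD subset, which ElementTree's default parser does.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares a named entity
# whose replacement text is the single character U+00E9.
document = (
    '<!DOCTYPE word [<!ENTITY eacute "&#233;">]>'
    "<word>caf&eacute;</word>"
)

# The parser replaces the reference &eacute; with the declared character.
word = ET.fromstring(document).text
print(word == "caf\u00e9")
```

Without the declaration in the DOCTYPE, the same reference would be rejected by the parser as an undefined entity, which is why such names must be predefined or declared before use.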

Strictly speaking, at least in XML, the term "character entity reference" is incorrect. XML has two relevant concepts:

  • a "predefined entity reference" is a reference to one of the special characters denoted by &lt;, &gt;, &amp;, &quot;, or &apos;;
  • while a "character reference" (or "numeric character reference") is a construct such as &#xA0; or &#160; that refers to a character by means of its numeric Unicode code point.
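The two XML concepts can be compared in a short Python sketch using the standard library's xml.etree.ElementTree. The element name "t" and the variable names are arbitrary choices for this example. It parses the five predefined entity references (&lt;, &gt;, &amp;, &quot;, &apos;), then a hexadecimal and a decimal numeric character reference (&#xA0; and &#160;) that both name the code point U+00A0.

```python
import xml.etree.ElementTree as ET

# Predefined entity references: the five names XML provides without a DTD.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>").text

# Numeric character references: hexadecimal and decimal forms of the same
# code point, U+00A0 (no-break space).
hex_form = ET.fromstring("<t>&#xA0;</t>").text
dec_form = ET.fromstring("<t>&#160;</t>").text

print(predefined)
print(hex_form == dec_form == "\u00a0")
```

In both cases the parser hands back plain characters. The numeric forms identify the character directly by its code point, with no named entity involved.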

Although in popular usage character references are often called "entity references" or even "entities", this usage is wrong. A character reference is a reference to a character, not to an entity.

See also