
Collating sequence

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Miker@sundialservices.com (talk | contribs) at 12:56, 25 July 2006. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The term collating sequence refers to the order in which character strings should be placed when sorting them.

A common example is the familiar "alphabetical order," in which "Alfred" comes before "Zeus" because "A" precedes "Z" in the English alphabet. But a collating sequence must settle other questions as well, particularly in a computer system:

  • Upper and lower case: Should "Alfred" be placed before or after "alfred"? Generally one would say "neither," because an upper-case "A" and a lower-case "a" are usually considered to be the same letter. But a particular application may need to sort such records differently.
  • National characters, accents, tildes: Many languages place these marks over and around letters, but once again speakers of the language may consider the marked and unmarked characters to be "the same" for sorting purposes.
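
Both issues above can be illustrated with a small sketch in Python. The helper name fold_key and the sample names are hypothetical, chosen only for illustration, and the sketch assumes that case and accents should be ignored entirely, which is not correct for every language.

```python
import unicodedata

def fold_key(text):
    """Hypothetical sort key: ignore case and accents when comparing."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a stronger form of lower() meant for caseless matching.
    return base.casefold()

# "\u00c9" is the precomposed capital E with an acute accent.
names = ["Zeus", "alfred", "\u00c9mile", "Alfred", "emile"]

folded = sorted(names, key=fold_key)
print(folded)
# → ['alfred', 'Alfred', 'Émile', 'emile', 'Zeus']
```

Strings whose keys are equal, such as "alfred" and "Alfred", stay in their original relative order because Python's sort is stable. A language in which an accented letter counts as a separate letter would need a different key.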

In a computer system, each letter is necessarily assigned a unique numeric code (as in the ASCII or Unicode character sets), but the proper and customary ordering of strings is generally not obtained by a simple numeric comparison of those codes. Rather, the ordering is determined by reference to the collating sequence.
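
The difference between the two approaches can be sketched in Python as follows. The table SEQUENCE and the function collation_key are assumptions made for this example: they define one possible single-level collating sequence in which each lower-case letter is immediately followed by its upper-case form. Real collation systems are considerably more elaborate.

```python
import string

words = ["alfred", "Zeus", "Alfred", "zebra"]

# Plain sorted() compares strings by numeric character code, so every
# upper-case letter (65-90 in ASCII) precedes every lower-case one (97-122).
by_code = sorted(words)
print(by_code)
# → ['Alfred', 'Zeus', 'alfred', 'zebra']

# Hypothetical collating sequence: "aAbBcC...zZ".
SEQUENCE = "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)
# Map each character to its position in the sequence.
RANK = {ch: position for position, ch in enumerate(SEQUENCE)}

def collation_key(text):
    """Translate each character to its rank in SEQUENCE."""
    # Characters outside the table sort after all letters, by code.
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in text]

by_sequence = sorted(words, key=collation_key)
print(by_sequence)
# → ['alfred', 'Alfred', 'zebra', 'Zeus']
```

Here the ordering comes from the position of each character in the table rather than from its numeric code, so replacing the table changes the sort order without touching the comparison logic.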