Draft:Digital alphabet
![]() | Review waiting, please be patient.
This may take 3 months or more, since drafts are reviewed in no specific order. There are 3,154 pending submissions waiting for review.
Where to get help
How to improve a draft
You can also browse Wikipedia:Featured articles and Wikipedia:Good articles to find examples of Wikipedia's best writing on topics similar to your proposed article. Improving your odds of a speedy review To improve your odds of a faster review, tag your draft with relevant WikiProject tags using the button below. This will let reviewers know a new draft has been submitted in their area of interest. For instance, if you wrote about a female astronomer, you would want to add the Biography, Astronomy, and Women scientists tags. Editor resources
Reviewer tools
|
Digital alphabet is a term used in information theory, telecommunication, and computer science for any finite, discrete set of symbols chosen to represent information in a machine-readable form. Typical examples range from the binary alphabet {0, 1}
that underpins modern electronics to large character repertoires such as Unicode. Digital alphabets make it possible to encode, transmit, store, and decode messages reliably because every symbol is unambiguously distinguishable from every other.[1][2]
Definition and scope
[edit]A digital alphabet can be formalised as a finite alphabet used by a discrete source.Claude Shannon’s 1948 paper defined a noiseless source as one that “chooses successively from a set of symbols” that constitute an alphabet.[3]
Although the binary alphabet of two symbols is the simplest and today the most widespread, historical and contemporary systems employ alphabets of many sizes—for example the 5-bit Baudot code’s 32 symbols, ASCII’s original 128, or Unicode’s >149 000 characters.[4][5][6]
Historical development
[edit]Early telegraphy
[edit]- Baudot code (1870 – 1874) replaced variable-length Morse symbols with fixed-length five-unit patterns, laying the groundwork for later digital codes.[7][8]
- The Baudot family evolved into the International Telegraph Alphabet No. 2 (ITA 2) and remained in telex use until the 1960s.[9]
ASCII and the early computer era
[edit]In 1963 the American Standards Association adopted ASCII, a 7-bit, 128-symbol digital alphabet designed for English-language data exchange between computers and peripherals.[10] ASCII’s fixed length and simple parity bit made it attractive for early serial links, but it proved inadequate for multilingual computing.[5]
Unicode and universal character sets
[edit]Unicode (first published 1991) provides a single digital alphabet intended to encode every writing system. As of version 16.0 it contains more than 149 000 characters across 168 scripts, plus numerous symbols and emojis.[11] Variable-length UTF-8, UTF-16, and UTF-32 encodings preserve backward compatibility with ASCII while permitting much larger alphabets.[6]
Theoretical properties
[edit]Alphabet size and information content
[edit]Shannon showed that the maximum information conveyed per symbol (entropy H) depends on both symbol probabilities and alphabet size. For a binary alphabet the maximum entropy is 1 bit per symbol; generalising to an n-ary alphabet yields bits per perfectly random symbol.[12]
Binary alphabet
[edit]Research highlights the “curious case of the binary alphabet”: certain coding-theory and privacy results that hold for take different forms when .[13]
Redundancy and error control
[edit]By adding check symbols from the same alphabet (parity bits, CRCs), a code can detect or correct errors introduced in a noisy channel, trading redundancy for reliability—an idea central to modern channel coding.[12]
Applications
[edit]Domain | Role of the digital alphabet | Typical alphabet | Example |
---|---|---|---|
Telecommunications | Serial-line encoding, character framing | ASCII, ITA 2 | Telex, RS-232 |
Data storage | Magneto-electric patterns for bytes | Binary 8-bit | SSD, HDD |
Internet protocols | Packet payload text | UTF-8 (Unicode) | HTTP headers, JSON |
Optical & radio links | Modulation symbols (QAM, PSK) | Binary, quaternary, 256-QAM symbols | Wi-Fi, LTE |
Synthetic biology | Expanded DNA alphabets for data storage or biotechnology | A,C,G,T + artificial bases (e.g. P,Z) | Six-letter DNA aptamers[14] |
Relation to other concepts
[edit]- Alphabet in formal languages
- Character encoding
- Source coding and data compression
- Channel coding and error-correcting codes
- Barcode and QR code, visual digital alphabets
See also
[edit]References
[edit]- ^ Selbsterklärende Codes: Papier und Digitale Codierung (Report). Universität Heidelberg. 2016. Retrieved 2025-05-13.
- ^ Cairncross, Frances (2001). The Death of Distance 2.0. Harvard Business School Press. ISBN 978-1-591-39098-6.
{{cite book}}
: Check|isbn=
value: checksum (help) - ^ Shannon, Claude E. (1948). "A Mathematical Theory of Communication". Bell System Technical Journal. 27 (3): 379–423. doi:10.1002/j.1538-7305.1948.tb01338.x.
- ^ "Baudot Code". Encyclopædia Britannica (online). 2025. Retrieved 2025-05-13.
- ^ a b "American Standard Code for Information Interchange (ASCII)". Investopedia. 2024-02-15. Retrieved 2025-05-13.
- ^ a b "The Unicode Standard – Technical Introduction". Unicode Consortium. 2023-09-12. Retrieved 2025-05-13.
- ^ "Émile Baudot Invents the Baudot Code". History of Information. 2024-04-11. Retrieved 2025-05-13.
- ^ Weisberger, Richard (2017-09-07). "The Roots of Computer Code Lie in Telegraph Code". Smithsonian Magazine. Retrieved 2025-05-13.
- ^ "International Telegraph Alphabet No. 2 (ITA 2)". International Telecommunication Union. Retrieved 2025-05-13.
- ^ "Breaking the Language Barrier". Wired. 1993-10-15. Retrieved 2025-05-13.
- ^ "Emoji Counts v16.0". Unicode Consortium. 2024-09-12. Retrieved 2025-05-13.
- ^ a b "Information Theory Lecture Notes" (PDF). University of Auckland. 2004. Retrieved 2025-05-13.
- ^ Jiao, Jiantao; et al. (2015). "Information Measures: The Curious Case of the Binary Alphabet". IEEE Transactions on Information Theory. 61 (2): 779–800. arXiv:1401.6060. doi:10.1109/TIT.2014.2368555.
- ^ "Chemists Invent New Letters for Nature's Genetic Alphabet". Wired. 2015-04-07. Retrieved 2025-05-13.