Draft:Digital alphabet

In information theory, telecommunication, and computer science, a digital alphabet is any finite, discrete set of symbols chosen to represent information in machine-readable form. Typical examples range from the binary alphabet {0, 1} that underpins modern electronics to large character repertoires such as Unicode. Digital alphabets make it possible to encode, transmit, store, and decode messages reliably because every symbol is unambiguously distinguishable from every other.[1][2]
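
The role of a digital alphabet can be illustrated with a toy block code. The following Python sketch is illustrative only; the four-symbol alphabet and its 2-bit codewords are hypothetical, not drawn from the cited sources. It shows how fixed-length codewords over {0, 1} make encoding and decoding unambiguous:

    # Hypothetical four-symbol alphabet mapped to fixed-length binary codewords.
    CODEBOOK = {"A": "00", "B": "01", "C": "10", "D": "11"}
    DECODE = {bits: sym for sym, bits in CODEBOOK.items()}

    def encode(message: str) -> str:
        """Replace each symbol with its 2-bit codeword."""
        return "".join(CODEBOOK[s] for s in message)

    def decode(bitstream: str) -> str:
        """Fixed codeword length makes parsing the stream unambiguous."""
        return "".join(DECODE[bitstream[i:i + 2]]
                       for i in range(0, len(bitstream), 2))

    assert decode(encode("BADCAB")) == "BADCAB"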

Definition and scope

A digital alphabet can be formalised as the finite alphabet used by a discrete source. Claude Shannon’s 1948 paper defined a noiseless source as one that “chooses successively from a set of symbols” that constitute an alphabet.[3]

Although the two-symbol binary alphabet is the simplest and today the most widespread, historical and contemporary systems employ alphabets of many sizes: the 5-bit Baudot code’s 32 symbols, ASCII’s original 128, or Unicode’s more than 149 000 characters.[4][5][6]
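
The relationship between alphabet size and codeword length can be sketched numerically (an illustration, not taken from the cited sources): the shortest fixed-length binary codeword able to distinguish all n symbols is ceil(log2 n) bits, which reproduces the 5-bit and 7-bit figures above.

    import math

    def bits_per_symbol(alphabet_size: int) -> int:
        # Smallest fixed number of binary digits that can index every symbol.
        return math.ceil(math.log2(alphabet_size))

    print(bits_per_symbol(2))        # 1  -- binary alphabet {0, 1}
    print(bits_per_symbol(32))       # 5  -- Baudot / ITA 2
    print(bits_per_symbol(128))      # 7  -- original ASCII
    print(bits_per_symbol(149_000))  # 18 -- order of Unicode's repertoire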

Historical development

Early telegraphy

  • Baudot code (1870–1874) replaced variable-length Morse symbols with fixed-length five-unit patterns, laying the groundwork for later digital codes.[7][8]
  • The Baudot family evolved into the International Telegraph Alphabet No. 2 (ITA 2) and remained in telex use until the 1960s.[9]

ASCII and the early computer era

In 1963 the American Standards Association adopted ASCII, a 7-bit, 128-symbol digital alphabet designed for English-language data exchange between computers and peripherals.[10] ASCII’s fixed code length and the spare eighth bit, commonly used for parity, made it attractive for early serial links, but its small repertoire proved inadequate for multilingual computing.[5]
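
The even-parity convention often used with 7-bit ASCII on serial links can be sketched as follows (a minimal illustration, assuming even parity stored in the eighth bit; details varied between systems):

    def add_even_parity(char: str) -> int:
        """Extend a 7-bit ASCII code so the total number of 1-bits is even."""
        code = ord(char)
        assert code < 128, "original ASCII is 7-bit"
        parity = bin(code).count("1") % 2   # 1 if the data bits have odd weight
        return (parity << 7) | code

    def parity_ok(byte: int) -> bool:
        """Any single flipped bit makes the overall 1-bit count odd."""
        return bin(byte).count("1") % 2 == 0

    byte = add_even_parity("A")               # 0x41 has two 1-bits, parity bit 0
    assert parity_ok(byte)
    assert not parity_ok(byte ^ 0b0000_0100)  # a one-bit error is detected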

Unicode and universal character sets

Unicode (first published in 1991) provides a single digital alphabet intended to encode every writing system. As of version 16.0 it contains more than 149 000 characters across 168 scripts, plus numerous symbols and emoji.[11] The UTF-8, UTF-16, and UTF-32 encoding forms map this large alphabet onto code-unit sequences; the variable-length UTF-8 form additionally preserves backward compatibility with ASCII.[6]
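
The backward-compatibility property is easy to observe: pure-ASCII text yields identical bytes under UTF-8, while other characters expand to multi-byte sequences. A small Python illustration (not drawn from the cited sources):

    ascii_text = "Baudot"
    assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

    for ch in ("A", "é", "€", "😀"):
        encoded = ch.encode("utf-8")
        print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")
    # U+0041 -> 1 byte(s): 41
    # U+00E9 -> 2 byte(s): c3 a9
    # U+20AC -> 3 byte(s): e2 82 ac
    # U+1F600 -> 4 byte(s): f0 9f 98 80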

Theoretical properties

Alphabet size and information content

Shannon showed that the maximum information conveyed per symbol (the entropy H = −Σ pᵢ log₂ pᵢ) depends on both the symbol probabilities and the alphabet size. For a binary alphabet the maximum entropy is 1 bit per symbol; generalising to an n-ary alphabet yields log₂ n bits per perfectly random symbol.[12]
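
A short numerical illustration of the entropy bound (a sketch, not taken from the cited lecture notes):

    import math

    def entropy(probs):
        # H = -sum(p * log2(p)); terms with p = 0 contribute nothing.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))   # 1.0 bit   -- fair binary source
    print(entropy([0.9, 0.1]))   # ~0.469    -- biased binary source
    print(entropy([0.25] * 4))   # 2.0 bits  -- uniform 4-ary source = log2(4)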

Binary alphabet

Research highlights the “curious case of the binary alphabet”: certain coding-theory and privacy results that hold for alphabets of size n ≥ 3 take different forms when n = 2.[13]

Redundancy and error control

By adding check symbols from the same alphabet (parity bits, CRCs), a code can detect or correct errors introduced in a noisy channel, trading redundancy for reliability—an idea central to modern channel coding.[12]
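
A minimal sketch of this trade-off (illustrative; real protocols differ in framing details) appends a CRC-32 check value, four extra symbols drawn from the same byte alphabet, so the receiver can detect corruption:

    import zlib

    def frame(payload: bytes) -> bytes:
        """Append a 4-byte CRC-32 check value to the payload."""
        return payload + zlib.crc32(payload).to_bytes(4, "big")

    def check(received: bytes) -> bool:
        """Recompute the CRC over the payload and compare with the trailer."""
        payload, crc = received[:-4], received[-4:]
        return zlib.crc32(payload).to_bytes(4, "big") == crc

    sent = frame(b"digital alphabet")
    assert check(sent)
    corrupted = bytes([sent[0] ^ 0x01]) + sent[1:]  # flip one bit in transit
    assert not check(corrupted)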

Applications

Domain                | Role of the digital alphabet                             | Typical alphabet                             | Example
Telecommunications    | Serial-line encoding, character framing                  | ASCII, ITA 2                                 | Telex, RS-232
Data storage          | Magnetic or electrical patterns representing bytes       | Binary (8-bit bytes)                         | HDD, SSD
Internet protocols    | Packet payload text                                      | UTF-8 (Unicode)                              | HTTP headers, JSON
Optical & radio links | Modulation symbols (QAM, PSK)                            | Binary, quaternary, 256-QAM symbols          | Wi-Fi, LTE
Synthetic biology     | Expanded DNA alphabets for data storage or biotechnology | A, C, G, T plus artificial bases (e.g. P, Z) | Six-letter DNA aptamers[14]

References

  1. ^ Selbsterklärende Codes: Papier und Digitale Codierung [Self-explanatory codes: paper and digital coding] (Report). Universität Heidelberg. 2016. Retrieved 2025-05-13.
  2. ^ Cairncross, Frances (2001). The Death of Distance 2.0. Harvard Business School Press. ISBN 978-1-591-39098-6.
  3. ^ Shannon, Claude E. (1948). "A Mathematical Theory of Communication". Bell System Technical Journal. 27 (3): 379–423. doi:10.1002/j.1538-7305.1948.tb01338.x.
  4. ^ "Baudot Code". Encyclopædia Britannica (online). 2025. Retrieved 2025-05-13.
  5. ^ a b "American Standard Code for Information Interchange (ASCII)". Investopedia. 2024-02-15. Retrieved 2025-05-13.
  6. ^ a b "The Unicode Standard – Technical Introduction". Unicode Consortium. 2023-09-12. Retrieved 2025-05-13.
  7. ^ "Émile Baudot Invents the Baudot Code". History of Information. 2024-04-11. Retrieved 2025-05-13.
  8. ^ Weisberger, Richard (2017-09-07). "The Roots of Computer Code Lie in Telegraph Code". Smithsonian Magazine. Retrieved 2025-05-13.
  9. ^ "International Telegraph Alphabet No. 2 (ITA 2)". International Telecommunication Union. Retrieved 2025-05-13.
  10. ^ "Breaking the Language Barrier". Wired. 1993-10-15. Retrieved 2025-05-13.
  11. ^ "Emoji Counts v16.0". Unicode Consortium. 2024-09-12. Retrieved 2025-05-13.
  12. ^ a b "Information Theory Lecture Notes" (PDF). University of Auckland. 2004. Retrieved 2025-05-13.
  13. ^ Jiao, Jiantao; et al. (2015). "Information Measures: The Curious Case of the Binary Alphabet". IEEE Transactions on Information Theory. 61 (2): 779–800. arXiv:1401.6060. doi:10.1109/TIT.2014.2368555.
  14. ^ "Chemists Invent New Letters for Nature's Genetic Alphabet". Wired. 2015-04-07. Retrieved 2025-05-13.