Tunstall coding
This article, Tunstall coding, has recently been created via the Articles for creation process. Please check to see if the reviewer has accidentally left this template after accepting the draft and take appropriate action as necessary.
Reviewer tools: Inform author |
In computer science and information theory, Tunstall coding is a form of entropy coding used for lossless data compression.
History
Tunstall coding was the subject of Brian Parker Tunstall's PhD thesis in 1967, while at Georgia Institute of Technology. The subject of that thesis was "Synthesis of noiseless compression codes" [1].
Its design is a precursor to Lempel-Ziv.
Relation to similar codes
Unlike variable-length codes, which include Huffman and Lempel–Ziv coding, Tunstall coding is a code which maps source symbols to a fixed number of bits.
Unlike typical set encoding, Tunstall coding parses a stochastic source with codewords of variable length.
Algorithm
Example
Tunstall coding — one iteration.
Tunstall coding — two iterations.
Limitations
Tunstall coding requires the algorithm to know, prior to the parsing operation, what the distribution of probabilities for each letter of the alphabet is. This issue is shared with Huffman coding.
Its requiring a fixed-length word output makes it lesser than Lempel-Ziv, which has a similar dictionary-based design, but with a variable-sized word output.
References