Jump to content

Tunstall coding

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Espadrine (talk | contribs) at 10:36, 20 January 2013. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science and information theory, Tunstall coding is a form of entropy coding used for lossless data compression.

History

Tunstall coding was the subject of Brian Parker Tunstall's PhD thesis in 1967, while at Georgia Institute of Technology. The subject of that thesis was "Synthesis of noiseless compression codes" [1].

Its design is a precursor to Lempel-Ziv.

Relation to similar codes

Unlike variable-length codes, which include Huffman and Lempel–Ziv coding, Tunstall coding is a code which maps source symbols to a fixed number of bits.

Unlike typical set encoding, Tunstall coding parses a stochastic source with codewords of variable length.

Algorithm

Example

Tunstall "hello, world" example — one iteration

Tunstall "hello, world" example — two iterations

Limitations

Tunstall coding requires the algorithm to know, prior to the parsing operation, what the distribution of probabilities for each letter of the alphabet is. This issue is shared with Huffman coding.

Its requiring a fixed-length word output makes it lesser than Lempel-Ziv, which has a similar dictionary-based design, but with a variable-sized word output.


References

  1. ^ [1]