Jump to content

Talk:Prefix code

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Incnis Mrsi (talk | contribs) at 12:54, 4 March 2011 (With a prefix code, a receiver can identify each word without requiring a special marker between words: new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Is UTF8 a prefix code in use today? If so, I think it should be included. 71.158.69.195 22:27, 20 March 2007 (UTC)[reply]

Comma-Free Codes

Comma-Free codes are not the same thing as prefix codes. Any set with 11 (or any code word comprised of only one bit for that matter) cannot be a comma-free code. This is shown in "Comma-Free Codes" by Golomb, Gordon and Welch in the Canadian Journal of Mathematics vol. 10, no. 2, pp. 202-209 (1958).

What definition of "comma" does Golomb use in that paper? --68.0.124.33 (talk) 14:21, 14 April 2008 (UTC)[reply]

Do I get it right that comma-free code is the same as self-synchronizing code? Jaan Vajakas (talk) 16:42, 23 October 2010 (UTC)[reply]

I don't think so, at least according to the definitions given here. The code with the single codeword 0101 is comma-free (the sequence 01010101 formed by two consecutive codewords does not have a codeword starting from its second symbol) but not self-synchronizing (it does have a codeword starting from its third symbol). —David Eppstein (talk) 18:45, 23 October 2010 (UTC)[reply]
The article says that code is comma-free if "the substring starting at the second symbol and ending at the second-last symbol does not contain any codewords". In your example, the aforementioned substring is 101010 and it does contain the codeword 0101 as a substring (but, indeed, not as a prefix). So if you are right then it should be specified in the article that "contains" means "contains as a prefix", to make the definition unambiguous. However, I don't quite see why such a definition would be useful, so I still rather believe that "contains" means "contains as a substring" (and hence comma-free = self-synchronizing). Jaan Vajakas (talk) 20:42, 23 October 2010 (UTC)[reply]
Probably you're right. —David Eppstein (talk) 22:31, 23 October 2010 (UTC)[reply]
I edited the pages Self-synchronizing code and Prefix code to reflect that self-synchronizing code = comma-free code, and redirected Comma-free code to Self-synchronizing code. Jaan Vajakas (talk) 19:28, 4 December 2010 (UTC)[reply]

With a prefix code, a receiver can identify each word without requiring a special marker between words

A very poor wording (although not a false statement), so the reader might imply that prefix property is a criterion of unique decoding of any coded string, while it is in fact a criterion of instant decoding, a slightly stronger requirement. It, sadly, was translated to several other languages.

N=2:

0  0?
01

N=3:

00 0?
001
01 0?

N=4:

000 0?
0001
001 0?
010 0?
0101

All coded strings
starting with 1
or containing
two adjacent 1
are invalid.

There are non-prefix uniquely decodable codes. These seem not have any practical significance, but are perfectly possible. Is the code {0, 01} uniquely decodable? Certainly yes, because it is "suffix code" (a prefix code read from right to left). Are coded strings instantly decodable, transmitted from left to left? Certainly not, but any coded string become decodable after receiving next (no more than) 1 bit. From any leftmost N bits (N>1) of a valid coded string we always can decode either all N bits (if the last bit is 1) or N−1 bits (if the last bit is 0). It is possible to construct even self-synchronized codes like this, which are not prefix codes. Any thoughts? Incnis Mrsi (talk) 12:54, 4 March 2011 (UTC)[reply]