Talk:Magic compression algorithm
A fact from Magic compression algorithm appeared on Wikipedia's Main Page in the Did you know column on 2 September 2007. A record of the entry may be seen at Wikipedia:Recent additions/2007/September.
2^N-2?
While I was thinking about how to explain the problem nicely, a thought occurred to me. If the compressed sequence has 2^N-1 possible combinations, then we have a problem with 0 length files. As an example, consider a plain sequence of N=4 bits - there are 16 such sequences. Then we can use compressed sequences of 3 bits, of which there are 2^3=8; or those of 2 bits, of which there are 2^2=4; or those of 1 bit, of which there are 2^1=2. However, we cannot use those of 0 bits, because there's no file left. Therefore there is no 2^0 term, and there are a total of 2^3+2^2+2^1=14 different compressed sequences, i.e. 2^N-2. See what I mean? Modest Genius talk 17:41, 28 August 2007 (UTC)
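The counting above can be checked mechanically. Below is a minimal Python sketch (an illustration added here, not part of the original comment); N = 4 follows the example, and the empty output is excluded as described:

```python
from itertools import product

N = 4
inputs = [''.join(bits) for bits in product('01', repeat=N)]
outputs = [''.join(bits)
           for k in range(1, N)   # lengths 1 .. N-1; the empty output is excluded
           for bits in product('01', repeat=k)]

print(len(inputs))   # 16, i.e. 2**N
print(len(outputs))  # 14, i.e. 2**N - 2
```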
- I do not think there's any reason to exclude the empty string from valid outputs. All common modern OSes can deal with empty files (though they customarily use byte granularity for determining file length).
- I think I see your point, but consider it an issue of implementation, similar to the use of prefix codes to facilitate padding. If the output stream has additional restrictions on its structure, this would obviously reduce the number of available output streams, but that does not change anything qualitatively: the pigeonhole principle still applies and the proof still stands. It might be a good idea to state "at most" when discussing the (2^N-1) number, though, and it might also be clarifying to explicitly exclude the empty string from inputs, because its shortening is obviously impossible. (The counts still hold, though: there's 2^0 = 1 bit string of length 0, and there are 2^0-1 = 0 bit strings of length -1.) Digwuren 18:57, 28 August 2007 (UTC)
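To make the pigeonhole point concrete, here is a minimal sketch (again an added illustration): it runs a hypothetical stand-in "compressor" over every 4-bit string and reports the first collision, which the counting guarantees must occur for any function whose output is always shorter than its input:

```python
from itertools import product

N = 4

def compress(s):
    # A deliberately silly stand-in "compressor": drop the last bit.
    # Any always-shortening function would collide the same way,
    # because 2**N inputs cannot fit into at most 2**N - 1 outputs.
    return s[:-1]

seen = {}
for bits in product('01', repeat=N):
    s = ''.join(bits)
    c = compress(s)
    if c in seen:
        print(f'collision: {seen[c]} and {s} both map to {c!r}')
        break
    seen[c] = s
```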
- Sure, though it's an interesting concept. It would certainly be an impressive algorithm that managed to compress things down to empty files! I suppose I'm thinking too literally, including the file descriptor in its length, and then worrying about stringing empty files together ;).
- The special case of an empty string probably doesn't need mentioning - after all, the algorithm isn't magic if it only works on one specific input, and the impossibility of a sequence of -1 bits is self-evident. Interesting article, btw. Modest Genius talk 19:22, 28 August 2007 (UTC)
- The key problem with the text in the article is that it assumes the input sequence is exactly N bits long but the output may be any length up to N bits. If you know your input sequence is a fixed length, you consider it acceptable to permit a zero-length output, and your granularity is bits, then you can indeed compress any input. Plugwash 22:18, 17 September 2007 (UTC)
- Not even then. You have 2^N different inputs of length N, but only 2^N-1 different possible outputs of length strictly less than N. –Henning Makholm 01:27, 18 September 2007 (UTC)
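The counts behind this reply can be checked directly; a short sketch, again with an arbitrary N = 4:

```python
# Strings of length strictly less than N, empty string included,
# number 2**0 + 2**1 + ... + 2**(N-1) = 2**N - 1 -- one fewer than
# the 2**N inputs of length exactly N.
N = 4
shorter = sum(2**k for k in range(N))  # k = 0 counts the empty string
print(2**N, shorter)                   # prints: 16 15
```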
Suggested merge
It seems to me that this article would make better sense as a section in the lossless data compression article. The merge-proposal templates seem to want the discussion to take place on the talk page of the proposed merge target, so please go there and leave your opinion. –Henning Makholm 20:40, 13 September 2007 (UTC)
Description
In reference to [1], I think the article was better with the maths bits in it, and that the description was easier for a layman to understand beforehand. Comments? Modest Genius talk 21:21, 18 September 2007 (UTC)
Time
I think it's quite amusing that, while Wikipedia does not allow unreferenced information, theories, etc., it does allow these types of pages. There are algorithms that actually work on maximally random sets, or at least one such algorithm.
As always, 'magic' is just a word for extremely advanced technology, sometimes produced naturally through twists of fate. Much as Wikipedia is not for unreferenced theories, I think this 'magic compression algorithm' page is similarly that: a theory with a faulty reference.
The only caveat is that there is a minimal informational density required for the algorithm to function. I've read the supposed 'proof' of the counterargument against any 'magic' method, but the fact remains: binary is the worst way to lay out data in 2D that one wants to compress. So making a 'counting argument' against the string sets in a recursive/folding methodology simply isn't valid; many of the resultant strings are indeed doubled up, since they are phasic to the wholly compressed signal. You have to use time to understand time.
I suggest you remove this (primitive/offensive) page. Kudos to the referencers, of course, for sourcing the info for this page, but it has to be valid.