Talk:LZ4 (compression algorithm)
Computing: Start‑class, Mid‑importance
Removals
- [Copied from User talk:Intgr#Deletion from LZ4]
This is a small matter. As it happens, I don't support your removal of the statement "In a worst-case scenario, incompressible data gets increased by 0.4%." (and sketchy source) from the LZ4 article.
By design, quoting an entire incompressible byte stream adds an overhead of one 0xFF length byte for every 255 bytes of uncompressed data, i.e. 1/255 ≈ 0.4%. It's slightly more original research (OR) than making an unsourced claim that wrapping quote marks around an N character string increases the representation to N+2 characters.
By taking this statement out, you are catering to the common misunderstanding that there is such a thing as a compression algorithm which only ever makes the object smaller. By far this is the more severe of the two evils.
I also don't support this removal (one-time-only IP editor):
https://en.wikipedia.org/w/index.php?title=LZ4_%28compression_algorithm%29&diff=prev&oldid=653088035
Your call on both issues. I'm not going to wade in with my own edits. — MaxEnt 03:13, 20 April 2015 (UTC)
- @MaxEnt: I copied this discussion here from my talk page so other people interested in the article can chime in and/or improve it.
- The first edit that MaxEnt is talking about is this. Admittedly I didn't do a good job at explaining that change in the edit comment.
- I mainly object to the material removed in both edits on sourcing grounds; if there were good sources supporting the claims, I would be all for adding them back. But usually blogs are not considered reliable sources.
- I also tried repeating this on my own and could not reproduce the 0.4% overhead. I tried tons of times, but always arrived at the same result:
    % dd if=/dev/urandom bs=100000 count=1 |lz4 |wc -c
    1+0 records in
    1+0 records out
    100000 bytes (100 kB) copied, 0.00602624 s, 16.6 MB/s
    100019
- Am I missing something? That's a 0.02% overhead, and it's even lower for larger input sizes. -- intgr [talk] 08:53, 20 April 2015 (UTC)
- Dear intgr, what you are missing is that you measured 0.02% for one specific case. That specific case is not the "worst-case scenario". --DavidCary (talk) 09:13, 30 May 2015 (UTC)
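- The arithmetic behind the 0.4% figure can be sketched as below. This is a rough illustration, not a measurement: it assumes the LZ4 block format's literal-length encoding (a 4-bit length field in the one-byte token, followed by extra length bytes that each add up to 255) and assumes the whole input ends up in a single literals-only sequence. The function names are made up for this example and are not part of any LZ4 library.

```python
def literal_only_block_size(n: int) -> int:
    """Bytes needed to store n input bytes as one literals-only LZ4 sequence.

    Assumption: 1 token byte whose 4-bit literal-length field holds up to 15;
    when the field is 15, extra bytes follow, each adding 0..255, and a byte
    of 255 means another length byte follows.
    """
    token = 1
    if n < 15:
        extra_length_bytes = 0
    else:
        # One extra byte per full 255 still to encode, plus the terminating byte.
        extra_length_bytes = (n - 15) // 255 + 1
    return token + extra_length_bytes + n


def overhead_percent(n: int) -> float:
    """Relative size increase, in percent, for n incompressible bytes."""
    return 100.0 * (literal_only_block_size(n) - n) / n


if __name__ == "__main__":
    for size in (1_000, 100_000, 10_000_000):
        print(size, literal_only_block_size(size), round(overhead_percent(size), 3))
```

- Under these assumptions the overhead approaches 1/255, about 0.39%, as the input grows, which is where the rounded 0.4% comes from. The 19-byte difference in the `dd | lz4` test above would be consistent with the command-line tool storing an incompressible block uncompressed inside its frame format, but that reading is an assumption and was not verified here.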