
Data compression ratio

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by AnomieBOT at 05:53, 19 July 2011 (Dating maintenance tags: {{Disputed}}). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Data compression ratio, also known as compression power, is a computer-science term used to quantify the reduction in data-representation size produced by a data compression algorithm. The data compression ratio is analogous to the physical compression ratio used to measure physical compression of substances, and is defined in the same way, as the ratio between the compressed size and the uncompressed size [1]:

$$\text{Compression Ratio} = \frac{\text{Compressed Size}}{\text{Uncompressed Size}}$$

Thus a representation that compresses a 10 MB file to 2 MB has a compression ratio of 2/10 = 0.2, often notated as an explicit ratio, 1:5 (read "one to five"), or as an implicit ratio, 1/5. Note that this formulation applies equally to compression, where the uncompressed size is that of the original, and to decompression, where the uncompressed size is that of the reproduction.

Sometimes the space savings is given instead, which is defined as the reduction in size relative to the uncompressed size:

$$\text{Space Savings} = 1 - \frac{\text{Compressed Size}}{\text{Uncompressed Size}}$$

Thus a representation that compresses a 10 MB file to 2 MB would yield a space savings of 1 - 2/10 = 0.8, often notated as a percentage, 80%.
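
The two definitions above can be checked with a few lines of Python. This is only a minimal sketch: the function names `compression_ratio` and `space_savings` are illustrative choices, not part of any library, and sizes may be given in any unit as long as both arguments use the same one.

```python
def compression_ratio(compressed_size: float, uncompressed_size: float) -> float:
    """Compressed size divided by uncompressed size (this article's convention)."""
    if uncompressed_size <= 0:
        raise ValueError("uncompressed_size must be positive")
    return compressed_size / uncompressed_size


def space_savings(compressed_size: float, uncompressed_size: float) -> float:
    """Reduction in size relative to the uncompressed size."""
    return 1.0 - compression_ratio(compressed_size, uncompressed_size)


# The 10 MB -> 2 MB example from the text (both sizes in MB).
ratio = compression_ratio(2, 10)
savings = space_savings(2, 10)
print(f"ratio = {ratio:.1f}, savings = {savings:.0%}")  # ratio = 0.2, savings = 80%
```

Because the ratio is dimensionless, the same functions work for bytes, megabytes, or any other consistent unit.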

For signals of indefinite size, such as streaming audio and video, the compression ratio is defined in terms of uncompressed and compressed data rates instead of data sizes:

$$\text{Compression Ratio} = \frac{\text{Compressed Data Rate}}{\text{Uncompressed Data Rate}}$$

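and instead of space savings, one speaks of data-rate savings, which is defined as the data-rate reduction relative to the uncompressed data rate:

$$\text{Data Rate Savings} = 1 - \frac{\text{Compressed Data Rate}}{\text{Uncompressed Data Rate}}$$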

For example, uncompressed songs in CD format have a data rate of 16 bits/channel × 2 channels × 44.1 kHz ≈ 1.4 Mbit/s, whereas AAC files on an iPod are typically compressed to 128 kbit/s, yielding a compression ratio of 0.09, for a data-rate savings of 0.91, or 91%.
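
The arithmetic in this example can be reproduced as follows. This is a rough sketch that assumes 1 kbit/s means 1,000 bit/s (decimal prefixes); the constant and variable names are illustrative.

```python
# CD audio parameters quoted in the text.
BITS_PER_SAMPLE = 16      # bits per channel per sample
CHANNELS = 2              # stereo
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz

# Uncompressed data rate in bit/s: 16 * 2 * 44100 = 1,411,200.
cd_rate = BITS_PER_SAMPLE * CHANNELS * SAMPLE_RATE_HZ

# Typical AAC rate from the text; assumes 1 kbit/s = 1000 bit/s.
aac_rate = 128_000

rate_ratio = aac_rate / cd_rate   # compressed rate / uncompressed rate
rate_savings = 1 - rate_ratio     # data-rate savings

print(f"{cd_rate / 1e6:.2f} Mbit/s, ratio {rate_ratio:.2f}, savings {rate_savings:.0%}")
# 1.41 Mbit/s, ratio 0.09, savings 91%
```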

When the uncompressed data rate is known, the compression ratio can be inferred from the compressed data rate.

Note: There is some confusion about the term 'compression ratio', particularly outside academia and commerce. In particular, some authors use the term 'compression ratio' to mean 'space savings', even though the latter is not a ratio; and others use the term 'compression ratio' to mean its inverse, even though that equates higher compression ratio with lower compression.
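
To make the three competing usages concrete, the sketch below converts a ratio expressed in this article's convention (compressed size over uncompressed size) into the other two quantities. The helper names are hypothetical and chosen only for illustration.

```python
def to_space_savings(ratio: float) -> float:
    """Space savings corresponding to a compressed/uncompressed ratio."""
    return 1.0 - ratio


def to_inverse_ratio(ratio: float) -> float:
    """Uncompressed/compressed ratio, the inverse usage mentioned in the note."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio


ratio = 0.2  # the 10 MB -> 2 MB example, in this article's convention
print(f"ratio {ratio}, savings {to_space_savings(ratio):.0%}, "
      f"inverse {to_inverse_ratio(ratio):.0f}:1")
# ratio 0.2, savings 80%, inverse 5:1
```

The same file can therefore be reported as "0.2", "80%", or "5:1", so a quoted figure is only meaningful once the convention is known.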

Lossless compression of digitized data such as video, digitized film, and audio preserves all the information, but can rarely do much better than 1:2 compression because of the intrinsic entropy of the data. In contrast, lossy compression (for example, JPEG or MP3) can achieve much greater compression at the cost of reduced quality, because discarding information introduces visual or audio compression artifacts.
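
The dependence of lossless compression on the entropy of the input can be observed with Python's standard `zlib` module. The sketch below is an illustration, not a benchmark: it uses synthetic byte strings as stand-ins (pseudo-random bytes for high-entropy data and a repeated pattern for low-entropy data) rather than real audio or video, and `measured_ratio` is an illustrative helper name.

```python
import random
import zlib


def measured_ratio(data: bytes) -> float:
    """Compressed size / uncompressed size using DEFLATE at the highest level."""
    return len(zlib.compress(data, 9)) / len(data)


# High-entropy stand-in: 100,000 pseudo-random bytes (seeded for repeatability).
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(100_000))

# Low-entropy stand-in: an 8-byte pattern repeated to the same total length.
repetitive = b"abcdefgh" * 12_500

noisy_ratio = measured_ratio(noisy)
repetitive_ratio = measured_ratio(repetitive)

print(f"pseudo-random data: ratio {noisy_ratio:.3f}")
print(f"repetitive data:    ratio {repetitive_ratio:.4f}")
```

The pseudo-random input should come out at a ratio of about 1 (essentially no reduction), while the repetitive input shrinks to a small fraction of its original size. Real digitized media usually falls between these two extremes.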

References

  1. Salomon, David (with contributions by Giovanni Motta and David Bryant). Data Compression: The Complete Reference, 4th edition. Springer, December 2006. ISBN 1-84628-602-6.