Image compression

From Wikipedia, the free encyclopedia

Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Any image compression technique must reduce the amount of data needed to represent an image while ensuring that the compressed image preserves the important information of the original. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.[1]

An image is a two-dimensional signal that carries the information needed to represent something visually. To allow digital processing, storage and transmission, the analog image signal has to be converted into digital form. A digital image is made up of a finite number of pixels arranged as a two-dimensional array.

A pixel is the basic unit of a digital image. It is a point in the image that carries information about its coordinates and its color. Depending on the image processing system, pixels can take different forms, such as dots or squares. An image with more pixels contains more information and therefore appears more accurate and detailed. The color of a pixel is represented differently in different color systems, each following its own standard.

There are various color systems; RGB and grayscale are commonly used. In RGB, each pixel has separate intensities of red, green and blue light, so an RGB image is composed of three matrices, each containing the brightness values of the corresponding color channel for every pixel. For example, an image with larger values in the red matrix usually appears reddish. In a grayscale image, the color information is the relative intensity between black and white. The bit depth determines how many intensity levels are available between absolute black and white, so an image with a larger bit depth has a richer gradation of tones.
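
As a rough illustration of how an RGB image decomposes into three matrices and how bit depth limits gradation, the following is a minimal sketch assuming NumPy; the 0.299/0.587/0.114 grayscale weights are the common BT.601 luma coefficients, an assumption not prescribed above.

```python
import numpy as np

# A tiny 2x2 RGB image: three 2x2 matrices (red, green, blue) of 8-bit values.
red   = np.array([[255, 200], [180, 120]], dtype=np.uint8)
green = np.array([[ 10,  40], [ 60,  90]], dtype=np.uint8)
blue  = np.array([[  5,  20], [ 30,  50]], dtype=np.uint8)
rgb = np.stack([red, green, blue], axis=-1)   # shape (2, 2, 3): one value per channel per pixel

# Grayscale: a single matrix of relative intensities between black and white.
# The BT.601 weights below are one common choice (assumed here for illustration).
gray = (0.299 * red + 0.587 * green + 0.114 * blue).astype(np.uint8)

# Reducing the bit depth from 8 bits (256 levels) to 4 bits (16 levels)
# leaves fewer gradations between absolute black and white.
gray_4bit = (gray >> 4) << 4

print(rgb.shape)
print(gray)
print(gray_4bit)
```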

As mentioned above, the purpose of image compression is to reduce the data redundancies of an image and thereby reduce its amount of data. Several kinds of data redundancy are defined and exploited in image compression; coding redundancy, inter-pixel redundancy and perceptual redundancy are the three basic ones. Inter-pixel redundancy occurs when pixels are highly correlated with their adjacent or nearby pixels, so that their values can be predicted. Perceptual redundancy is a result of the subjective nature of human vision: neighbouring pixels with a high degree of similarity usually show no noticeable difference to the human eye.

Block diagram of a general image compression system

The general structure of an image compression system is mainly composed of an encoder and a decoder. The encoder takes an image as input and converts it into a set of symbols that represent it. The compression ratio, CR, is defined as the amount of data in the original image fed into the encoder divided by the amount of data in the encoded image. A higher compression ratio indicates that the encoded image occupies less storage than the original image.
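
As a simple worked example of this definition, the sketch below computes the compression ratio from bit counts (the 512×512 figure is only an illustrative assumption):

```python
def compression_ratio(original_bits: int, encoded_bits: int) -> float:
    """CR = amount of data in the original image / amount of data in the encoded image."""
    return original_bits / encoded_bits

# A 512x512, 8-bit grayscale image holds 512 * 512 * 8 = 2,097,152 bits.
# If the encoder represents it with 262,144 bits, CR = 8.0: the encoded image
# needs one eighth of the storage of the original.
print(compression_ratio(512 * 512 * 8, 262_144))  # 8.0
```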

The image encoder consists of a mapper, a quantizer and a symbol coder. First, the mapper applies a transformation to the input image that reduces its inter-pixel redundancies. Next, the transformed image is processed by the quantizer, which lowers the accuracy of the representation and thereby discards less important information. Finally, the symbol coder generates a code, typically assigning shorter code words to more frequent values, to represent the quantizer output.
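
The following is a minimal, illustrative sketch of this mapper → quantizer → symbol coder chain, assuming NumPy. It uses a horizontal difference transform as the mapper, a uniform scalar quantizer, and estimates the symbol coder's output size from the empirical entropy of the quantized values; none of these specific choices are prescribed by the description above.

```python
import numpy as np

def encode_pipeline(image: np.ndarray, step: int = 8):
    """Toy image encoder: mapper -> quantizer -> symbol coder (size estimate only)."""
    img = image.astype(np.int32)

    # Mapper: horizontal differences reduce inter-pixel redundancy in smooth regions.
    padded = np.concatenate([np.zeros((img.shape[0], 1), dtype=np.int32), img], axis=1)
    mapped = np.diff(padded, axis=1)

    # Quantizer: uniform scalar quantization lowers the accuracy of the representation.
    quantized = np.round(mapped / step).astype(np.int32)

    # Symbol coder: estimate the coded size from the empirical entropy of the symbols;
    # an ideal entropy coder would approach this number of bits.
    _, counts = np.unique(quantized, return_counts=True)
    probs = counts / counts.sum()
    bits_per_symbol = -(probs * np.log2(probs)).sum()
    return quantized, bits_per_symbol * quantized.size

rng = np.random.default_rng(0)
smooth = (rng.integers(0, 8, size=(64, 64)).cumsum(axis=1) % 256).astype(np.uint8)  # correlated rows
_, coded_bits = encode_pipeline(smooth)
print(f"~{coded_bits / 8:.0f} bytes estimated vs {smooth.size} bytes raw")
```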

There are several advantages of image compression:

  1. The reduced amount of image data shortens transmission time and lowers network usage when transferring images.
  2. A higher compression ratio reduces storage consumption, which in turn lowers the cost of storage devices and hardware.
  3. Fewer transmission errors can occur because fewer bits need to be transferred after an image has been compressed.
Comparison of JPEG images saved by Adobe Photoshop at different quality levels and with or without "save for web"

Lossy and lossless image compression

Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. Lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images such as photographs in applications where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces negligible differences may be called visually lossless.

Methods for lossy compression:

  • Transform coding – the most commonly used method, applying a Fourier-related transform such as the discrete cosine transform (DCT)[2] or the wavelet transform
  • Color quantization – reducing the color space to a small number of representative colors in the image
  • Chroma subsampling – taking advantage of the fact that the human eye perceives spatial changes of brightness more sharply than those of color
  • Fractal compression
  • Machine-learning approaches based on neural networks[3][4] or diffusion models[6]

Methods for lossless compression:

  • Run-length encoding – used as the default method in PCX and as one of the possible methods in BMP, TGA and TIFF
    • Operates on runs of data, where a run is a sequence in which the same data value occurs consecutively; each run is stored as a single value together with its count, as sketched in the example after this list.[5]
  • Area image compression
  • Predictive coding – used in DPCM
    • Predicts each pixel value from the values of nearby pixels; the prediction error is encoded and added back to the prediction so that the image can be reconstructed exactly (see the DPCM sketch after this list).[5]
  • Entropy encoding – the two most common entropy encoding techniques are arithmetic coding and Huffman coding
    • Assigns shorter code words to symbols that occur frequently and longer code words to rare symbols, so that the average number of bits per symbol approaches the entropy of the source (see the Huffman sketch after this list).
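
As a minimal sketch of run-length encoding (independent of any particular file format such as PCX or BMP), the following stores a row of pixel values as (value, run length) pairs and reconstructs it exactly:

```python
def rle_encode(row):
    """Run-length encode a sequence as (value, run length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]

def rle_decode(runs):
    """Rebuild the original sequence from (value, run length) pairs."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [255, 255, 255, 255, 0, 0, 17, 17, 17]
encoded = rle_encode(row)              # [(255, 4), (0, 2), (17, 3)]
assert rle_decode(encoded) == row      # lossless: the row is recovered exactly
```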
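
A similarly minimal sketch of lossless predictive coding in the spirit of DPCM, assuming NumPy and using the left neighbour as the predictor (only one of many possible predictors):

```python
import numpy as np

def dpcm_encode(row: np.ndarray) -> np.ndarray:
    """Replace each pixel by its prediction error against the previous pixel."""
    prediction = np.concatenate([[0], row[:-1].astype(np.int16)])  # predict from the left neighbour
    return row.astype(np.int16) - prediction                       # residuals are small and easy to code

def dpcm_decode(residuals: np.ndarray) -> np.ndarray:
    """Add each prediction error back to the running prediction to rebuild the row."""
    return np.cumsum(residuals).astype(np.uint8)

row = np.array([100, 101, 103, 103, 90], dtype=np.uint8)
residuals = dpcm_encode(row)                          # [100, 1, 2, 0, -13]
assert np.array_equal(dpcm_decode(residuals), row)    # reconstruction is exact
```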
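
And a minimal sketch of Huffman coding, one of the two entropy coding techniques named above, which repeatedly merges the two least frequent symbols so that frequent values receive shorter bit strings (arithmetic coding is not shown):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code: frequent symbols get short bit strings, rare ones long."""
    heap = [[count, [symbol, ""]] for symbol, count in Counter(data).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)            # the two least frequent subtrees...
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]         # ...are merged, extending their code words
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict((symbol, code) for symbol, code in heap[0][1:])

pixels = [7, 7, 7, 7, 7, 7, 3, 3, 3, 0]     # 7 is common, 0 is rare
codes = huffman_codes(pixels)               # e.g. {7: '1', 3: '01', 0: '00'}
encoded = "".join(codes[p] for p in pixels)
print(len(encoded), "bits instead of", 8 * len(pixels))   # 14 bits instead of 80
```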

Other properties

The best image quality at a given compression rate (or bit rate) is the main goal of image compression, however, there are other important properties of image compression schemes:

Scalability generally refers to a quality reduction achieved by manipulation of the bitstream or file (without decompression and re-compression). Other names for scalability are progressive coding or embedded bitstreams. Despite its contrary nature, scalability also may be found in lossless codecs, usually in the form of coarse-to-fine pixel scans. Scalability is especially useful for previewing images while downloading them (e.g., in a web browser) or for providing variable quality access to e.g., databases. There are several types of scalability:

  • Quality progressive or layer progressive: The bitstream successively refines the reconstructed image.
  • Resolution progressive: First encode a lower image resolution; then encode the difference to higher resolutions (a minimal sketch follows this list).[7][8]
  • Component progressive: First encode a grey-scale version; then add full color.
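
A minimal sketch of the resolution-progressive idea, assuming NumPy and simple 2× downsampling with nearest-neighbour upsampling (real codecs such as JPEG 2000 use more sophisticated filters):

```python
import numpy as np

def resolution_progressive_encode(image: np.ndarray):
    """Split an image into a coarse base layer plus a refinement (difference) layer."""
    base = image[::2, ::2]                                  # lower-resolution version, sent first
    upsampled = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)
    detail = image.astype(np.int16) - upsampled             # difference up to full resolution
    return base, detail

def resolution_progressive_decode(base: np.ndarray, detail: np.ndarray) -> np.ndarray:
    """A receiver can display `base` immediately and refine it once `detail` arrives."""
    upsampled = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)
    return (upsampled.astype(np.int16) + detail).astype(np.uint8)

image = np.random.default_rng(1).integers(0, 256, size=(4, 4), dtype=np.uint8)
base, detail = resolution_progressive_encode(image)
assert np.array_equal(resolution_progressive_decode(base, detail), image)
```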

Region of interest coding. Certain parts of the image are encoded with higher quality than others. This may be combined with scalability (encode these parts first, others later).

Meta information. Compressed data may contain information about the image which may be used to categorize, search, or browse images. Such information may include color and texture statistics, small preview images, and author or copyright information.

Processing power. Compression algorithms require different amounts of processing power to encode and decode. Some high compression algorithms require high processing power.

The quality of a compression method is often measured by the peak signal-to-noise ratio (PSNR), which measures the amount of noise introduced through a lossy compression of the image. However, the subjective judgment of the viewer is also regarded as an important measure, perhaps the most important one.
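
As an illustration, PSNR is conventionally computed from the mean squared error between the original and the compressed image; a minimal sketch assuming NumPy and 8-bit images (peak value 255):

```python
import numpy as np

def psnr(original: np.ndarray, compressed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((original.astype(np.float64) - compressed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")      # identical images: no noise was introduced
    return 10.0 * np.log10(peak ** 2 / mse)

# Higher values indicate less distortion; for 8-bit images, typical lossy
# settings often land roughly in the 30-50 dB range.
```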

History

Entropy coding started in the late 1940s with the introduction of Shannon–Fano coding,[9] the basis for Huffman coding which was published in 1952.[10] Transform coding dates back to the late 1960s, with the introduction of fast Fourier transform (FFT) coding in 1968 and the Hadamard transform in 1969.[11]

An important development in image data compression was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1973.[12] JPEG was introduced by the Joint Photographic Experts Group (JPEG) in 1992.[13] JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format.[14] JPEG was largely responsible for the wide proliferation of digital images and digital photos,[15] with several billion JPEG images produced every day as of 2015.[16]

Lempel–Ziv–Welch (LZW) is a lossless compression algorithm developed by Abraham Lempel, Jacob Ziv and Terry Welch in 1984. It is used in the GIF format, introduced in 1987.[17] DEFLATE, a lossless compression algorithm developed by Phil Katz and specified in 1996, is used in the Portable Network Graphics (PNG) format.[18]

The JPEG 2000 standard was developed from 1997 to 2000 by a JPEG committee chaired by Touradj Ebrahimi (later the JPEG president).[19] In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses discrete wavelet transform (DWT) algorithms. It uses the CDF 9/7 wavelet transform (developed by Ingrid Daubechies in 1992) for its lossy compression algorithm,[20] and the Le Gall–Tabatabai (LGT) 5/3 wavelet transform[21][22] (developed by Didier Le Gall and Ali J. Tabatabai in 1988)[23] for its lossless compression algorithm.[20] JPEG 2000 technology, which includes the Motion JPEG 2000 extension, was selected as the video coding standard for digital cinema in 2004.[24]

Applications

Because of the cost of transmitting and storing data, image compression is an essential technique in real-life applications. For example, broadcasting takes advantage of image compression to reduce the size of every frame while keeping the quality as close as possible to that of the original images. Medical imaging, magnetic resonance imaging (MRI), radar systems and sonar systems also make use of image compression.

Notes and references

  1. ^ "Image Data Compression".
  2. ^ Nasir Ahmed, T. Natarajan and K. R. Rao, "Discrete Cosine Transform Archived 2011-11-25 at the Wayback Machine," IEEE Trans. Computers, 90–93, Jan. 1974.
  3. ^ Gilad David Maayan (Nov 24, 2021). "AI-Based Image Compression: The State of the Art". Towards Data Science. Retrieved 6 April 2023.
  4. ^ "High-Fidelity Generative Image Compression". Retrieved 6 April 2023.
  5. ^ a b "Image Compression: ML Techniques and Applications". OpenGenus IQ: Computing Expertise & Legacy. 2021-12-26. Retrieved 2023-06-20.
  6. ^ Bühlmann, Matthias (2022-09-28). "Stable Diffusion Based Image Compression". Medium. Retrieved 2022-11-02.
  7. ^ Burt, P.; Adelson, E. (1 April 1983). "The Laplacian Pyramid as a Compact Image Code". IEEE Transactions on Communications. 31 (4): 532–540. CiteSeerX 10.1.1.54.299. doi:10.1109/TCOM.1983.1095851. S2CID 8018433.
  8. ^ Shao, Dan; Kropatsch, Walter G. (February 3–5, 2010). Špaček, Libor; Franc, Vojtěch (eds.). "Irregular Laplacian Graph Pyramid" (PDF). Computer Vision Winter Workshop 2010. Nové Hrady, Czech Republic: Czech Pattern Recognition Society. Archived (PDF) from the original on 2013-05-27.
  9. ^ Claude Elwood Shannon (1948). Alcatel-Lucent (ed.). "A Mathematical Theory of Communication" (PDF). Bell System Technical Journal. 27 (3–4): 379–423, 623–656. doi:10.1002/j.1538-7305.1948.tb01338.x. hdl:11858/00-001M-0000-002C-4314-2. Archived (PDF) from the original on 2011-05-24. Retrieved 2019-04-21.
  10. ^ David Albert Huffman (September 1952), "A method for the construction of minimum-redundancy codes" (PDF), Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, doi:10.1109/JRPROC.1952.273898, archived (PDF) from the original on 2005-10-08
  11. ^ William K. Pratt, Julius Kane, Harry C. Andrews: "Hadamard transform image coding", in Proceedings of the IEEE 57.1 (1969): pp. 58–68
  12. ^ Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z.
  13. ^ "T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES" (PDF). CCITT. September 1992. Archived (PDF) from the original on 2000-08-18. Retrieved 12 July 2019.
  14. ^ "The JPEG image format explained". BT.com. BT Group. 31 May 2018. Retrieved 5 August 2019.
  15. ^ "What Is a JPEG? The Invisible Object You See Every Day". The Atlantic. 24 September 2013. Retrieved 13 September 2019.
  16. ^ Baraniuk, Chris (15 October 2015). "Copy protections could come to JPEGs". BBC News. BBC. Retrieved 13 September 2019.
  17. ^ "The GIF Controversy: A Software Developer's Perspective". 27 January 1995. Retrieved 26 May 2015.
  18. ^ L. Peter Deutsch (May 1996). DEFLATE Compressed Data Format Specification version 1.3. IETF. p. 1. sec. Abstract. doi:10.17487/RFC1951. RFC 1951. Retrieved 2014-04-23.
  19. ^ Taubman, David; Marcellin, Michael (2012). JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice. Springer Science & Business Media. ISBN 9781461507994.
  20. ^ a b Unser, M.; Blu, T. (2003). "Mathematical properties of the JPEG2000 wavelet filters" (PDF). IEEE Transactions on Image Processing. 12 (9): 1080–1090. Bibcode:2003ITIP...12.1080U. doi:10.1109/TIP.2003.812329. PMID 18237979. S2CID 2765169. Archived from the original (PDF) on 2019-10-13.
  21. ^ Sullivan, Gary (8–12 December 2003). "General characteristics and design considerations for temporal subband video coding". ITU-T. Video Coding Experts Group. Retrieved 13 September 2019.
  22. ^ Bovik, Alan C. (2009). The Essential Guide to Video Processing. Academic Press. p. 355. ISBN 9780080922508.
  23. ^ Le Gall, Didier; Tabatabai, Ali J. (1988). "Sub-band coding of digital images using symmetric short kernel filters and arithmetic coding techniques". ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing: 761–764 vol.2. doi:10.1109/ICASSP.1988.196696. S2CID 109186495.
  24. ^ Swartz, Charles S. (2005). Understanding Digital Cinema: A Professional Handbook. Taylor & Francis. p. 147. ISBN 9780240806174.