Perceptual coding

Introduction:

   Perceptual audio coding is a method of encoding audio that uses psychoacoustic models to discard data that humans are not expected to perceive.

Perceptual Coding

Perceptual audio coding is a method of compressing digital sound that eliminates frequencies the human ear cannot hear. It may also eliminate softer sounds that are drowned out by louder ones, taking advantage of an effect known as masking.^[1] The reason to use perceptual audio coding is to reduce file size, since high-quality uncompressed audio can be very large. Perceptual audio coding is a form of lossy audio compression, so called because it “loses” pieces of the sound (the ones removed because we cannot hear them). The other type of compression is lossless, which replaces repetitive information with shorter codes so that it takes up less space, while still allowing an exact copy of the original to be rebuilt. Even though lossless compression may seem better because it keeps all the information, it does not compress as well as lossy compression does.^[2]

Now let’s back up for a moment. Sound does not start off as digital; it is analog. To use perceptual audio coding at all, the audio must first be converted to digital form. An audio codec is used to change analog sound into pulse-code modulation (PCM), a digital representation of the analog signal. A few factors determine how closely the digital representation matches the analog original, and the higher these factors are set, the bigger the file will be. The first factor is the sampling rate, which determines how often the waveform is measured. The second factor is the sampling resolution, which is the scale on which each sample point is measured; the higher the resolution, the more measurement points the scale has. Each measurement is then rounded to the nearest whole number on that scale, a step known as quantization.^[3] Lowering these factors reduces the file size, though sound quality will be affected.
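The effect of sampling rate and sampling resolution on file size, and the rounding done by quantization, can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular codec: the function names are hypothetical, samples are assumed to lie between -1.0 and 1.0, and the size counts raw PCM data only (no file header).

```python
import math

def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of raw, uncompressed PCM data in bytes (header not included)."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0] to the nearest whole number on a signed scale."""
    max_level = 2 ** (bits - 1) - 1  # e.g. 32767 for 16-bit resolution
    return round(sample * max_level)

def sample_sine(freq_hz, sample_rate_hz, count, bits):
    """Measure a sine wave `count` times at the given sampling rate, then quantize."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate_hz), bits)
        for n in range(count)
    ]

# One minute of stereo audio at 44,100 samples per second and 16 bits per sample.
print(pcm_size_bytes(44100, 16, 2, 60))  # 10584000

# Halving the sampling rate halves the amount of data.
print(pcm_size_bytes(22050, 16, 2, 60))  # 5292000

# The first few quantized measurements of a 1 kHz tone sampled at 8 kHz.
print(sample_sine(1000, 8000, 4, 16))
```

Raising either the sampling rate or the resolution increases the size in direct proportion, which is why lowering them is the simplest (if crude) way to shrink a file.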

Critical Bands

Masking

Masking, as defined by the American Standards Association (ASA), is the amount (or the process) by which the threshold of audibility for one sound is raised by the presence of another (masking) sound (B.C.J. Moore 1982, p. 74).
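The definition above can be turned into a small worked example. The sketch below is only illustrative: the function names are hypothetical and the decibel figures are made-up values, not measurements or the output of a psychoacoustic model.

```python
def masking_amount_db(quiet_threshold_db, masked_threshold_db):
    """Amount of masking: how far the threshold of audibility has been raised."""
    return masked_threshold_db - quiet_threshold_db

def is_audible(level_db, threshold_db):
    """A sound is treated as audible only when it exceeds the current threshold."""
    return level_db > threshold_db

# Illustrative numbers only (assumptions, not measured data).
quiet_threshold = 5    # threshold for the soft tone with no other sound present
masked_threshold = 40  # threshold for the same tone while a louder sound plays
soft_tone_level = 30   # level of the soft tone

print(masking_amount_db(quiet_threshold, masked_threshold))  # 35
print(is_audible(soft_tone_level, quiet_threshold))          # True
print(is_audible(soft_tone_level, masked_threshold))         # False
```

In this example the soft tone is audible on its own but falls below the raised threshold once the louder sound is present, which is the situation a perceptual coder exploits.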

Lossless versus Lossy Coding

As stated earlier, lossless encoding does not compress as much as lossy encoding does. Audio compression is performed by compression algorithms implemented in small pieces of software called "codecs" (COder/DECoders^[4]). Codecs come in many different forms and can use vastly different compression methods, even for similar formats. Using a lossless codec maintains audio fidelity and allows an exact digital copy of the original material to be recovered. This results in larger file sizes, but it is the standard for many audio professionals.

Lossy formats are preferred by the average user who wants to store larger libraries of audio files on PCs and personal audio devices and who does not mind a slight loss of quality. Lossy algorithms pair a compression step with a decompression step, and the audio that comes out differs from the original file. Information that is judged inaudible or repetitive is removed in the compression process. This results in a vastly smaller file size but, depending on the compression ratio, may also result in degraded audio quality or audio "artifacts" (errors in audio files that cause discernible annoyances for the end user^[5]).
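The difference between the two approaches can be demonstrated with a short Python sketch. It rests on loud assumptions: `zlib` is a general-purpose lossless compressor standing in for a lossless audio codec, and dropping the low 8 bits of every sample is a deliberately crude lossy step, not a perceptual one. The test tone and variable names are hypothetical.

```python
import math
import struct
import zlib

# One second of a 440 Hz tone, sampled at 8 kHz with 16-bit resolution.
samples = [round(32767 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
pcm = struct.pack("<%dh" % len(samples), *samples)

# Lossless: the decompressed data is an exact copy of the original.
packed = zlib.compress(pcm)
restored = zlib.decompress(packed)
print(restored == pcm)  # True

# Crude lossy step (not perceptual): keep only the top 8 bits of each sample.
coarse = [s >> 8 for s in samples]
lossy_bytes = struct.pack("<%db" % len(coarse), *coarse)

# Decoding cannot bring the discarded bits back, so the copy is not exact.
rebuilt = [c << 8 for c in coarse]
max_error = max(abs(a - b) for a, b in zip(samples, rebuilt))

print(len(pcm), len(lossy_bytes))  # 16000 8000
print(rebuilt == samples)          # False
print(max_error <= 255)            # True
```

A real perceptual coder chooses what to discard using a psychoacoustic model rather than throwing away the same bits from every sample, but the trade-off is the same: the lossy version is smaller and cannot be restored exactly.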

Pulse-Code Modulation


References

[1] The Computer Language Company Inc. Pcmag.com encyclopedia (1981-2007). PC Magazine. Retrieved September 12, 2007 from http://www.pcmag.com/encyclopedia_term/0,2542,t=perceptual+audio+coding&i=49099,00.asp
[2] Fries, Bruce. A digital audio primer (March 2000). The MP3 and Internet Audio Handbook. Retrieved September 12, 2007 from http://www.teamcombooks.com/mp3handbook/11.htm
[3] The Computer Language Company Inc. Pcmag.com encyclopedia (1981-2007). PC Magazine. Retrieved September 12, 2007 from http://www.pcmag.com/encyclopedia_term/0,2542,t=sampling&i=50790,00.asp
[4] A Digital Audio Primer. Retrieved September 14, 2007 from http://www.teamcombooks.com/mp3handbook/11.htm
[5] Compression Artifact. Wikipedia. Retrieved September 16, 2007 from http://en.wikipedia.org/wiki/Compression_artifact
