Perceptual coding

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Solarcaine (talk | contribs) at 09:40, 16 March 2008 (moved Perceptual Audio Coding to Perceptual coding: more common name). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Introduction

Perceptual audio coding is a method of encoding audio that uses psychoacoustic models to discard data that human listeners cannot perceive.

Audio Encoding

Perceptual Coding

Perceptual audio coding is a compression method for digital sound that removes frequencies the human ear cannot hear. It may also remove softer sounds that are drowned out by louder sounds, an effect known as masking.[1] Perceptual audio coding reduces file size, which is desirable because uncompressed high-quality audio can be large. It is a kind of lossy audio compression because it "loses" the portions of the audio signal that we cannot hear. By contrast, lossless compression retains the entire audio signal but encodes repetitive parts more compactly to reduce storage space, so that decoding reproduces an exact copy of the original signal. Although lossless compression may seem better because it keeps all the information, it does not compress as well as lossy compression.[2]
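The difference between the two approaches can be illustrated with a minimal sketch. It assumes a synthetic 440 Hz test tone and uses Python's general-purpose zlib compressor as a stand-in for a lossless codec; the "lossy" step simply drops the low 8 bits of every sample, which is far cruder than a real perceptual coder. The function name make_pcm_tone and all parameter values are illustrative assumptions, not part of any audio standard.

```python
import math
import struct
import zlib

def make_pcm_tone(num_samples=8000, freq_hz=440.0, rate_hz=8000):
    """Build 16-bit little-endian PCM bytes for a sine tone (illustrative input only)."""
    samples = [
        int(16000 * math.sin(2 * math.pi * freq_hz * n / rate_hz))
        for n in range(num_samples)
    ]
    return struct.pack("<%dh" % num_samples, *samples)

original = make_pcm_tone()

# Lossless: compress, then decompress, and every byte comes back unchanged.
packed = zlib.compress(original, 9)
restored = zlib.decompress(packed)

# Lossy (crude stand-in): keep only the top 8 bits of each 16-bit sample.
# The discarded low bits cannot be recovered afterwards.
samples = list(struct.unpack("<%dh" % (len(original) // 2), original))
lossy_samples = [(s >> 8) << 8 for s in samples]

print("original bytes:  ", len(original))
print("lossless bytes:  ", len(packed))
print("exact round trip:", restored == original)
print("lossy equals original:", lossy_samples == samples)
```

A repetitive tone like this compresses unusually well without loss; real recordings are much less regular, which is one reason lossy methods usually achieve smaller files.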


Sampling Example

Sound is analog. Perceptual audio coding therefore requires conversion of an analog signal, using some kind of audio codec, into a digital representation. Several factors affect the fidelity with which a digital sample represents the original analog signal, and in each case there is a trade-off between fidelity and the size of the digital file. One factor is the sampling rate, which is simply how often the waveform is measured. Another factor is the sampling resolution, or the precision with which each waveform measurement is made; in making the digital representation, analog levels are quantized, that is, rounded to the closest value that the sampling resolution allows.[3] By adjusting these and other, more subtle, analog-to-digital conversion factors, file sizes can be reduced, though sound quality will be affected.
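The trade-off between sampling rate, sampling resolution and file size can be sketched as follows. This is a minimal illustration that assumes plain uncompressed PCM storage and a simple uniform quantizer; the function names and the test tone are hypothetical choices made for the example.

```python
import math

def pcm_size_bytes(duration_s, rate_hz, bits, channels=1):
    """Uncompressed PCM size: seconds x samples per second x bits x channels, in bytes."""
    return duration_s * rate_hz * bits * channels // 8

def quantize(x, bits):
    """Round x in [-1.0, 1.0) to the nearest of 2**bits evenly spaced levels."""
    half = 2 ** (bits - 1)
    code = max(-half, min(half - 1, round(x * half)))
    return code / half

def max_quantization_error(bits, rate_hz=8000, freq_hz=440.0, num_samples=2000):
    """Sample a 0.9-amplitude sine tone and report the worst rounding error."""
    worst = 0.0
    for n in range(num_samples):
        x = 0.9 * math.sin(2 * math.pi * freq_hz * n / rate_hz)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst

# One minute of stereo audio at a 44.1 kHz sampling rate and 16-bit resolution.
print(pcm_size_bytes(60, 44100, 16, channels=2))

# Halving the resolution halves the size but increases the rounding error.
for bits in (16, 8):
    print(bits, pcm_size_bytes(60, 44100, bits, channels=2), max_quantization_error(bits))
```

Lowering either the sampling rate or the resolution shrinks the file in direct proportion, while the worst-case rounding error roughly doubles for every bit of resolution removed.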

Digital Audio

Masking

Masking, as defined by the American Standards Association (ASA), is the amount (or the process) by which the threshold of audibility for one sound is raised by the presence of another (masking) sound (B.C.J. Moore 1982, p. 74).
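The idea of a raised threshold can be made concrete with a toy calculation. The sketch below is not a real psychoacoustic model: the 10 dB offset, the 15 dB-per-band slope and the notion of "distance in bands" are invented purely for illustration and are not taken from Moore or from any coding standard. It only shows the shape of the reasoning: a masker raises the threshold of audibility near its own frequency, and a sound below the raised threshold could be discarded.

```python
def raised_threshold_db(quiet_threshold_db, masker_level_db, distance_bands,
                        offset_db=10.0, slope_db_per_band=15.0):
    """Threshold of audibility in the presence of a masker (toy model, assumed numbers)."""
    masked = masker_level_db - offset_db - slope_db_per_band * abs(distance_bands)
    # The masker can only raise the threshold, never lower it below the quiet value.
    return max(quiet_threshold_db, masked)

def amount_of_masking_db(quiet_threshold_db, masker_level_db, distance_bands):
    """How far the threshold is raised by the masker, in dB."""
    raised = raised_threshold_db(quiet_threshold_db, masker_level_db, distance_bands)
    return raised - quiet_threshold_db

def is_audible(level_db, quiet_threshold_db, masker_level_db, distance_bands):
    """A sound is treated as audible only if it exceeds the raised threshold."""
    return level_db > raised_threshold_db(quiet_threshold_db, masker_level_db, distance_bands)

# A 30 dB tone next to a 70 dB masker (1 band away) versus far from it (5 bands away).
print(amount_of_masking_db(5.0, 70.0, 1))
print(is_audible(30.0, 5.0, 70.0, 1))
print(is_audible(30.0, 5.0, 70.0, 5))
```

In this toy model the same 30 dB tone is inaudible close to the masker but audible far from it, which is the kind of decision a perceptual coder makes when choosing what to discard.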


Compression

As stated earlier, lossless encoding does not compress as well as lossy encoding. Audio compression is achieved by compression algorithms implemented in small software "codecs" (software enCODer/DECoders[4]). Codecs take many different forms and can use vastly different compression methods, even for similar formats. Using a lossless codec maintains audio fidelity and allows an exact digital copy of the original material. This results in larger file sizes, but it is the standard for many audio professionals.


Lossy formats are preferred by the average user who wants to store larger libraries of audio files on PCs and personal audio devices and who does not mind a slight loss of quality. Lossy algorithms use paired compression and decompression methods that alter the original data. Information that is judged imperceptible (or is repetitive) is removed in the compression process. This results in a vastly smaller file size but, depending on the compression ratio, may also result in degraded audio quality or audible compression artifacts (errors in the audio that are discernible and annoying to the end user).
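The compression ratio mentioned above is simple arithmetic. The sketch below assumes CD-quality source audio (44.1 kHz, 16-bit, stereo) and a lossy target bitrate of 128 kbit/s; the target bitrate and the four-minute track length are example values chosen for illustration, not figures from the sources cited here.

```python
def pcm_bitrate_bps(rate_hz, bits, channels):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return rate_hz * bits * channels

def compression_ratio(original_bps, encoded_bps):
    """How many times smaller the encoded stream is than the original."""
    return original_bps / encoded_bps

def size_bytes(bitrate_bps, duration_s):
    """Approximate size of a stream at a constant bitrate (headers ignored)."""
    return bitrate_bps * duration_s // 8

cd_bps = pcm_bitrate_bps(44100, 16, 2)   # CD-quality PCM
lossy_bps = 128000                       # assumed lossy target bitrate
ratio = compression_ratio(cd_bps, lossy_bps)

print("uncompressed bitrate:", cd_bps)
print("compression ratio:   ", ratio)
print("4-minute track, uncompressed bytes:", size_bytes(cd_bps, 240))
print("4-minute track, lossy bytes:       ", size_bytes(lossy_bps, 240))
```

Pushing the target bitrate lower raises the ratio further, and it is at high ratios that compression artifacts become most likely.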


References

  1. ^ The Computer Language Company Inc. Pcmag.com encyclopedia. (1981-2007). PC Magazine. Retrieved September 12, 2007 from http://www.pcmag.com/encyclopedia_term/0,2542,t=perceptual+audio+coding&i=49099,00.asp
  2. ^ Fries, Bruce. A digital audio primer. (March 2000). The MP3 and Internet Audio Handbook. Retrieved September 12, 2007 from http://www.teamcombooks.com/mp3handbook/11.htm
  3. ^ The Computer Language Company Inc. Pcmag.com encyclopedia. (1981-2007). PC Magazine. Retrieved September 12, 2007 from http://www.pcmag.com/encyclopedia_term/0,2542,t=sampling&i=50790,00.asp
  4. ^ A Digital Audio Primer. Retrieved September 14, 2007 from http://www.teamcombooks.com/mp3handbook/11.htm

http://ocw.mit.edu/NR/rdonlyres/Health-Sciences-and-Technology/HST-723Spring-2005/F07EC56C-806C-4895-89FD-4DF88A6194EB/0/fmntlprcptlaudio.pdf