Digital sampling

Digital sampling, PCM sampling, or just sampling is the process of representing a signal waveform as a series of numbers, each of which is a measurement of the signal's amplitude taken at regular intervals. This process, also commonly referred to as PCM, is widely used in modern audio and video systems, including television and telephone networks.

Strictly speaking, the process of sampling must be regarded as separate from the process of digitising. Sampling produces a series of values which may be represented in various ways: the output can be a series of analog pulses (Pulse-height modulation) or a series of fixed-amplitude pulses (Pulse-position modulation or Pulse-width modulation). Most commonly, though, the samples are represented by binary numbers, in a process known as PCM (Pulse-code modulation), because they are then amenable to storage and processing in digital systems such as computers.
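As a rough illustration of keeping the two stages separate, the Python sketch below first samples a sine wave at regular intervals and only then converts the samples to 16-bit binary codes. The function names, the 1 kHz tone and the 8 kHz sampling rate are assumptions chosen for the example, not part of any standard.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Sampling: measure the waveform every 1/sample_rate_hz seconds."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def to_pcm16(samples):
    """Digitising: represent each sample (range -1..1) as a signed 16-bit code."""
    return [max(-32768, min(32767, round(s * 32767))) for s in samples]

# Hypothetical example values: a 1 kHz tone sampled at 8 kHz.
samples = sample_sine(1000.0, 8000.0, 8)
codes = to_pcm16(samples)
```

The list `samples` still holds unquantised values; only `codes` is PCM in the sense used above.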

The basic theory of digital sampling in relation to audio and video is widely misrepresented, usually because of the mistaken idea that the samples are the signal. This misconception is understandable, given that it is indeed possible to listen to digital samples directly (as is done in some cheap players) or to view video samples directly (as is done in most standard-definition televisions). This must be regarded as a 'cheap and cheerful' approach, however, and it misses out a vital component of basic sampling theory - the Reconstruction filter.

While it is obvious to anyone that sampling an original waveform and then presenting the sample values as joined-up segments will produce an approximation to the original, especially if the original only changes slowly between samples, this is not what sampling is really about.

What Nyquist realised was that if an original signal is filtered (band-limited) to remove all frequencies above what we call the Nyquist frequency (half the sampling rate), then it is possible to reproduce the exact (band-limited) waveform by processing the samples in a Reconstruction filter, which is simply another low-pass filter with a cut-off frequency equal to the Nyquist frequency. There is no approximation and no distortion: what goes in comes out, apart from any components above the Nyquist frequency.
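The reconstruction step can be sketched numerically. An ideal low-pass filter with its cut-off at the Nyquist frequency is equivalent to replacing every sample by a sinc pulse and summing the results (the Whittaker-Shannon interpolation formula). The Python below is a minimal sketch under assumptions of our own: the function names, the 1 kHz test tone, the 8 kHz sampling rate and the finite run of 401 samples are all illustrative, and because the sum is truncated the result is close to exact rather than exact.

```python
import math

def sinc(u):
    """Normalised sinc: the impulse response of an ideal low-pass filter."""
    if u == 0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, sample_rate_hz, t):
    """Estimate the band-limited waveform at time t (seconds) from its samples."""
    position = t * sample_rate_hz  # time measured in sample periods
    return sum(x * sinc(position - n) for n, x in enumerate(samples))

SAMPLE_RATE = 8000.0  # assumed sampling rate, Nyquist frequency 4 kHz
TONE = 1000.0         # assumed test tone, well below the Nyquist frequency

def tone(t):
    return math.sin(2 * math.pi * TONE * t)

samples = [tone(n / SAMPLE_RATE) for n in range(401)]

# Look halfway between samples 200 and 201, where no sample was taken.
t_mid = 200.5 / SAMPLE_RATE
sinc_error = abs(reconstruct(samples, SAMPLE_RATE, t_mid) - tone(t_mid))

# 'Joined-up segments': a straight line between the two neighbouring samples.
linear_error = abs((samples[200] + samples[201]) / 2 - tone(t_mid))
```

In this setup the sinc sum lands much closer to the true mid-point value than the straight-line estimate does; the small residual comes from truncating the sum to a finite number of samples.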

Errors resulting from the Nyquist limitation

This is only literally true if the two filters employed are Brick-wall filters; in other words, they cut off totally above the Nyquist frequency. Even if such filters were realisable in practice, basic theory says that they would have infinite delay - they would take forever to produce any output. This must not be seen as an obstacle to perfect reproduction, though. By designing with a Guard band it is possible to use imperfect filters to obtain output that is as accurate as we care to make it (within the bandwidth limitation).
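The arithmetic behind a guard band is simple enough to sketch. Assuming, purely for illustration, that the wanted audio band extends to 20 kHz (a figure not taken from the text above), the Python below computes how much room the filters have between the top of the wanted band and the Nyquist frequency at sampling rates of 44.1 kHz and 48 kHz. The function names are our own.

```python
def nyquist_frequency(sample_rate_hz):
    """Half the sampling rate."""
    return sample_rate_hz / 2.0

def guard_band(sample_rate_hz, passband_edge_hz):
    """Width of the region between the wanted band and the Nyquist frequency."""
    return nyquist_frequency(sample_rate_hz) - passband_edge_hz

PASSBAND_EDGE = 20000.0  # assumed upper limit of the wanted audio band, in Hz

cd_guard = guard_band(44100.0, PASSBAND_EDGE)
pro_guard = guard_band(48000.0, PASSBAND_EDGE)
```

A wider guard band lets the filters roll off more gently, which is one practical way of trading bandwidth for filter simplicity.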

Quantising errors resulting from the process of digitisation

In digital sampling, the accuracy of the resulting waveform is also affected by the stepwise nature of the digitising process, resulting in what is referred to as Quantisation error. This error, which occurs from sample to sample, is not necessarily random, but may be correlated with the signal, producing serious audible distortion in audio systems that do not take steps to eliminate it. Some early CDs suffered from Quantising distortion which was especially audible on quiet piano notes, adding a granular noise that sounded like 'sand in the speakers'. It could also be heard as spurious tones accompanying higher frequencies. Quantising distortion soon became a thing of the past, though, with a better understanding of the process of 'Dither', which involves adding a low level of noise to the signal before sampling in order to randomise the individual sample errors and hence 'de-correlate' the resultant errors from the signal, so that all that is heard is noise (hiss).
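The effect of dither on the correlation between error and signal can be seen in a small simulation. The Python below is a sketch under our own assumptions: amplitudes are expressed in units of one quantising step (LSB), the test signal is a sine of 0.4 LSB peak, and the dither is uniformly distributed noise one LSB wide. None of these choices comes from a particular CD mastering system.

```python
import math
import random

def quantise(value_lsb):
    """Round to the nearest whole quantising step."""
    return math.floor(value_lsb + 0.5)

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(1)  # fixed seed so the run is repeatable

# A very quiet tone: 0.4 LSB peak, 100 samples per cycle (assumed values).
signal = [0.4 * math.sin(2 * math.pi * n / 100.0) for n in range(20000)]

plain = [quantise(x) for x in signal]
dithered = [quantise(x + rng.uniform(-0.5, 0.5)) for x in signal]

plain_error = [q - x for q, x in zip(plain, signal)]
dithered_error = [q - x for q, x in zip(dithered, signal)]

corr_plain = correlation(plain_error, signal)
corr_dithered = correlation(dithered_error, signal)
```

Without dither every sample rounds to zero, so the error is simply the negated signal (a correlation of -1) and the tone vanishes. With dither the correlation falls close to zero, and the tone survives, buried in noise.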

Digital sampling in Audio

Audio waveforms are commonly sampled at 44.1k samples/s (CD) or 48k samples/s (professional audio). CDs use a 16-bit digital representation, and would sound 'granular' because of the quantising noise, were it not for the addition of a small amount of noise to the signal before digitisation, known as 'dither'. Adding dither eliminates this granularity, and gives very low distortion, but at the expense of a small increase in noise level. Measured ITU-R 468 weighted, this is about 66dB below alignment level, or 84dB below FS (full scale) digital, which is somewhat lower than the microphone noise level on most recordings, and hence of no consequence (see Programme levels for more on this).

Optimising dither waveforms

In a seminal paper published in the AES Journal, Lipshitz and Vanderkooy pointed out that different noise types, with different Probability density functions (PDFs), behave differently when used as dither signals, and suggested optimal levels of dither signal for audio. Gaussian noise requires a higher level for full elimination of distortion than Rectangular PDF or Triangular PDF noise. Triangular PDF noise has the advantage of requiring a lower level of added noise to eliminate distortion and also of minimising 'noise modulation'. The latter refers to audible changes in the residual noise on low-level music that are found to draw attention to the noise.
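The noise-modulation point can be illustrated with a short simulation. This is a sketch, not a reproduction of the paper's analysis: it assumes the commonly quoted dither widths of 1 LSB peak-to-peak for rectangular noise and 2 LSB peak-to-peak for triangular noise (the latter made by summing two rectangular sources), and it measures the mean-square error for two steady input levels. All names are illustrative.

```python
import math
import random

def quantise(value_lsb):
    """Round to the nearest whole quantising step."""
    return math.floor(value_lsb + 0.5)

def rpdf(rng):
    """Rectangular-PDF dither, 1 LSB peak-to-peak (assumed level)."""
    return rng.uniform(-0.5, 0.5)

def tpdf(rng):
    """Triangular-PDF dither, 2 LSB peak-to-peak: the sum of two rectangular sources."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def error_power(level_lsb, dither, rng, trials=20000):
    """Mean-square error (LSB squared) when a steady level is dithered and quantised."""
    total = 0.0
    for _ in range(trials):
        error = quantise(level_lsb + dither(rng)) - level_lsb
        total += error * error
    return total / trials

rng = random.Random(2)  # fixed seed so the run is repeatable

rpdf_on_step = error_power(0.0, rpdf, rng)  # input sits exactly on a step
rpdf_between = error_power(0.5, rpdf, rng)  # input sits halfway between steps
tpdf_on_step = error_power(0.0, tpdf, rng)
tpdf_between = error_power(0.5, tpdf, rng)
```

With rectangular dither the error power depends on where the input sits relative to the quantising steps (noise modulation), whereas with triangular dither it stays at about a quarter of an LSB squared in both cases.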

Noise shaping for lower audibility

An alternative to dither is noise shaping, which involves a feedback process in which the final digitised signal is compared with the original, and the instantaneous errors on successive past samples are integrated and used to determine whether the next sample is rounded up or down. This smooths out the errors in a way that alters the spectral noise content. By the neat device of inserting a weighting filter in the feedback path, the spectral content of the noise can be shifted to areas of the 'equal-loudness contours' where the human ear is least sensitive, producing a lower subjective noise level (typically -68/-70dB ITU-R 468 weighted).
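The feedback idea can be sketched as a simple error-feedback quantiser. The Python below is a minimal illustration with assumptions of our own: amplitudes are in LSB units, and the default `error_filter` of a single coefficient gives plain first-order shaping that pushes the error towards high frequencies. A psychoacoustically weighted shaper of the kind described above would use a longer, carefully designed set of coefficients, which are not given here.

```python
import math

def quantise(value_lsb):
    """Round to the nearest whole quantising step."""
    return math.floor(value_lsb + 0.5)

def noise_shape(signal_lsb, error_filter=(1.0,)):
    """Quantise with error feedback.

    error_filter holds hypothetical feedback coefficients, most recent
    error first; (1.0,) is simple first-order shaping.
    """
    history = [0.0] * len(error_filter)  # past quantising errors
    output = []
    for x in signal_lsb:
        # Subtract the filtered past errors before rounding.
        v = x - sum(c * e for c, e in zip(error_filter, history))
        y = quantise(v)
        # Remember the error just made, dropping the oldest one.
        history = [y - v] + history[:-1]
        output.append(y)
    return output

steady = [0.3] * 1000  # a steady level between two quantising steps
plain = [quantise(x) for x in steady]
shaped = noise_shape(steady)
shaped_average = sum(shaped) / len(shaped)
```

For a steady input of 0.3 LSB, plain rounding gives zero every time, whereas the error-feedback output switches between 0 and 1 so that its long-term average is very close to 0.3: the low-frequency error has been traded for high-frequency noise.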

24-bit and 96kHz pro-audio formats

24-bit audio does not require dithering, because the noise level of the digital converter is far higher in practice than the required level of any dither that might be applied.
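For a sense of scale, the standard textbook figure for an ideal converter (quantisation noise only, full-scale sine input) is about 6.02 dB per bit plus 1.76 dB. The Python below evaluates it for 16 and 24 bits; the function name is illustrative, and the formula describes a theoretical limit, not the measured performance of any real converter.

```python
import math

def ideal_snr_db(bits):
    """Signal-to-quantisation-noise ratio of an ideal converter, full-scale sine input."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

snr_16 = ideal_snr_db(16)
snr_24 = ideal_snr_db(24)
```

This gives roughly 98 dB for 16 bits and 146 dB for 24 bits. The analogue noise of practical converters sits well above the 24-bit quantisation floor, which is consistent with the statement above.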

The recent trend towards higher sampling rates, at two or four times the basic requirement, has not been justified theoretically, or shown to make any audible difference, even under the most critical listening conditions, but nevertheless a lot of 96kHz equipment is now used in studio recording, and 'superaudio' formats are being promised to consumers, mostly as a DVD option.

Digital sampling in Video

Dither is not needed, as contouring is tolerable. 12-bit sampling gives the best results.

- Work in progress

Most TVs do not achieve basic SD quality, because they do not reconstruct the vertically sampled image properly. HD sets do this, producing much better resolution pictures than even a top studio monitor can from SD signals. Again, this is not commonly realised, though it was realised by the BBC, who then backed off from broadcasting HD but started to record programmes in HD.

To get a true HD image you need a 'super HD' display, with at least twice as many pixels again (3840 x 2160). This is worth bearing in mind, though it is not currently practical. HD is still pretty good, though, and does give a higher Nyquist limit.