Selectable Mode Vocoder

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Mailer diablo (talk | contribs) at 21:59, 19 November 2004. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

SMV (Selectable Mode Vocoder) is a speech coding standard used in CDMA2000 networks. SMV provides multiple modes of operation, which are selected based on the characteristics of the input speech.

The Selectable Mode Vocoder (SMV) for Wideband CDMA is built on four codecs: full rate at 8.5 kbps, half rate at 4 kbps, quarter rate at 2 kbps, and eighth rate at 800 bps. The full-rate and half-rate codecs use the eXtended CELP (eX-CELP) algorithm, which relies on a combined closed-loop/open-loop analysis (COLA). In eX-CELP, each signal frame is first classified as one of the following:

  • Silence/Background noise
  • Non-stationary unvoiced
  • Stationary unvoiced
  • Onset
  • Non-stationary voiced
  • Stationary voiced

The algorithm includes voice activity detection (VAD) followed by an elaborate frame classification scheme. Silence/background noise and stationary unvoiced frames are represented by spectrum-modulated noise and coded at 1/4 or 1/8 rate. SMV uses four subframes at full rate and three subframes at half rate. The stochastic (fixed) codebook structure is also elaborate: it uses sub-codebooks, each tuned for a particular type of speech. The sub-codebooks have different degrees of pulse sparseness (sparser for noise-like excitation). SMV achieves a mean opinion score (MOS) of up to 4.1 at full rate with clean speech.
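As a rough illustration of how the frame classes relate to the four rates, the following Python sketch lists the six classes and returns the candidate rates for each. It is not the standard's rate-selection logic: the names `FrameClass`, `candidate_rates` and `bits_per_frame` are hypothetical, the mapping of the remaining four classes to the full and half rates is an assumption (the text only states which classes are coded at 1/4 or 1/8 rate), and the bits-per-frame figures are derived from the nominal bit rates quoted above and the 20 ms frame, so they may differ from the exact frame sizes in the specification.

```python
from enum import Enum

FRAME_MS = 20  # frame duration given in the article (160 samples)

# Nominal bit rates quoted in the article, in bits per second.
RATES_BPS = {"full": 8500, "half": 4000, "quarter": 2000, "eighth": 800}


class FrameClass(Enum):
    """The six eX-CELP frame classes listed in the article."""
    SILENCE_BACKGROUND_NOISE = 0
    NONSTATIONARY_UNVOICED = 1
    STATIONARY_UNVOICED = 2
    ONSET = 3
    NONSTATIONARY_VOICED = 4
    STATIONARY_VOICED = 5


# Classes the article says are coded as spectrum-modulated noise.
NOISE_LIKE = {
    FrameClass.SILENCE_BACKGROUND_NOISE,
    FrameClass.STATIONARY_UNVOICED,
}


def candidate_rates(frame_class):
    """Return the rates a frame of this class may be coded at."""
    if frame_class in NOISE_LIKE:
        return ("quarter", "eighth")
    # Assumption: all other classes use the eX-CELP full or half rate.
    return ("full", "half")


def bits_per_frame(rate):
    """Nominal number of bits in one 20 ms frame at the given rate."""
    return RATES_BPS[rate] * FRAME_MS // 1000


if __name__ == "__main__":
    for cls in FrameClass:
        rates = candidate_rates(cls)
        sizes = [bits_per_frame(r) for r in rates]
        print(cls.name, rates, sizes)
```

Which of the two candidate rates is actually chosen for a frame depends on the operating mode and on signal measurements that this sketch does not model.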

The coder works on frames of 160 speech samples (20 ms), which corresponds to a sampling rate of 8 kHz, and requires a look-ahead of 80 samples (10 ms) if noise-suppression option B is used. An additional 24 samples (3 ms) of look-ahead are required if noise-suppression option A is used. The algorithmic delay of the coder is therefore 30 ms (20 ms + 10 ms) with noise-suppression option B and 33 ms (20 ms + 10 ms + 3 ms) with noise-suppression option A.
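The delay figures follow directly from the sample counts. The short Python sketch below reproduces the arithmetic; the 8 kHz sampling rate is inferred from 160 samples spanning 20 ms, and the names `LOOKAHEAD_SAMPLES` and `algorithmic_delay_ms` are illustrative only.

```python
SAMPLE_RATE_HZ = 8000  # inferred: 160 samples per 20 ms frame
FRAME_SAMPLES = 160

# Look-ahead per noise-suppression option, in samples.
# Option A needs 24 samples on top of the 80 used by option B.
LOOKAHEAD_SAMPLES = {"B": 80, "A": 80 + 24}


def samples_to_ms(n_samples):
    """Convert a sample count to milliseconds at the assumed rate."""
    return n_samples * 1000 / SAMPLE_RATE_HZ


def algorithmic_delay_ms(option):
    """Frame length plus look-ahead for the given option, in ms."""
    return samples_to_ms(FRAME_SAMPLES + LOOKAHEAD_SAMPLES[option])


if __name__ == "__main__":
    for option in ("B", "A"):
        print(f"option {option}: {algorithmic_delay_ms(option)} ms")
```

This covers only the algorithmic delay; processing and transmission delays in a real system come on top of it.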