Covariance and correlation

From Wikipedia, the free encyclopedia
Main articles: covariance, correlation.

In probability theory and statistics, the mathematical descriptions of covariance and correlation are very similar.[1][2] Both describe the degree of similarity between two random variables or sets of random variables.

correlation:

    \rho_{XY} = \frac{\operatorname{E}[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}

covariance:

    \sigma_{XY} = \operatorname{E}[(X - \mu_X)(Y - \mu_Y)]

where \sigma_X and \sigma_Y are the standard deviations of X and Y respectively, and \mu_X and \mu_Y are their expected values. Notably, correlation is dimensionless, while covariance is in units obtained by multiplying the units of the two variables. The covariance of a variable with itself (i.e. X = Y) is called the variance. The correlation of a variable with itself is always 1 (except in the degenerate case where the two variances are zero, in which case the correlation does not exist).
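As a numerical check (a minimal sketch, assuming NumPy and synthetic data rather than anything from the article), the sample correlation can be recovered by rescaling the sample covariance by the two standard deviations:

    import numpy as np

    rng = np.random.default_rng(0)

    # Two correlated samples (illustrative data only).
    x = rng.normal(size=1000)
    y = 0.5 * x + rng.normal(scale=0.8, size=1000)

    # Sample covariance and standard deviations (ddof=1 gives the usual unbiased estimator).
    cov_xy = np.cov(x, y, ddof=1)[0, 1]
    sd_x, sd_y = np.std(x, ddof=1), np.std(y, ddof=1)

    # Correlation is the covariance rescaled by the two standard deviations,
    # which makes it dimensionless and confines it to [-1, 1].
    print(cov_xy / (sd_x * sd_y))          # matches np.corrcoef(x, y)[0, 1]
    print(np.corrcoef(x, y)[0, 1])

    # Covariance of a variable with itself is the variance;
    # correlation of a variable with itself is 1.
    print(np.cov(x, x, ddof=1)[0, 1], np.var(x, ddof=1))
    print(np.corrcoef(x, x)[0, 1])         # 1.0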

In the case of a stationary time series, both the means and variances are constant and the covariance and correlation are functions only of the difference in the indices:

cross-correlation:

    \rho_{XY}(k) = \frac{\operatorname{E}[(X_t - \mu_X)(Y_{t+k} - \mu_Y)]}{\sigma_X \sigma_Y}

cross-covariance:

    \sigma_{XY}(k) = \operatorname{E}[(X_t - \mu_X)(Y_{t+k} - \mu_Y)]

where k is the difference between the indices (the lag).
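For data from two such series, the lagged quantities can be estimated directly; the following is a sketch under the assumption of equal-length, wide-sense stationary samples, with cross_covariance and cross_correlation as hypothetical helper names rather than a library API:

    import numpy as np

    def cross_covariance(x, y, k):
        # Sample analogue of sigma_XY(k): the average of (x_t - mean_x) * (y_{t+k} - mean_y).
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        n = len(x)
        xc = x - x.mean()
        yc = y - y.mean()
        return np.mean(xc[:n - k] * yc[k:])

    def cross_correlation(x, y, k):
        # Rescale by the two standard deviations, mirroring the definitions above.
        return cross_covariance(x, y, k) / (np.std(x) * np.std(y))

    # Example: y is a noisy copy of x delayed by 3 steps, so the sample
    # cross-correlation is largest at lag k = 3.
    rng = np.random.default_rng(1)
    x = rng.normal(size=2000)
    y = np.roll(x, 3) + 0.1 * rng.normal(size=2000)
    print(max(range(10), key=lambda k: cross_correlation(x, y, k)))   # -> 3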

Although the values of the theoretical covariances and correlations are linked in the above way, the probability distributions of sample estimates of these quantities are not linked in any simple way, and they generally need to be treated separately. These distributions depend on the joint distribution of the pair of random quantities (X, Y) when the values are assumed to be independent across different pairs. In the case of a time series, the distributions depend on the joint distributions of the whole time series.
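A small simulation can illustrate this (a sketch assuming bivariate normal data, using NumPy for convenience): drawing many samples gives an empirical picture of the two sampling distributions, which are centred on related values but have different shapes and spreads.

    import numpy as np

    rng = np.random.default_rng(2)
    true_cov = [[1.0, 0.6], [0.6, 1.0]]      # unit variances, so covariance = correlation = 0.6

    cov_hats, corr_hats = [], []
    for _ in range(5000):                    # 5000 independent samples of size 30
        sample = rng.multivariate_normal([0.0, 0.0], true_cov, size=30)
        x, y = sample[:, 0], sample[:, 1]
        cov_hats.append(np.cov(x, y, ddof=1)[0, 1])
        corr_hats.append(np.corrcoef(x, y)[0, 1])

    # Both estimators centre near 0.6, but their sampling distributions differ:
    # the sample correlation is bounded in [-1, 1] and skewed, while the sample
    # covariance is unbounded, so the two are summarised separately.
    print(np.mean(cov_hats), np.std(cov_hats))
    print(np.mean(corr_hats), np.std(corr_hats))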

See also

  - Correlation and dependence

References

  1. ^ Weisstein, Eric W. "Covariance". MathWorld.
  2. ^ Weisstein, Eric W. "Statistical Correlation". MathWorld.