Covariance provides a measure of the strength of the correlation between two or more sets of random variates. The covariance for two random variates X and Y, each with sample size N, is defined by the expectation value
cov(X, Y) = \langle (X - \mu_X)(Y - \mu_Y) \rangle    (1)
          = \langle X Y \rangle - \mu_X \mu_Y    (2)
where \mu_X = \langle X \rangle and \mu_Y = \langle Y \rangle are the respective means, which can be written out explicitly as
cov(X, Y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})    (3)
For uncorrelated variates,
cov(X, Y) = \langle X Y \rangle - \mu_X \mu_Y = \langle X \rangle \langle Y \rangle - \mu_X \mu_Y = 0    (4)
so the covariance is zero. However, if the variables are correlated in some way, then their covariance will be nonzero. In fact, if cov(X, Y) > 0, then Y tends to increase as X increases, and if cov(X, Y) < 0, then Y tends to decrease as X increases. Note that while statistically independent variables are always uncorrelated, the converse is not necessarily true.
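For a concrete numerical illustration (a sketch assuming NumPy; the helper sample_cov and the test data are illustrative, not part of the standard treatment), the 1/N estimator of equation (3) can be evaluated directly and compared with numpy.cov, which normalizes by N - 1 unless bias=True is passed:

    import numpy as np

    def sample_cov(x, y):
        """Sample covariance with the 1/N normalization of equation (3)."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        return np.mean((x - x.mean()) * (y - y.mean()))

    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    y = 2.0 * x + rng.normal(size=1000)   # y tends to increase with x
    z = rng.normal(size=1000)             # generated independently of x

    print(sample_cov(x, y))               # clearly positive
    print(sample_cov(x, z))               # close to zero
    print(np.cov(x, y, bias=True)[0, 1])  # matches sample_cov(x, y)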
In the special case of Y = X,
cov(X, X) = \langle X^2 \rangle - \langle X \rangle^2    (5)
          = \sigma_X^2    (6)
so the covariance reduces to the usual variance \sigma_X^2 = var(X). This motivates the use of the symbol \sigma_{XY} for cov(X, Y), which then provides a consistent way of denoting the variance as \sigma_{XX} = \sigma_X^2, where \sigma_X is the standard deviation.
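A quick numerical check of equations (5)-(6), assuming NumPy (np.var and np.cov with bias=True both use the 1/N normalization, so the two quantities agree up to rounding):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(loc=3.0, scale=2.0, size=10_000)

    # cov(X, X) reduces to the variance sigma_X^2
    print(np.cov(x, x, bias=True)[0, 1])
    print(np.var(x))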
The derived quantity
cor(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y}    (7)
          = \frac{\langle X Y \rangle - \mu_X \mu_Y}{\sigma_X \sigma_Y}    (8)
is called the statistical correlation of X and Y.
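As an illustration (a sketch with arbitrary test data, assuming NumPy), the correlation of equation (7) can be formed by scaling a numerically computed covariance by the two standard deviations; the result agrees with NumPy's corrcoef:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(size=5_000)
    y = 0.5 * x + rng.normal(size=5_000)

    # Correlation as covariance divided by the product of standard deviations
    cor = np.cov(x, y, bias=True)[0, 1] / (np.std(x) * np.std(y))

    print(cor)
    print(np.corrcoef(x, y)[0, 1])   # same value up to rounding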
The covariance is especially useful when looking at the variance of the sum of two random variates, since
var(X + Y) = var(X) + var(Y) + 2 cov(X, Y)    (9)
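Identity (9) is algebraic, so it also holds exactly for the sample quantities when a common 1/N normalization is used, which the following sketch (assuming NumPy) verifies numerically:

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.normal(size=20_000)
    y = 0.3 * x + rng.normal(size=20_000)   # correlated with x

    lhs = np.var(x + y)
    rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
    print(lhs, rhs)   # equal up to floating-point rounding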
The covariance is symmetric by definition since
cov(X, Y) = cov(Y, X).    (10)
Given n sets of random variates denoted X_1, ..., X_n, the covariance \sigma_{ij} = cov(X_i, X_j) of X_i and X_j is defined by
cov(X_i, X_j) = \langle (X_i - \mu_i)(X_j - \mu_j) \rangle    (11)
              = \langle X_i X_j \rangle - \mu_i \mu_j    (12)
where \mu_i = \langle X_i \rangle and \mu_j = \langle X_j \rangle are the means of X_i and X_j, respectively. The matrix (\sigma_{ij}) of these quantities is called the covariance matrix.
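A sketch of the covariance matrix in practice (assuming NumPy, with the rows of a data array playing the role of the variates X_1, X_2, X_3; the variable names are illustrative):

    import numpy as np

    rng = np.random.default_rng(4)
    n_samples = 10_000
    x1 = rng.normal(size=n_samples)
    x2 = x1 + 0.5 * rng.normal(size=n_samples)   # correlated with x1
    x3 = rng.normal(size=n_samples)              # independent of the others

    # np.cov on a 2-D array returns the matrix of pairwise covariances sigma_ij
    data = np.vstack([x1, x2, x3])
    V = np.cov(data, bias=True)

    print(V)            # symmetric, as in equation (10)
    print(np.diag(V))   # diagonal entries are the variances sigma_i^2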
The covariance obeys the identities
cov(X + Z, Y) = \langle (X + Z) Y \rangle - \langle X + Z \rangle \langle Y \rangle    (13)
              = \langle X Y \rangle + \langle Z Y \rangle - \langle X \rangle \langle Y \rangle - \langle Z \rangle \langle Y \rangle    (14)
              = (\langle X Y \rangle - \langle X \rangle \langle Y \rangle) + (\langle Z Y \rangle - \langle Z \rangle \langle Y \rangle)    (15)
              = cov(X, Y) + cov(Z, Y).    (16)
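The additivity in equations (13)-(16) is likewise an algebraic identity that holds exactly for sample covariances, as a short check shows (NumPy assumed, 1/N normalization, sample_cov defined here for illustration):

    import numpy as np

    rng = np.random.default_rng(5)
    x, y, z = rng.normal(size=(3, 10_000))

    def sample_cov(a, b):
        return np.mean((a - a.mean()) * (b - b.mean()))

    # cov(X + Z, Y) = cov(X, Y) + cov(Z, Y), equation (16)
    print(sample_cov(x + z, y))
    print(sample_cov(x, y) + sample_cov(z, y))   # same value up to rounding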
By induction, it therefore follows that
cov\left( \sum_{i=1}^{n} X_i, Y \right) = \sum_{i=1}^{n} cov(X_i, Y)    (17)
cov\left( X, \sum_{j=1}^{m} Y_j \right) = \sum_{j=1}^{m} cov(X, Y_j)    (18)
and hence
cov\left( \sum_{i=1}^{n} X_i, \sum_{j=1}^{m} Y_j \right) = \sum_{i=1}^{n} cov\left( X_i, \sum_{j=1}^{m} Y_j \right)    (19)
                                                         = \sum_{i=1}^{n} \sum_{j=1}^{m} cov(X_i, Y_j),    (20)
so that, taking each Y_j = X_j and using equation (6),
var\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} var(X_i) + 2 \sum_{i < j} cov(X_i, X_j).    (21)
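Equation (21) can also be verified on sample data; the sketch below (assuming NumPy, with sample_cov as an illustrative helper) sums four variates and compares the variance of the sum with the sum of the variances plus twice the pairwise covariances:

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.normal(size=(4, 5_000))   # four variates X_1, ..., X_4
    X[1] += 0.5 * X[0]                # make X_2 correlated with X_1

    def sample_cov(a, b):
        return np.mean((a - a.mean()) * (b - b.mean()))

    total = X.sum(axis=0)
    lhs = np.var(total)
    rhs = (sum(np.var(X[i]) for i in range(4))
           + 2 * sum(sample_cov(X[i], X[j])
                     for i in range(4) for j in range(i + 1, 4)))
    print(lhs, rhs)   # equal up to floating-point rounding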