11 Correlation (Pearson’s)

Pearson’s correlation coefficient of the variables \(X\) and \(Y\) is a measure of the linear relationship between \(X\) and \(Y\). It is defined \[\rho = \frac{Cov(X,Y)}{\sqrt{\sigma_X^2\cdot \sigma_Y^2}}\]

Notice that if \(X\) and \(Y\) are independent then \(Cov(X,Y,)=0\) and \(\rho=0\) and there is no linear relationship between the variables.

11.1 Theorems on Pearson’s Correlation

11.2 Computational Formula for \(\rho\)

\[\rho = \frac{\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}(x_i-\mu_X)(y_j-\mu_Y)} {\sum\limits_{i=1}^{n}(x_i-\mu_X)\sum\limits_{j=1}^{m}(y_i-\mu_Y)}\]

Proof:

\[\begin{aligned} \rho &= \frac{Cov(X,Y)}{\sqrt{\sigma_X^2\sigma_Y^2}} \\ &= \frac{Cov(X,Y)}{\sqrt{\sigma_X^2\sigma_Y^2}} \\ &= \frac{\sum\limits_{i=1}^{N}(x_i-\mu_X)(y_i-\mu_Y)\frac{1}{N}} {\sqrt{\frac{\sum\limits_{i=1}^{N}(x_i-\mu_X)^2}{N}\frac{\sum\limits_{i=1}^{N}(y_i-\mu_Y)^2}{N}}} \\ &= \frac{\frac{1}{N}\sum\limits_{i=1}^{N}(x_i-\mu_X)(y_j-\mu_Y)} {\frac{1}{N}\sqrt{\sum\limits_{i=1}^{N}(x_i-\mu_X)^2\sum\limits_{i=1}^{N}(y_i-\mu_Y)^2}} \\ &= \frac{\sum\limits_{i=1}^{N}(x_i-\mu_X)(y_i-\mu_Y)} {\sqrt{\sum\limits_{i=1}^{N}(x_i-\mu_X)\sum\limits_{i=1}^{N}(y_i-\mu_Y)}} \end{aligned}\]