week3

Expected value

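For discrete \(X\) with probability mass function \(f\):

\[ E[X] = \sum_x x f(x) \]

For continuous \(X\) with probability density function \(f\):

\[ E[X] = \int_{-\infty}^{+\infty} x f(x) \, dx \]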

Let \(X\) and \(Y\) be random variables. Then, for all \(a, b, c \in \mathbb{R}\):

\[ E[aX + bY + c] = a E[X] + bE[Y] + c \]

\[ V[aX] = a^2 V[X] \]

\[ \sigma[aX] = |a| \sigma[X] \]

\[ MSE = E[(X-c)^2] \]

\[ E[(X-c)^2] = V[X] + (E[X]-c)^2 \]
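The identities above can be checked numerically. The following is a minimal sketch in R on simulated data; the distributions, the seed, and the constants `a`, `b`, and `c0` (standing in for \(c\)) are arbitrary choices made for illustration. The MSE decomposition is checked with the divide-by-\(n\) variance, since R's `var()` divides by \(n-1\).

```r
# Simulated draws; distributions and constants are illustrative assumptions.
set.seed(1)
n <- 1e4
x <- rnorm(n, mean = 2, sd = 3)
y <- runif(n)
a <- -2; b <- 5; c0 <- 1   # c0 plays the role of c in the formulas

# Linearity of expectation (also holds exactly for sample means)
lhs_mean <- mean(a * x + b * y + c0)
rhs_mean <- a * mean(x) + b * mean(y) + c0

# Scaling rules for variance and standard deviation
var_scaled <- var(a * x)   # should match a^2 * var(x)
sd_scaled  <- sd(a * x)    # should match abs(a) * sd(x)

# MSE decomposition, using the divide-by-n variance
pop_var <- mean((x - mean(x))^2)
mse     <- mean((x - c0)^2)
decomp  <- pop_var + (mean(x) - c0)^2

all.equal(lhs_mean, rhs_mean)
all.equal(var_scaled, a^2 * var(x))
all.equal(sd_scaled, abs(a) * sd(x))
all.equal(mse, decomp)
```

Each `all.equal()` call compares the two sides up to floating-point tolerance.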

Covariance

\[ \text{Cov}[X,Y] = E[(X-E[X])(Y-E[Y])] \]

Correlation:

\[ \rho[X,Y] = \frac{\text{Cov}[X,Y]}{\sigma[X] \sigma[Y]} \]

Variance rule: \(V[X+Y] = V[X] + \color{purple}{2\text{Cov}[X,Y]} + V[Y]\)
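As a quick check of the covariance, correlation, and variance-rule formulas, here is a small R sketch on simulated data. The data-generating process (a noisy linear dependence of `y` on `x`) is an assumption chosen only so that the covariance term is non-zero.

```r
set.seed(2)
x <- rnorm(500)
y <- 0.6 * x + rnorm(500)   # y depends on x, so Cov[X, Y] is not zero

# Variance rule: V[X + Y] = V[X] + 2 Cov[X, Y] + V[Y]
lhs <- var(x + y)
rhs <- var(x) + 2 * cov(x, y) + var(y)
all.equal(lhs, rhs)

# Correlation is covariance rescaled by both standard deviations
rho_manual <- cov(x, y) / (sd(x) * sd(y))
all.equal(rho_manual, cor(x, y))
```

Dropping the `2 * cov(x, y)` term would only be valid when the covariance is zero.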

What does it mean for \(X\) and \(Y\) to be independent?

Knowing the outcome of one random variable provides no information about the probability of any outcome for the other.

\(\rho [X,Y] = 0\) \(\not \Rightarrow\) independence
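A standard counterexample, sketched in R: take \(X\) symmetric around zero and \(Y = X^2\). The five equally weighted values below are an illustrative assumption; \(Y\) is completely determined by \(X\), yet the correlation is zero.

```r
x <- c(-2, -1, 0, 1, 2)   # symmetric around zero
y <- x^2                  # fully determined by x

cov_xy <- cov(x, y)       # no linear association
rho_xy <- cor(x, y)

# Dependence: knowing x changes the probability of y == 4
p_y4         <- mean(y == 4)           # unconditional share of y == 4
p_y4_given_x <- mean(y[x == 2] == 4)   # share of y == 4 when x == 2

c(cov = cov_xy, rho = rho_xy, p = p_y4, p_given_x = p_y4_given_x)
```

The unconditional and conditional shares differ (0.4 versus 1), so the variables are not independent even though \(\rho = 0\).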

\(\rho\) is Pearson’s correlation

\[ \rho[X,Y] = \frac{\text{Cov}[X,Y]}{\sigma[X] \sigma[Y]} \]

Spearman’s

\[ r_s = \rho[R[X], R[Y]] = \frac{\text{Cov}[R[X],R[Y]]}{\sigma[R[X]] \sigma[R[Y]]} \]
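To illustrate that Spearman's coefficient is Pearson's correlation applied to ranks, the sketch below compares the two computations in R. It assumes \(R[\cdot]\) corresponds to R's `rank()`; the simulated data are arbitrary.

```r
set.seed(3)
x <- rnorm(30)
y <- x^3 + rnorm(30, sd = 0.5)

r_s_manual  <- cor(rank(x), rank(y))            # Pearson on the ranks
r_s_builtin <- cor(x, y, method = "spearman")   # built-in Spearman

all.equal(r_s_manual, r_s_builtin)
```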

Kendall’s

\[ \tau = \frac{(\text{# concordant pairs}) - (\text{# discordant pairs})} {(\text{# total pairs})} \]

concordant: \(x_i > x_j\) and \(y_i > y_j\), or \(x_i < x_j\) and \(y_i < y_j\)
discordant: the two orderings disagree (tied pairs are neither concordant nor discordant)
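The pair counting can be written out by hand. The sketch below uses a small made-up data set with no ties, counts concordant and discordant pairs, and compares the result with R's built-in Kendall computation.

```r
# Five illustrative observations, chosen to have no ties
x <- c(1.2, 3.4, 2.2, 5.0, 4.1)
y <- c(2.0, 3.1, 1.5, 4.8, 5.2)

pairs <- combn(length(x), 2)   # each column is one pair (i, j)
dx <- x[pairs[1, ]] - x[pairs[2, ]]
dy <- y[pairs[1, ]] - y[pairs[2, ]]

concordant <- sum(dx * dy > 0)   # same ordering in x and y
discordant <- sum(dx * dy < 0)   # opposite ordering
tau_manual <- (concordant - discordant) / ncol(pairs)

all.equal(tau_manual, cor(x, y, method = "kendall"))
```

Here 8 of the 10 pairs are concordant and 2 are discordant, giving \(\tau = 0.6\).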

cor(x, y, method = "pearson") # default

cor(x, y, method = "spearman")

cor(x, y, method = "kendall")
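One way to see how the three methods differ is a relationship that is monotone but not linear. In the sketch below, the choice of `y = exp(x)` is an illustrative assumption: the rank-based measures report perfect association, while Pearson's correlation does not.

```r
x <- 1:10
y <- exp(x)   # strictly increasing, but far from a straight line

pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")
kendall  <- cor(x, y, method = "kendall")

c(pearson = pearson, spearman = spearman, kendall = kendall)
```

Spearman's and Kendall's coefficients both equal 1 because every pair is ordered the same way in `x` and `y`; Pearson's is below 1 because it measures linear association only.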

The conditional expectation function (CEF) \(E[Y|X]\) minimizes the MSE of predicting \(Y\) from \(X\)
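A rough numerical illustration, under assumed simulated data: with a discrete \(X\), the CEF can be estimated by the average of \(Y\) within each value of \(X\), and its MSE compared with that of a straight-line fit. The data-generating process (\(Y = X^2\) plus noise) is hypothetical and chosen so that the CEF is not linear.

```r
set.seed(4)
n <- 5000
x <- sample(1:5, n, replace = TRUE)
y <- x^2 + rnorm(n)             # assumed process: E[Y | X = x] = x^2

cef_hat <- ave(y, x)            # mean of y within each value of x
lin_hat <- fitted(lm(y ~ x))    # least-squares straight-line fit

mse_cef <- mean((y - cef_hat)^2)
mse_lin <- mean((y - lin_hat)^2)

c(cef = mse_cef, linear = mse_lin)
```

The group-mean predictor has the smaller MSE here, because a straight line cannot follow the curvature of the CEF.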

If we restrict ourselves to a linear functional form \(Y = a + bX\), then the following minimize MSE of \(Y\) given \(X\):