Discrete \(X\) with pmf \(f\):
\[ E[X] = \sum_x x f(x) \]
Continuous \(X\) with pdf \(f\):
\[ E[X] = \int_{-\infty}^{+\infty} x f(x)\,dx \]
Let \(X\) and \(Y\) be random variables. Then, for all \(a, b, c \in \mathbb{R}\):
\[ E[aX + bY + c] = a E[X] + bE[Y] + c \]
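A minimal numerical check of the linearity rule, sketched with a small made-up joint pmf. The values in `joint_pmf` and the constants `a`, `b`, `c` are assumptions chosen for illustration, and exact fractions are used to avoid rounding noise. \(X\) and \(Y\) are dependent in this example, which does not matter for linearity.

```python
from fractions import Fraction as F

# Hypothetical joint pmf f(x, y); X and Y are deliberately dependent.
joint_pmf = {
    (0, 1): F(1, 4),
    (1, 1): F(1, 4),
    (1, 3): F(1, 2),
}

def expectation(g, pmf):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

a, b, c = 2, -3, 5  # arbitrary real constants

# Left-hand side: expectation of the linear combination itself.
lhs = expectation(lambda x, y: a * x + b * y + c, joint_pmf)

# Right-hand side: the same combination of the individual expectations.
rhs = (a * expectation(lambda x, y: x, joint_pmf)
       + b * expectation(lambda x, y: y, joint_pmf)
       + c)

print(lhs, rhs)  # both sides equal 1/2
```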
Variance: \(V[X] = E[(X - E[X])^2]\)
Alternative: \(V[X] = E[X^2] - (E[X])^2\)
Standard deviation: \(\sigma[X] = \sqrt{V[X]}\)
\[ V[aX] = a^2 V[X] \]
\[ \sigma[aX] = |a| \sigma[X] \]
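The variance formulas above can be checked on a small example. This is only a sketch: the pmf and the constant `a` are made-up values, and the helper names `expectation` and `variance` are hypothetical.

```python
from fractions import Fraction as F
from math import isclose, sqrt

pmf = {1: F(1, 2), 2: F(1, 4), 4: F(1, 4)}  # hypothetical pmf f(x)

def expectation(g, pmf):
    """E[g(X)] = sum over x of g(x) * f(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """V[X] = E[(X - E[X])^2], straight from the definition."""
    mu = expectation(lambda x: x, pmf)
    return expectation(lambda x: (x - mu) ** 2, pmf)

var_def = variance(pmf)
# Alternative formula: E[X^2] - (E[X])^2
var_alt = expectation(lambda x: x ** 2, pmf) - expectation(lambda x: x, pmf) ** 2

a = -3
scaled_pmf = {a * x: p for x, p in pmf.items()}  # pmf of aX
var_scaled = variance(scaled_pmf)

sd = sqrt(var_def)
sd_scaled = sqrt(var_scaled)

print(var_def, var_alt)   # 3/2 3/2
print(var_scaled)         # 27/2, i.e. a^2 * V[X]
print(isclose(sd_scaled, abs(a) * sd))  # True
```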
Mean squared error (MSE) of \(X\) about a constant \(c\):
\[ \text{MSE} = E[(X-c)^2] \]
\[ E[(X-c)^2] = V[X] + (E[X]-c)^2 \]
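A short sketch of the decomposition, again on a made-up pmf with a hypothetical constant `c`. Because the second term \((E[X]-c)^2\) is zero only at \(c = E[X]\), the mean is the constant that minimizes MSE; the grid search at the end illustrates this on a finite set of candidates and is not a proof.

```python
from fractions import Fraction as F

pmf = {1: F(1, 2), 2: F(1, 4), 4: F(1, 4)}  # hypothetical pmf f(x)

def expectation(g, pmf):
    """E[g(X)] = sum over x of g(x) * f(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def mse(c, pmf):
    """E[(X - c)^2] for a constant c."""
    return expectation(lambda x: (x - c) ** 2, pmf)

mean = expectation(lambda x: x, pmf)
var = mse(mean, pmf)  # V[X] is the MSE about the mean

c = 3
direct = mse(c, pmf)                 # left-hand side
decomposed = var + (mean - c) ** 2   # right-hand side
print(direct, decomposed)            # 5/2 5/2

# Candidate constants 0, 1/4, ..., 5; the minimizer on this grid is E[X] = 2.
candidates = [F(k, 4) for k in range(21)]
best_c = min(candidates, key=lambda cand: mse(cand, pmf))
print(best_c)  # 2
```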
Covariance:
\[ \text{Cov}[X,Y] = E[(X-E[X])(Y-E[Y])] \]
Correlation:
\[ \rho[X,Y] = \frac{\text{Cov}[X,Y]}{\sigma[X] \sigma[Y]} \]
Variance rule: \(V[X+Y] = V[X] + \color{purple}{2\text{Cov}[X,Y]} + V[Y]\)
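The covariance, correlation, and variance rule can be verified together on one small joint distribution. The pmf below is a made-up example and the helper names are hypothetical; the sketch uses \(V[X] = \text{Cov}[X,X]\) to reuse a single function.

```python
from fractions import Fraction as F
from math import sqrt

# Hypothetical joint pmf f(x, y) with dependent X and Y.
joint_pmf = {(0, 1): F(1, 4), (1, 1): F(1, 4), (1, 3): F(1, 2)}

def expectation(g, pmf):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def cov(g, h, pmf):
    """Cov[g, h] = E[(g - E[g]) * (h - E[h])]."""
    mg, mh = expectation(g, pmf), expectation(h, pmf)
    return expectation(lambda x, y: (g(x, y) - mg) * (h(x, y) - mh), pmf)

def get_x(x, y):
    return x

def get_y(x, y):
    return y

def get_sum(x, y):
    return x + y

var_x = cov(get_x, get_x, joint_pmf)        # V[X] = Cov[X, X]
var_y = cov(get_y, get_y, joint_pmf)
cov_xy = cov(get_x, get_y, joint_pmf)
var_sum = cov(get_sum, get_sum, joint_pmf)  # V[X + Y] computed directly

rho = cov_xy / sqrt(var_x * var_y)          # Pearson correlation

print(var_sum, var_x + 2 * cov_xy + var_y)  # 27/16 27/16
print(round(rho, 4))                        # 0.5774
```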
What does it mean for \(X\) and \(Y\) to be independent?
Knowing the outcome of one random variable provides no information about the probability of any outcome for the other.
\(\rho [X,Y] = 0\) \(\not \Rightarrow\) independence
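A standard counterexample, sketched under the assumption that \(X\) is uniform on \(\{-1, 0, 1\}\) and \(Y = X^2\). Here \(Y\) is completely determined by \(X\), yet the covariance (and hence \(\rho\)) is zero, because the dependence is not linear.

```python
from fractions import Fraction as F

# Assumed setup: X uniform on {-1, 0, 1}, Y = X^2.
joint_pmf = {(x, x * x): F(1, 3) for x in (-1, 0, 1)}

def expectation(g, pmf):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mean_x = expectation(lambda x, y: x, joint_pmf)
mean_y = expectation(lambda x, y: y, joint_pmf)
cov_xy = expectation(lambda x, y: (x - mean_x) * (y - mean_y), joint_pmf)
print(cov_xy)  # 0, so the correlation is 0 as well

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0 = sum(p for (x, y), p in joint_pmf.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint_pmf.items() if y == 0)
p_joint = joint_pmf.get((0, 0), F(0))
print(p_joint, p_x0 * p_y0)  # 1/3 versus 1/9: not independent
```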
\(\rho\) is Pearson’s correlation
\[ \rho[X,Y] = \frac{\text{Cov}[X,Y]}{\sigma[X] \sigma[Y]} \]
Spearman’s (Pearson’s correlation applied to the ranks \(R[X]\) and \(R[Y]\)):
\[ r_s = \rho[R[X], R[Y]] = \frac{\text{Cov}[R[X],R[Y]]}{\sigma[R[X]] \sigma[R[Y]]} \]
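A minimal sketch of Spearman's coefficient as Pearson's correlation of ranks, using only the standard library. The data are made up, the helper names are hypothetical, and the simple `ranks` function assumes there are no ties (tied values would need averaged ranks, which this sketch does not handle). With \(y = x^3\) the relationship is monotone but not linear, so Spearman's coefficient is 1 while Pearson's is below 1.

```python
from math import isclose, sqrt

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def pearson(xs, ys):
    """Sample Pearson correlation; the 1/n factors cancel in the ratio."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman(xs, ys):
    """Pearson correlation of the ranks R[X] and R[Y]."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotone, nonlinear in x

r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
print(round(r_pearson, 3))       # 0.943
print(isclose(r_spearman, 1.0))  # True
```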
Kendall’s
\[ \tau = \frac{(\text{# concordant pairs}) - (\text{# discordant pairs})} {(\text{# total pairs})} \]
A pair \((i, j)\) is concordant if \(x_i > x_j\) and \(y_i > y_j\), OR \(x_i < x_j\) and \(y_i < y_j\).
It is discordant otherwise (assuming no ties).
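The counting definition translates directly into a loop over all pairs. This is a sketch on made-up data with no ties; following the definition above, any pair that is not concordant is counted as discordant, so tied data would need separate handling. The function name `kendall_tau` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def kendall_tau(xs, ys):
    """(# concordant - # discordant) / (# total pairs). Assumes no ties."""
    concordant = discordant = 0
    for (x_i, y_i), (x_j, y_j) in combinations(zip(xs, ys), 2):
        if (x_i > x_j and y_i > y_j) or (x_i < x_j and y_i < y_j):
            concordant += 1
        else:
            discordant += 1
    total = concordant + discordant  # n * (n - 1) / 2 pairs
    return Fraction(concordant - discordant, total)

xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # only the pair (2, 3) vs (3, 2) is out of order

tau = kendall_tau(xs, ys)
print(tau)  # 2/3: 5 concordant pairs, 1 discordant, 6 total
```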
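The conditional expectation function (CEF) \(E[Y|X]\) minimizes the MSE of \(Y\) given \(X\): among all functions \(g\), \(E[(Y - g(X))^2]\) is smallest when \(g(X) = E[Y|X]\).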
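A small sketch comparing the CEF against two other predictors of \(Y\). The joint pmf is an assumption made for illustration: \(X\) is uniform on \(\{-1, 0, 1\}\) and \(Y = X^2 + e\) with \(e = \pm 1\) equally likely, so the CEF is \(E[Y|X] = X^2\). The comparison covers only the predictors listed, so it illustrates the result rather than proving it.

```python
from fractions import Fraction as F

# Assumed joint pmf: X uniform on {-1, 0, 1}, Y = X^2 + e, e = +/-1 with prob 1/2.
joint_pmf = {(x, x * x + e): F(1, 6) for x in (-1, 0, 1) for e in (-1, 1)}

def expectation(g, pmf):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def cef(x0, pmf):
    """E[Y | X = x0] = sum_y y * f(x0, y) / P(X = x0)."""
    p_x = sum(p for (x, _), p in pmf.items() if x == x0)
    return sum(y * p for (x, y), p in pmf.items() if x == x0) / p_x

def mse(g, pmf):
    """E[(Y - g(X))^2] for a predictor g."""
    return expectation(lambda x, y: (y - g(x)) ** 2, pmf)

mean_y = expectation(lambda x, y: y, joint_pmf)

mse_cef = mse(lambda x: cef(x, joint_pmf), joint_pmf)  # predictor E[Y|X]
mse_const = mse(lambda x: mean_y, joint_pmf)           # constant predictor E[Y]
mse_line = mse(lambda x: x, joint_pmf)                 # arbitrary predictor g(x) = x

print(mse_cef, mse_const, mse_line)  # 1 11/9 7/3
```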
If we restrict ourselves to a linear functional form \(Y = a + bX\), then the following minimize MSE of \(Y\) given \(X\):