POLI_SCI 403: Probability and Statistics
Why do we need probability?
Probability theory \(\Rightarrow\) random variables
Peek at Lab 1
You are not allowed to conduct statistical inference unless you are willing to entertain uncertainty in data generation processes
Probability is the language of uncertainty (random events)
But nothing is actually random
Probability theory is a mathematical construct (that supports other, more important, equally shaky mathematical constructs)
And yet, any numerical probability… is not an objective property of the world, but a construction based on personal or collective judgements and (often doubtful) assumptions. Furthermore, in most circumstances, it is not even estimating some underlying ‘true’ quantity. Probability, indeed, can only rarely be said to ‘exist’ at all.
“…it handles both chance and ignorance.”
“…any practical use of probability involves subjective judgements.”
Events are uncertain only because we cannot measure with arbitrary precision
What is the probability of landing heads when flipping an unbiased coin?
What is the probability of rolling a given face with a fair die?
What about a biased coin? An unfair die?
Why do we need such language?
We are making the problem tractable so that the answer can be something other than “I don’t know”
What is the chance that an earthquake of magnitude 6.7 or greater will occur before the year 2023?
How do we make this tractable?
Symmetry: If outcomes are judged equally likely, then each must have equal probability
Frequentist: Relative frequency with which the event occurs in repeated trials under the same conditions
Bayesian: Probability as degree of belief (0: impossible; 1: sure to happen)
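The frequentist reading above lends itself to simulation. A minimal sketch in Python (the fair coin and the seed are illustrative assumptions, not part of the lecture):

```python
import random

random.seed(403)  # arbitrary seed, for reproducibility only

# Relative frequency of heads across repeated flips of a fair coin.
# Under the frequentist interpretation, this should stabilize near 0.5.
for n in (10, 100, 1_000, 10_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>7} flips: relative frequency of heads = {heads / n:.4f}")
```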
Bottom line: Probability does not always make sense
Symmetry makes sense when thinking about quasi-experiments
Frequentism makes sense for weather forecasts and is the basis for random sampling and assignment
Bayes makes a lot of sense for latent variables (e.g. ideal point estimation \(\rightarrow\) item-response theory)
Everything that follows from today is fake
Or, rather, it is held together by a series of heroic, implausible assumptions
And I find that both beautiful AND stupid
But, more importantly, I want you to remember that we use statistical models not because they are true, but because they are useful
So the choice of the appropriate method will always be subjective
\[ \Omega = \{1, 2, 3, 4, 5, 6\} \]
\[ \Omega = \{H,T\} \]
\(\omega \in \Omega\): sample points
\(\Omega\): sample space
\(S \subseteq 2^\Omega\): event space (a collection of subsets of \(\Omega\))
\[ A = \{\omega \in \Omega \colon \omega \text{ is even}\} = \{2,4,6\} \]
\(S\) is an event space if
\(\Omega \in S\)
if \(A \in S\), then \(A^c \in S\) (closed under complements)
if \(A_1, A_2, \ldots \in S\), then \(\bigcup_i A_i \in S\) (closed under countable unions)
\(P:S \rightarrow \mathbb{R}\): probability measure
These are the basic components required to describe a random generative process
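As a concrete sketch of these three components, here is the fair-die example in Python, with the uniform measure standing in for the symmetry judgement (the helper `P` is hypothetical, not a library function):

```python
from fractions import Fraction

# Sample space of a fair die
omega = {1, 2, 3, 4, 5, 6}

# An event: A = {ω ∈ Ω : ω is even}
A = {w for w in omega if w % 2 == 0}  # {2, 4, 6}

# Probability measure under the symmetry interpretation:
# each sample point gets equal weight 1/|Ω|
def P(event):
    return Fraction(len(event & omega), len(omega))

print(P(A))      # 1/2
print(P(omega))  # 1 (the whole sample space is certain)
print(P(set()))  # 0 (the empty event is impossible)
```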
Events \(A\) and \(B\) are independent iff
\[ P(A \cap B) = P(A) P(B) \]
Applying the definition of conditional probability (the multiplicative law), \(P(A \cap B) = P(A|B)P(B)\):
\(A\) and \(B\) are independent iff
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A) \]
(provided \(P(B) > 0\))
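A worked example with a fair die: let \(A = \{2,4,6\}\) (an even roll) and \(B = \{1,2,3,4\}\) (a roll of at most 4). Then

\[ P(A \cap B) = P(\{2,4\}) = \frac{1}{3} = \frac{1}{2} \cdot \frac{2}{3} = P(A)P(B), \]

so \(A\) and \(B\) are independent even though they share outcomes.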
We now have a language to talk about random generative processes
The next step is to describe or characterize these processes
\(X\) is a random variable.
A random variable is a function such that \(X:\Omega \rightarrow \mathbb{R}\)
A mapping of possible states of the world into a real number
Neither random nor a variable
Informally: A variable that takes a value determined by a random generative process
Except that \(X\) itself never takes an actual value; only its realizations do
But we can also apply functions to them (e.g. \(g(X) = X^2\))
A (well-behaved) function of a random variable is also a random variable
We can also operate on them (e.g. \(E[X]\) or \(Pr[X = 1]\))
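A sketch of this idea, reusing the fair-die setup from above: a random variable is literally a function on \(\Omega\), and operating on it is just bookkeeping over sample points (the helper `E` is hypothetical):

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]

# X: Ω → R. Here X is the identity map (the face value itself).
def X(w):
    return w

# g(X) = X² is also a random variable: just another function on Ω
def g_of_X(w):
    return X(w) ** 2

# E[X] = Σ_ω X(ω) · P({ω}) under the uniform (symmetry) measure
def E(rv):
    return sum(Fraction(rv(w), len(omega)) for w in omega)

print(E(X))       # 7/2
print(E(g_of_X))  # 91/6
```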
Because we can “do” things to them, we can also describe them
Discrete
Continuous
\(X\) is discrete if its range \(X(\Omega)\) is a countable set
We can fully characterize the distribution of \(X\) with a Probability Mass Function (PMF)
\[ f(x) = Pr[X = x] \]
This is useful because it lets us talk about cases more general than a specific coin or die
\[ f(x) = \begin{cases} 1-p & \colon & x = 0 \\ p & \colon & x = 1 \\ 0 & \colon & \text{otherwise} \end{cases} \]
Where \(p = Pr[X = 1]\) is the expected proportion of tails
This is a Bernoulli distribution
Which is a special case of the binomial distribution (with \(n = 1\) trial)
\[ F(x) = Pr[X \leq x] \]
Returns the probability of \(X\) being less than or equal to \(x\)
This is a more general (and more informative) way to describe a random variable
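A sketch tying the Bernoulli PMF and CDF together via `scipy.stats.bernoulli` (the value \(p = 0.3\) is purely illustrative):

```python
from scipy.stats import bernoulli

p = 0.3  # illustrative value of Pr[X = 1]

# PMF: f(0) = 1 - p, f(1) = p, 0 otherwise
print(bernoulli.pmf([0, 1, 2], p))        # [0.7 0.3 0. ]

# CDF: F(x) = Pr[X <= x], a step function for a discrete RV
print(bernoulli.cdf([-1, 0, 0.5, 1], p))  # [0.  0.7 0.7 1. ]
```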
Informally, a random variable is continuous if its range can be measured to an arbitrary degree of precision
Because working with continuous functions is messier, the definition is indirect
\(X\) is a continuous random variable if there exists a non-negative function \(f: \mathbb{R} \to \mathbb{R}\) such that the CDF of X is an integral
\[ F(x) = Pr[X \leq x] = \int_{-\infty}^x f(u)du \]
Meaning we can only tell that a random variable is continuous by checking that its CDF can be written as such an integral
This turns out to be convenient, because the Probability Density Function of a continuous random variable is a derivative
\[ f(x) = \frac{dF(u)}{du} \bigg\rvert_{u=x} \]
So the PDF gives the slope (rate of change) of the CDF at \(x\)
Unlike the PMF of a discrete RV, which gives the probability exactly at \(x\)
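A quick numerical check of the derivative relationship, using a standard normal purely for illustration: a centered finite difference of the CDF should approximate the PDF.

```python
from scipy.stats import norm

x, h = 1.0, 1e-6  # evaluation point and step size (both illustrative)

# f(x) ≈ [F(x + h) - F(x - h)] / (2h): the slope of the CDF at x
finite_diff = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)

print(finite_diff)  # ≈ 0.241971
print(norm.pdf(x))  # 0.24197072451914337
```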
Random variables as useful placeholders to think about properties of data without looking at data
Everything applies to bivariate and multivariate distributions; the math is just messier
Next week: Summarizing random variables to identify ideal quantities that we will try to approximate with data