SLIDE 1
Probability basics
DS GA 1002 Statistical and Mathematical Models
http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall16
Carlos Fernandez-Granda
Probability spaces
Conditional probability
Independence
SLIDE 2
SLIDE 3
General approach
Probabilistic modeling
1. Model the phenomenon of interest as an experiment with several (possibly infinite) mutually exclusive outcomes
2. Group these outcomes into sets called events
3. Assign probabilities to the different events
SLIDE 4
Probability space
A probability space is a triple $(\Omega, \mathcal{F}, P)$ consisting of
◮ A sample space $\Omega$, which contains all possible outcomes of the experiment
◮ A set of events $\mathcal{F}$, which must be a σ-algebra
◮ A probability measure $P$ that assigns probabilities to the events in $\mathcal{F}$
SLIDE 5
Sample space
Sample spaces can be
◮ Discrete: coin toss, score of a basketball game, number of people that show up at a party, ...
◮ Continuous: intervals of $\mathbb{R}$ or $\mathbb{R}^n$ used to model time, position, temperature, ...
SLIDE 6
σ-algebra
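A σ-algebra $\mathcal{F}$ is a collection of sets in $\Omega$ such that
1. If a set $S \in \mathcal{F}$ then $S^c \in \mathcal{F}$
2. If the sets $S_1, S_2 \in \mathcal{F}$, then $S_1 \cup S_2 \in \mathcal{F}$
   This also holds for infinite sequences: if $S_1, S_2, \ldots \in \mathcal{F}$ then $\cup_{i=1}^{\infty} S_i \in \mathcal{F}$
3. $\Omega \in \mathcal{F}$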
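The three conditions above can be checked mechanically when the sample space is finite. The sketch below is only an illustration: the function name `is_sigma_algebra` and the three-outcome sample space are assumptions made for the example, and the pairwise-union check is sufficient only because the collection is finite.

```python
from itertools import combinations


def is_sigma_algebra(omega, collection):
    """Check the sigma-algebra conditions for a finite collection of subsets of omega."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    # Condition 3: the whole sample space must be an event.
    if omega not in sets:
        return False
    # Condition 1: closed under complements.
    if any(omega - s not in sets for s in sets):
        return False
    # Condition 2: closed under unions (pairwise is enough for a finite collection).
    return all(a | b in sets for a, b in combinations(sets, 2))


omega = {"a", "b", "c"}  # hypothetical sample space
coarse = [set(), {"a"}, {"b", "c"}, omega]
broken = [set(), {"a"}, omega]  # missing the complement of {"a"}

print(is_sigma_algebra(omega, coarse))  # True
print(is_sigma_algebra(omega, broken))  # False
```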
SLIDE 7
Basketball game
◮ Cleveland Cavaliers are playing the Golden State Warriors
◮ Sample space
  $\Omega := \{\text{Cavs 1 - Warriors 0}, \text{Cavs 0 - Warriors 1}, \ldots, \text{Cavs 101 - Warriors 97}, \ldots\}$
◮ Several possible σ-algebras
  ◮ If we want high granularity we can choose the power set of scores
  ◮ If we only care who wins
    $\mathcal{F} := \{\text{Cavs win}, \text{Warriors win}, \text{Cavs or Warriors win}, \emptyset\}$
SLIDE 8
Probability measure
Function over the sets in $\mathcal{F}$ such that
1. $P(S) \geq 0$ for any event $S \in \mathcal{F}$
2. If $S_1, S_2 \in \mathcal{F}$ are disjoint then
   $P(S_1 \cup S_2) = P(S_1) + P(S_2)$
   Also countably infinite sequences of disjoint sets $S_1, S_2, \ldots \in \mathcal{F}$:
   $P\left(\lim_{n \to \infty} \cup_{i=1}^{n} S_i\right) = \lim_{n \to \infty} \sum_{i=1}^{n} P(S_i)$
3. $P(\Omega) = 1$
SLIDE 9
Properties of a probability measure
◮ $P(\emptyset) = 0$
◮ If $A \subseteq B$ then $P(A) \leq P(B)$
◮ $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
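These properties can be verified numerically on a small discrete space. The following is a minimal sketch, assuming a hypothetical three-outcome sample space whose outcome weights (1/2, 1/3, 1/6) were chosen only for illustration; `prob` sums the weights of the outcomes in an event.

```python
from fractions import Fraction

# Hypothetical outcome weights; they are nonnegative and sum to 1.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}


def prob(event):
    """P(event) as the sum of the weights of the outcomes it contains."""
    return sum((weights[o] for o in event), Fraction(0))


A = {"a", "b"}
B = {"b", "c"}

print(prob(set()))                                      # 0
print(prob({"a"}) <= prob(A))                           # True, since {"a"} is a subset of A
print(prob(A | B) == prob(A) + prob(B) - prob(A & B))   # True
```

Exact fractions are used instead of floats so that the equalities hold without rounding tolerance.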
SLIDE 10
Important
◮ A probability measure only assigns probabilities to events in the σ-algebra
◮ Simpler σ-algebras can make our life easy
$P(\text{Cavs win}) = \frac{1}{2}$
$P(\text{Warriors win}) = \frac{1}{2}$
$P(\text{Cavs or Warriors win}) = 1$
$P(\emptyset) = 0$
SLIDE 11
Probability spaces Conditional probability Independence
SLIDE 12
Definition
The conditional probability of an event $S' \in \mathcal{F}$ given $S$ is
$P(S' \mid S) := \frac{P(S' \cap S)}{P(S)}$
$P(\cdot \mid S)$ is a valid probability measure
SLIDE 13
Example: Flights and rain
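Probabilistic model for late arrivals at an airport
$\Omega = \{\text{late and rain}, \text{late and no rain}, \text{on time and rain}, \text{on time and no rain}\}$
$\mathcal{F}$ = power set of $\Omega$
$P(\text{late, no rain}) = \frac{2}{20}$
$P(\text{on time, no rain}) = \frac{14}{20}$
$P(\text{late, rain}) = \frac{3}{20}$
$P(\text{on time, rain}) = \frac{1}{20}$
$P(\text{late} \mid \text{rain})$ ?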
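One way to answer the question is to apply the definition of conditional probability directly to the four outcome probabilities. The sketch below does that; the dictionary layout and the helper `prob` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# Outcome probabilities from the flights-and-rain model.
joint = {
    ("late", "rain"): Fraction(3, 20),
    ("late", "no rain"): Fraction(2, 20),
    ("on time", "rain"): Fraction(1, 20),
    ("on time", "no rain"): Fraction(14, 20),
}


def prob(predicate):
    """Probability of the event made of all outcomes satisfying predicate."""
    return sum((p for outcome, p in joint.items() if predicate(outcome)), Fraction(0))


p_rain = prob(lambda o: o[1] == "rain")
p_late_and_rain = prob(lambda o: o == ("late", "rain"))

# Definition: P(late | rain) = P(late and rain) / P(rain)
p_late_given_rain = p_late_and_rain / p_rain
print(p_late_given_rain)  # 3/4
```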
SLIDE 14
Chain rule
For any pair of events $A$ and $B$
$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$
For any sequence of events $S_1, S_2, S_3, \ldots$
$P(\cap_i S_i) = P(S_1) P(S_2 \mid S_1) P(S_3 \mid S_1 \cap S_2) \cdots = \prod_i P\left(S_i \mid \cap_{j=1}^{i-1} S_j\right)$
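The chain rule can be checked on a toy example. The sketch below assumes a fair six-sided die and three arbitrarily chosen events, none of which come from the slides; it multiplies the successive conditional probabilities and compares the product with the probability of the intersection.

```python
from fractions import Fraction

omega = set(range(1, 7))  # fair six-sided die (illustrative assumption)


def prob(event):
    """Uniform probability of an event on the die."""
    return Fraction(len(event & omega), len(omega))


def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(event & given) / prob(given)


# Hypothetical events: even, greater than 2, at most 5.
events = [{2, 4, 6}, {3, 4, 5, 6}, {1, 2, 3, 4, 5}]

product = Fraction(1)
seen = set(omega)  # intersection of the events processed so far (starts as the whole space)
for s in events:
    product *= cond(s, seen)  # P(S_i | S_1 ∩ ... ∩ S_{i-1})
    seen &= s

print(product, prob(seen))  # 1/6 1/6
```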
SLIDE 15
Law of Total Probability
If $A_1, A_2, \ldots \in \mathcal{F}$ is a partition of $\Omega$
◮ $A_i$ and $A_j$ are disjoint if $i \neq j$
◮ $\Omega = \cup_i A_i$
For any set $S \in \mathcal{F}$
$P(S) = \sum_i P(S \cap A_i) = \sum_i P(A_i) P(S \mid A_i)$
SLIDE 16
Example: Flights and rain (continued)
$P(\text{rain}) = 0.2$
$P(\text{late} \mid \text{rain}) = 0.75$
$P(\text{late} \mid \text{no rain}) = 0.125$
$P(\text{late})$ ?
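Since rain and no rain form a partition of the sample space, the law of total probability gives $P(\text{late})$ directly. A minimal sketch of the arithmetic, with variable names chosen for illustration:

```python
# Partition of the sample space: rain / no rain.
p_weather = {"rain": 0.2, "no rain": 1 - 0.2}
p_late_given = {"rain": 0.75, "no rain": 0.125}

# P(late) = sum over the partition of P(A_i) * P(late | A_i)
p_late = sum(p_weather[w] * p_late_given[w] for w in p_weather)
print(round(p_late, 10))  # 0.25
```

This agrees with the joint model from the earlier flights example, where $P(\text{late}) = \frac{3}{20} + \frac{2}{20}$.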
SLIDE 17
Important!
$P(A \mid B) \neq P(B \mid A)$ in general
SLIDE 18
Bayes’ Rule
Let $A_1, A_2, \ldots \in \mathcal{F}$ be a partition of $\Omega$
For any set $S \in \mathcal{F}$
$P(A_i \mid S) = \frac{P(A_i) P(S \mid A_i)}{\sum_j P(S \mid A_j) P(A_j)}$
SLIDE 19
Example: Flights and rain (continued)
$P(\text{rain}) = 0.2$
$P(\text{late} \mid \text{rain}) = 0.75$
$P(\text{late} \mid \text{no rain}) = 0.125$
$P(\text{rain} \mid \text{late})$ ?
SLIDE 23
Example: Flights and rain (continued)
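$P(\text{rain} \mid \text{late}) = \frac{P(\text{rain, late})}{P(\text{late})}$
$= \frac{P(\text{late} \mid \text{rain}) P(\text{rain})}{P(\text{late} \mid \text{rain}) P(\text{rain}) + P(\text{late} \mid \text{no rain}) P(\text{no rain})}$
$= \frac{0.75 \cdot 0.2}{0.75 \cdot 0.2 + 0.125 \cdot 0.8} = 0.6$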
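The same computation can be written as a small reusable function. This is a sketch, not a library API: `bayes_posterior` is a hypothetical helper that takes the prior of each set in the partition and the likelihood of the observed event under each set.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(A_i | S) for every A_i in a partition, given P(A_i) and P(S | A_i)."""
    # Denominator: law of total probability, P(S) = sum_j P(S | A_j) P(A_j)
    evidence = sum(priors[a] * likelihoods[a] for a in priors)
    return {a: priors[a] * likelihoods[a] / evidence for a in priors}


posterior = bayes_posterior(
    priors={"rain": 0.2, "no rain": 0.8},
    likelihoods={"rain": 0.75, "no rain": 0.125},  # P(late | weather)
)
print(round(posterior["rain"], 10))  # 0.6
```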
SLIDE 24
Probability spaces Conditional probability Independence
SLIDE 25
Definition
Two sets $A$, $B$ are independent if
$P(A \mid B) = P(A)$
or equivalently
$P(A \cap B) = P(A) P(B)$
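As an illustration of the product criterion, the sketch below tests whether late and rain are independent in the flights-and-rain model from the earlier example. The outcome probabilities are repeated here so the snippet is self-contained; the variable names are illustrative.

```python
from fractions import Fraction

# Outcome probabilities from the flights-and-rain model.
joint = {
    ("late", "rain"): Fraction(3, 20),
    ("late", "no rain"): Fraction(2, 20),
    ("on time", "rain"): Fraction(1, 20),
    ("on time", "no rain"): Fraction(14, 20),
}

p_late = sum(p for (arrival, _), p in joint.items() if arrival == "late")
p_rain = sum(p for (_, weather), p in joint.items() if weather == "rain")
p_late_and_rain = joint[("late", "rain")]

# Independence would require P(late and rain) == P(late) * P(rain).
print(p_late_and_rain, p_late * p_rain)     # 3/20 1/20
print(p_late_and_rain == p_late * p_rain)   # False
```

Under this model the two events are therefore not independent: rain makes a late arrival more likely.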
SLIDE 26
Conditional independence
$A$, $B$ are conditionally independent given $C$ if
$P(A \mid B, C) = P(A \mid C)$
where $P(A \mid B, C) := P(A \mid B \cap C)$, or equivalently
$P(A \cap B \mid C) = P(A \mid C) P(B \mid C)$
SLIDE 27
Conditional independence does not imply independence
Probabilistic model for taxi availability, flight delay and weather
$P(\text{rain}) = 0.2$
$P(\text{late} \mid \text{rain}) = 0.75$
$P(\text{late} \mid \text{no rain}) = 0.125$
$P(\text{taxi} \mid \text{rain}) = 0.1$
$P(\text{taxi} \mid \text{no rain}) = 0.6$
Given rain and no rain, late and taxi are conditionally independent
Are they also independent? $P(\text{taxi}) = P(\text{taxi} \mid \text{late})$ ?
SLIDE 37
Conditional independence does not imply independence
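$P(\text{taxi}) = P(\text{taxi} \mid \text{rain}) P(\text{rain}) + P(\text{taxi} \mid \text{no rain}) P(\text{no rain})$
$= 0.1 \cdot 0.2 + 0.6 \cdot 0.8 = 0.5$
$P(\text{taxi} \mid \text{late}) = \frac{P(\text{taxi, late})}{P(\text{late})}$
$= \frac{P(\text{taxi, late, rain}) + P(\text{taxi, late, no rain})}{P(\text{late})}$
$= \frac{P(t \mid l, r) P(l \mid r) P(r) + P(t \mid l, \text{no } r) P(l \mid \text{no } r) P(\text{no } r)}{P(l)}$
$= \frac{P(\text{taxi} \mid r) P(\text{late} \mid r) P(r) + P(\text{taxi} \mid \text{no } r) P(\text{late} \mid \text{no } r) P(\text{no } r)}{P(\text{late})}$
$= \frac{0.1 \cdot 0.75 \cdot 0.2 + 0.6 \cdot 0.125 \cdot 0.8}{0.25} = 0.3$
They are not independent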
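The comparison can be reproduced by building the full joint distribution over (rain, late, taxi) under the conditional independence assumption and then marginalizing. The sketch below is one way to do it; the helper names `bernoulli` and `prob` are illustrative.

```python
from itertools import product

p_rain = {True: 0.2, False: 0.8}
p_late_given_rain = {True: 0.75, False: 0.125}  # keyed by whether it rains
p_taxi_given_rain = {True: 0.1, False: 0.6}     # keyed by whether it rains


def bernoulli(p, value):
    """Probability that a binary event with success probability p equals value."""
    return p if value else 1 - p


# Conditional independence given the weather:
# P(rain, late, taxi) = P(rain) P(late | rain) P(taxi | rain)
joint = {
    (rain, late, taxi): p_rain[rain]
    * bernoulli(p_late_given_rain[rain], late)
    * bernoulli(p_taxi_given_rain[rain], taxi)
    for rain, late, taxi in product([True, False], repeat=3)
}


def prob(predicate):
    """Probability of the event made of all outcomes satisfying predicate."""
    return sum(p for outcome, p in joint.items() if predicate(*outcome))


p_taxi = prob(lambda rain, late, taxi: taxi)
p_taxi_given_late = prob(lambda rain, late, taxi: taxi and late) / prob(
    lambda rain, late, taxi: late
)
print(round(p_taxi, 10), round(p_taxi_given_late, 10))  # 0.5 0.3
```

Knowing that the flight is late makes rain more likely, which in turn lowers the chance of finding a taxi, even though late and taxi are independent once the weather is known.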
SLIDE 38
Independence does not imply conditional independence
Probabilistic model for mechanical problems, weather and delays
$P(\text{rain}) = 0.2$
$P(\text{late} \mid \text{rain}) = 0.75$
$P(\text{late} \mid \text{no rain}) = 0.125$
$P(\text{problem}) = 0.1$
$P(\text{late} \mid \text{problem}) = 0.7$
$P(\text{late} \mid \text{no problem}) = 0.2$
$P(\text{late} \mid \text{no rain, problem}) = 0.5$
problem and no rain are independent
Are they also conditionally independent given late?
$P(\text{problem} \mid \text{late, no rain}) = P(\text{problem} \mid \text{late})$ ?
SLIDE 46
Independence does not imply conditional independence
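$P(\text{problem} \mid \text{late}) = \frac{P(\text{late, problem})}{P(\text{late})}$
$= \frac{P(\text{late} \mid p) P(p)}{P(\text{late} \mid p) P(p) + P(\text{late} \mid \text{no } p) P(\text{no } p)}$
$= \frac{0.7 \cdot 0.1}{0.7 \cdot 0.1 + 0.2 \cdot 0.9} = 0.28$
$P(\text{problem} \mid \text{late, no rain}) = \frac{P(\text{late, no rain, problem})}{P(\text{late, no rain})}$
$= \frac{P(\text{late} \mid \text{no rain}, p) P(\text{no rain} \mid p) P(p)}{P(\text{late} \mid \text{no rain}) P(\text{no rain})}$
$= \frac{P(\text{late} \mid \text{no rain, problem}) P(\text{no rain}) P(\text{problem})}{P(\text{late} \mid \text{no rain}) P(\text{no rain})}$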
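Both expressions can be evaluated numerically from the model parameters. The sketch below simply plugs the stated values into the two formulas; the variable names are illustrative, and $P(\text{no rain}) = 0.8$ is derived from $P(\text{rain}) = 0.2$.

```python
p_problem = 0.1
p_no_rain = 1 - 0.2
p_late_given_problem = 0.7
p_late_given_no_problem = 0.2
p_late_given_no_rain = 0.125
p_late_given_no_rain_and_problem = 0.5

# Bayes' rule over the partition {problem, no problem}.
p_problem_given_late = (p_late_given_problem * p_problem) / (
    p_late_given_problem * p_problem
    + p_late_given_no_problem * (1 - p_problem)
)

# Last expression of the derivation: independence of problem and no rain
# replaces P(no rain | problem) with P(no rain).
p_problem_given_late_no_rain = (
    p_late_given_no_rain_and_problem * p_no_rain * p_problem
) / (p_late_given_no_rain * p_no_rain)

print(round(p_problem_given_late, 10))          # 0.28
print(round(p_problem_given_late_no_rain, 10))  # 0.4
```

The two values differ, so under these parameters problem and no rain are not conditionally independent given late.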
SLIDE 47