Probability basics DS GA 1002 Statistical and Mathematical Models - - PowerPoint PPT Presentation

slide-1
SLIDE 1

Probability basics

DS GA 1002 Statistical and Mathematical Models

http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall16 Carlos Fernandez-Granda

slide-2
SLIDE 2

◮ Probability spaces
◮ Conditional probability
◮ Independence

slide-3
SLIDE 3

General approach

Probabilistic modeling

  1. Model phenomenon of interest as an experiment with several (possibly infinite) mutually exclusive outcomes
  2. Group these outcomes in sets called events
  3. Assign probabilities to the different events
slide-4
SLIDE 4

Probability space

A probability space is a triple (Ω, F, P) consisting of

◮ A sample space Ω, which contains all possible outcomes of the experiment
◮ A set of events F, which must be a σ-algebra
◮ A probability measure P that assigns probabilities to the events in F

slide-5
SLIDE 5

Sample space

Sample spaces can be

◮ Discrete: coin toss, score of a basketball game, number of people that show up at a party, . . .
◮ Continuous: intervals of $\mathbb{R}$ or $\mathbb{R}^n$ used to model time, position, temperature, . . .

slide-6
SLIDE 6

σ-algebra

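A σ-algebra $\mathcal{F}$ is a collection of sets in $\Omega$ such that

  1. If a set $S \in \mathcal{F}$ then $S^c \in \mathcal{F}$
  2. If the sets $S_1, S_2 \in \mathcal{F}$, then $S_1 \cup S_2 \in \mathcal{F}$.
     Also infinite sequences: if $S_1, S_2, \ldots \in \mathcal{F}$ then $\cup_{i=1}^{\infty} S_i \in \mathcal{F}$
  3. $\Omega \in \mathcal{F}$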
slide-7
SLIDE 7

Basketball game

◮ Cleveland Cavaliers are playing the Golden State Warriors
◮ Sample space

Ω := {Cavs 1 − Warriors 0, Cavs 0 − Warriors 1, . . . , Cavs 101 − Warriors 97, . . .}

◮ Several possible σ-algebras
  ◮ If we want high granularity we can choose the power set of scores
  ◮ If we only care who wins

F := {Cavs win, Warriors win, Cavs or Warriors win, ∅}
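
As a rough illustration, the coarse collection of events above can be checked against the σ-algebra conditions on a small finite sample space. The truncated score range (0 to 3 points, no ties) and the helper name `is_sigma_algebra` are assumptions made for this sketch only; for a finite collection, closure under pairwise unions is enough.

```python
from itertools import combinations


def is_sigma_algebra(omega, events):
    """Check the sigma-algebra conditions for a finite collection of events."""
    omega = frozenset(omega)
    events = {frozenset(e) for e in events}
    # Condition 3: the whole sample space must be an event
    if omega not in events:
        return False
    # Condition 1: closed under complement
    if any(omega - s not in events for s in events):
        return False
    # Condition 2: closed under union (pairwise suffices for a finite collection)
    return all(a | b in events for a, b in combinations(events, 2))


# Hypothetical truncated sample space: (Cavs points, Warriors points), no ties
omega = {(c, w) for c in range(4) for w in range(4) if c != w}
cavs_win = {s for s in omega if s[0] > s[1]}
warriors_win = omega - cavs_win

coarse = [cavs_win, warriors_win, omega, set()]
missing_complement = [cavs_win, omega, set()]

print(is_sigma_algebra(omega, coarse))              # True
print(is_sigma_algebra(omega, missing_complement))  # False
```

The second collection fails because the complement of "Cavs win" is not included.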

slide-8
SLIDE 8

Probability measure

Function over the sets in $\mathcal{F}$ such that

  1. $P(S) \geq 0$ for any event $S \in \mathcal{F}$
  2. If $S_1, S_2 \in \mathcal{F}$ are disjoint then

$$P(S_1 \cup S_2) = P(S_1) + P(S_2)$$

     Also countably infinite sequences of disjoint sets $S_1, S_2, \ldots \in \mathcal{F}$:

$$P\left(\lim_{n \to \infty} \cup_{i=1}^{n} S_i\right) = \lim_{n \to \infty} \sum_{i=1}^{n} P(S_i)$$

  3. $P(\Omega) = 1$
slide-9
SLIDE 9

Properties of a probability measure

◮ P (∅) = 0
◮ If A ⊆ B then P (A) ≤ P (B)
◮ P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

slide-10
SLIDE 10

Important

◮ Probability measure only assigns probabilities to events in the σ-algebra
◮ Simpler σ-algebras can make our life easy

P (Cavs win) = 1/2
P (Warriors win) = 1/2
P (Cavs or Warriors win) = 1
P (∅) = 0

slide-11
SLIDE 11

◮ Probability spaces
◮ Conditional probability
◮ Independence

slide-12
SLIDE 12

Definition

The conditional probability of an event $S' \in \mathcal{F}$ given $S$ is

$$P(S' \mid S) := \frac{P(S' \cap S)}{P(S)}$$

$P(\cdot \mid S)$ is a valid probability measure

slide-13
SLIDE 13

Example: Flights and rain

Probabilistic model for late arrivals at an airport

Ω = {late and rain, late and no rain, on time and rain, on time and no rain}

F = power set of Ω

P (late, no rain) = 2/20, P (on time, no rain) = 14/20,
P (late, rain) = 3/20, P (on time, rain) = 1/20

P (late|rain) ?
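
One way to answer the question is to apply the definition of conditional probability directly to the joint probabilities of the example. The sketch below does this with exact fractions; the dictionary layout and the helper names `prob` and `cond_prob` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# Joint probabilities from the example, keyed by (arrival, weather)
joint = {
    ("late", "rain"): Fraction(3, 20),
    ("late", "no rain"): Fraction(2, 20),
    ("on time", "rain"): Fraction(1, 20),
    ("on time", "no rain"): Fraction(14, 20),
}


def prob(event):
    """P(event), where event is a predicate over outcomes."""
    return sum(p for outcome, p in joint.items() if event(outcome))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def is_late(outcome):
    return outcome[0] == "late"


def is_rain(outcome):
    return outcome[1] == "rain"


p_late_given_rain = cond_prob(is_late, is_rain)
print(p_late_given_rain)  # 3/4
```

Here P (rain) = 3/20 + 1/20 = 4/20, so P (late|rain) = (3/20) / (4/20) = 0.75.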

slide-14
SLIDE 14

Chain rule

For any pair of events $A$ and $B$

$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$$

For any sequence of events $S_1, S_2, S_3, \ldots$

$$P(\cap_i S_i) = P(S_1)\,P(S_2 \mid S_1)\,P(S_3 \mid S_1 \cap S_2) \cdots = \prod_i P\left(S_i \mid \cap_{j=1}^{i-1} S_j\right)$$

slide-15
SLIDE 15

Law of Total Probability

If $A_1, A_2, \ldots \in \mathcal{F}$ is a partition of $\Omega$

◮ $A_i$ and $A_j$ are disjoint if $i \neq j$
◮ $\Omega = \cup_i A_i$

For any set $S \in \mathcal{F}$

$$P(S) = \sum_i P(S \cap A_i) = \sum_i P(A_i)\,P(S \mid A_i)$$

slide-16
SLIDE 16

Example: Flights and rain (continued)

P (rain) = 0.2
P (late|rain) = 0.75
P (late|no rain) = 0.125

P (late) ?
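
The law of total probability answers this with the partition {rain, no rain}. A minimal sketch using exact fractions follows; the variable names are illustrative.

```python
from fractions import Fraction

p_rain = Fraction(1, 5)  # P(rain) = 0.2
p_weather = {"rain": p_rain, "no rain": 1 - p_rain}
p_late_given = {
    "rain": Fraction(3, 4),     # P(late | rain) = 0.75
    "no rain": Fraction(1, 8),  # P(late | no rain) = 0.125
}

# P(late) = sum over the partition of P(A_i) P(late | A_i)
p_late = sum(p_weather[w] * p_late_given[w] for w in p_weather)
print(float(p_late))  # 0.25
```

That is, P (late) = 0.75 · 0.2 + 0.125 · 0.8 = 0.25.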

slide-17
SLIDE 17

Important!

P (A|B) ≠ P (B|A) in general

slide-18
SLIDE 18

Bayes’ Rule

Let $A_1, A_2, \ldots \in \mathcal{F}$ be a partition of $\Omega$

For any set $S \in \mathcal{F}$

$$P(A_i \mid S) = \frac{P(A_i)\,P(S \mid A_i)}{\sum_j P(S \mid A_j)\,P(A_j)}$$
slide-19
SLIDE 19

Example: Flights and rain (continued)

P (rain) = 0.2
P (late|rain) = 0.75
P (late|no rain) = 0.125

P (rain|late) ?

slide-23
SLIDE 23

Example: Flights and rain (continued)

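$$
\begin{aligned}
P(\text{rain} \mid \text{late}) &= \frac{P(\text{rain}, \text{late})}{P(\text{late})} \\
&= \frac{P(\text{late} \mid \text{rain})\,P(\text{rain})}{P(\text{late} \mid \text{rain})\,P(\text{rain}) + P(\text{late} \mid \text{no rain})\,P(\text{no rain})} \\
&= \frac{0.75 \cdot 0.2}{0.75 \cdot 0.2 + 0.125 \cdot 0.8} = 0.6
\end{aligned}
$$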
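
The same computation can be sketched as a small function that applies Bayes' rule over a partition. The function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Bayes' rule over a partition: P(A_i | S) for every A_i."""
    # Denominator: P(S) by the law of total probability
    evidence = sum(prior[a] * likelihood[a] for a in prior)
    return {a: prior[a] * likelihood[a] / evidence for a in prior}


prior = {"rain": Fraction(1, 5), "no rain": Fraction(4, 5)}
likelihood_late = {"rain": Fraction(3, 4), "no rain": Fraction(1, 8)}

post = posterior(prior, likelihood_late)
print(float(post["rain"]))  # 0.6
```

Note that P (rain|late) = 0.6 differs from P (late|rain) = 0.75.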

slide-24
SLIDE 24

◮ Probability spaces
◮ Conditional probability
◮ Independence

slide-25
SLIDE 25

Definition

Two sets $A$, $B$ are independent if

$$P(A \mid B) = P(A)$$

or equivalently

$$P(A \cap B) = P(A)\,P(B)$$
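
As a quick check of the product form of the definition, the sketch below tests whether late and rain are independent under the joint probabilities of the flights and rain example. The variable names are illustrative.

```python
from fractions import Fraction

# Joint probabilities from the flights and rain example
joint = {
    ("late", "rain"): Fraction(3, 20),
    ("late", "no rain"): Fraction(2, 20),
    ("on time", "rain"): Fraction(1, 20),
    ("on time", "no rain"): Fraction(14, 20),
}

p_late = sum(p for (arrival, _), p in joint.items() if arrival == "late")
p_rain = sum(p for (_, weather), p in joint.items() if weather == "rain")
p_late_and_rain = joint[("late", "rain")]

# Independence would require P(late, rain) = P(late) P(rain)
independent = p_late_and_rain == p_late * p_rain
print(independent)  # False
```

Here P (late) P (rain) = 1/4 · 1/5 = 1/20, while P (late, rain) = 3/20, so the two events are not independent.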

slide-26
SLIDE 26

Conditional independence

$A$, $B$ are conditionally independent given $C$ if

$$P(A \mid B, C) = P(A \mid C)$$

where $P(A \mid B, C) := P(A \mid B \cap C)$, or equivalently

$$P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$$

slide-27
SLIDE 27

Conditional independence does not imply independence

Probabilistic model for taxi availability, flight delay and weather

P (rain) = 0.2
P (late|rain) = 0.75
P (late|no rain) = 0.125
P (taxi|rain) = 0.1
P (taxi|no rain) = 0.6

late and taxi are conditionally independent given rain, and also given no rain

Are they also independent? P (taxi) = P (taxi|late)?

slide-37
SLIDE 37

Conditional independence does not imply independence

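$$
\begin{aligned}
P(\text{taxi}) &= P(\text{taxi} \mid \text{rain})\,P(\text{rain}) + P(\text{taxi} \mid \text{no rain})\,P(\text{no rain}) \\
&= 0.1 \cdot 0.2 + 0.6 \cdot 0.8 = 0.5
\end{aligned}
$$

$$
\begin{aligned}
P(\text{taxi} \mid \text{late}) &= \frac{P(\text{taxi}, \text{late})}{P(\text{late})} \\
&= \frac{P(\text{taxi}, \text{late}, \text{rain}) + P(\text{taxi}, \text{late}, \text{no rain})}{P(\text{late})} \\
&= \frac{P(\text{t} \mid \text{l}, \text{r})\,P(\text{l} \mid \text{r})\,P(\text{r}) + P(\text{t} \mid \text{l}, \text{no r})\,P(\text{l} \mid \text{no r})\,P(\text{no r})}{P(\text{l})} \\
&= \frac{P(\text{taxi} \mid \text{r})\,P(\text{late} \mid \text{r})\,P(\text{r}) + P(\text{taxi} \mid \text{no r})\,P(\text{late} \mid \text{no r})\,P(\text{no r})}{P(\text{late})} \\
&= \frac{0.1 \cdot 0.75 \cdot 0.2 + 0.6 \cdot 0.125 \cdot 0.8}{0.25} = 0.3
\end{aligned}
$$

They are not independent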
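
The two quantities being compared can be reproduced numerically. The sketch below uses exact fractions and relies on the stated assumption that taxi and late are conditionally independent given the weather; the variable names are illustrative.

```python
from fractions import Fraction

p_w = {"rain": Fraction(1, 5), "no rain": Fraction(4, 5)}
p_late_w = {"rain": Fraction(3, 4), "no rain": Fraction(1, 8)}   # P(late | weather)
p_taxi_w = {"rain": Fraction(1, 10), "no rain": Fraction(3, 5)}  # P(taxi | weather)

# Law of total probability over the weather partition
p_taxi = sum(p_w[w] * p_taxi_w[w] for w in p_w)
p_late = sum(p_w[w] * p_late_w[w] for w in p_w)

# Conditional independence: P(taxi, late | w) = P(taxi | w) P(late | w)
p_taxi_and_late = sum(p_w[w] * p_taxi_w[w] * p_late_w[w] for w in p_w)
p_taxi_given_late = p_taxi_and_late / p_late

print(float(p_taxi))             # 0.5
print(float(p_taxi_given_late))  # 0.3
```

Since P (taxi) = 0.5 and P (taxi|late) = 0.3, knowing that the flight is late changes the probability of finding a taxi.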

slide-38
SLIDE 38

Independence does not imply conditional independence

Probabilistic model for mechanical problems, weather and delays

P (rain) = 0.2
P (late|rain) = 0.75
P (late|no rain) = 0.125
P (problem) = 0.1
P (late|problem) = 0.7
P (late|no problem) = 0.2
P (late|no rain, problem) = 0.5

problem and no rain are independent

Are they also conditionally independent given late? P (problem|late, no rain) = P (problem|late) ?

slide-47
SLIDE 47

Independence does not imply conditional independence

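$$
\begin{aligned}
P(\text{problem} \mid \text{late}) &= \frac{P(\text{late}, \text{problem})}{P(\text{late})} \\
&= \frac{P(\text{late} \mid \text{p})\,P(\text{p})}{P(\text{late} \mid \text{p})\,P(\text{p}) + P(\text{late} \mid \text{no p})\,P(\text{no p})} \\
&= \frac{0.7 \cdot 0.1}{0.7 \cdot 0.1 + 0.2 \cdot 0.9} = 0.28
\end{aligned}
$$

$$
\begin{aligned}
P(\text{problem} \mid \text{late}, \text{no rain}) &= \frac{P(\text{late}, \text{no rain}, \text{problem})}{P(\text{late}, \text{no rain})} \\
&= \frac{P(\text{late} \mid \text{no rain}, \text{p})\,P(\text{no rain} \mid \text{p})\,P(\text{p})}{P(\text{late} \mid \text{no rain})\,P(\text{no rain})} \\
&= \frac{P(\text{late} \mid \text{no rain}, \text{problem})\,P(\text{no rain})\,P(\text{problem})}{P(\text{late} \mid \text{no rain})\,P(\text{no rain})} \\
&= \frac{0.5 \cdot 0.1}{0.125} = 0.4
\end{aligned}
$$

Since 0.28 ≠ 0.4, they are not conditionally independent given late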
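
The two conditional probabilities in this example can be checked with exact fractions. The sketch below uses the stated independence of problem and no rain to factor P (no rain, problem); the variable names are illustrative.

```python
from fractions import Fraction

p_problem = Fraction(1, 10)
p_no_rain = Fraction(4, 5)
p_late_given_problem = Fraction(7, 10)
p_late_given_no_problem = Fraction(1, 5)
p_late_given_no_rain = Fraction(1, 8)
p_late_given_no_rain_problem = Fraction(1, 2)

# Bayes' rule with the partition {problem, no problem}
p_late = (p_late_given_problem * p_problem
          + p_late_given_no_problem * (1 - p_problem))
p_problem_given_late = p_late_given_problem * p_problem / p_late

# Independence: P(no rain, problem) = P(no rain) P(problem)
p_late_no_rain_problem = p_late_given_no_rain_problem * p_no_rain * p_problem
p_late_no_rain = p_late_given_no_rain * p_no_rain
p_problem_given_late_no_rain = p_late_no_rain_problem / p_late_no_rain

print(float(p_problem_given_late))          # 0.28
print(float(p_problem_given_late_no_rain))  # 0.4
```

The factor P (no rain) cancels in the second ratio, which is why only 0.5 · 0.1 / 0.125 remains.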