SLIDE 1
Convergence of Random Processes
DS-GA 1002: Probability and Statistics for Data Science
http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall17
Carlos Fernandez-Granda

Aim: define convergence for random processes and describe convergence results such as the laws of large numbers and the central limit theorem.
SLIDE 2
SLIDE 3
Types of convergence · Law of Large Numbers · Central Limit Theorem · Monte Carlo simulation
SLIDE 4
Convergence of deterministic sequences
A deterministic sequence of real numbers x_1, x_2, . . . converges to x ∈ R,

lim_{i→∞} x_i = x,

if x_i is arbitrarily close to x as i grows: for any ε > 0 there is an i_0 such that |x_i − x| < ε for all i > i_0.

Problem: random sequences do not have fixed values.
SLIDE 5
Convergence with probability one
Consider a discrete random process X̃ and a random variable X defined on the same probability space. If we fix the outcome ω, then X̃(i, ω) is a deterministic sequence and X(ω) is a constant, so we can determine whether

lim_{i→∞} X̃(i, ω) = X(ω)

for that particular ω.
SLIDE 6
Convergence with probability one
X̃ converges with probability one to X if

P({ω ∈ Ω : lim_{i→∞} X̃(ω, i) = X(ω)}) = 1.

Deterministic convergence occurs with probability one.
SLIDE 7
Puddle
The initial amount of water is uniform between 0 and 1 gallon. After a time interval i there is i times less water:

D̃(ω, i) := ω / i,   i = 1, 2, . . .
SLIDE 8
Puddle
[Figure: D̃(ω, i) for 1 ≤ i ≤ 10 and ω = 0.31, 0.52, 0.89]
SLIDE 9
Puddle
If we fix ω ∈ (0, 1),

lim_{i→∞} D̃(ω, i) = lim_{i→∞} ω / i = 0,

so D̃ converges to zero with probability one.
SLIDE 10
Puddle
[Figure: D̃(ω, i) for 1 ≤ i ≤ 50]
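The puddle process is easy to simulate. A minimal sketch (using NumPy, which the slides do not prescribe; the seed and the three sampled outcomes are illustrative) tabulates D̃(ω, i) = ω / i for a few draws of ω, as in the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
omegas = rng.uniform(0.0, 1.0, size=3)   # three sampled initial amounts of water
i_vals = np.arange(1, 51)                # time indices i = 1, ..., 50

# row k is the deterministic sequence D(omega_k, i) = omega_k / i
paths = omegas[:, None] / i_vals[None, :]

final_values = paths[:, -1]              # every path has decayed toward zero
```

Each fixed-ω row is a decreasing deterministic sequence, which is exactly the "fix ω, then check deterministic convergence" viewpoint of the next slides.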
SLIDE 11
Alternative idea
Idea: instead of fixing ω and checking deterministic convergence,
1. measure how close X̃(i) and X are for a fixed i using a deterministic quantity, and
2. check whether that quantity tends to zero.
SLIDE 12
Convergence in mean square
The mean square of X − Y measures how close the random variables X and Y are. If E[(X − Y)²] = 0, then X = Y with probability one.

Proof: by Markov's inequality, for any ε > 0,

P((Y − X)² > ε) ≤ E[(X − Y)²] / ε = 0.
SLIDE 13
Convergence in mean square
X̃ converges to X in mean square if

lim_{i→∞} E[(X − X̃(i))²] = 0.
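For the puddle process this limit has a closed form: E[D̃(i)²] = E[ω²] / i² = 1/(3i²) → 0, since E[ω²] = 1/3 for ω uniform on (0, 1). A quick Monte Carlo sketch of that computation (NumPy assumed; sample size and indices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(0.0, 1.0, size=200_000)

i_vals = [1, 10, 100]
# Monte Carlo estimates of the mean-square distance E[(D(i) - 0)^2]
ms_errors = [np.mean((omega / i) ** 2) for i in i_vals]

# closed form: E[omega^2] / i^2 = 1 / (3 i^2)
theory = [1.0 / (3 * i**2) for i in i_vals]
```

The estimates track the 1/(3i²) rate, illustrating mean-square convergence of D̃ to zero.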
SLIDE 14
Convergence in probability
Alternative measure: the probability that |Y − X| > ε for small ε.

X̃ converges to X in probability if for any ε > 0

lim_{i→∞} P(|X − X̃(i)| > ε) = 0.
SLIDE 15
Convergence in mean square implies convergence in probability

For any ε > 0,

lim_{i→∞} P(|X − X̃(i)| > ε) = lim_{i→∞} P((X − X̃(i))² > ε²)
                             ≤ lim_{i→∞} E[(X − X̃(i))²] / ε²   (by Markov's inequality)
                             = 0.

Convergence with probability one also implies convergence in probability.
SLIDE 20
Convergence in distribution
The distribution of X̃(i) converges to the distribution of X: X̃ converges in distribution to X if

lim_{i→∞} F_{X̃(i)}(x) = F_X(x)

for all x at which F_X is continuous.
SLIDE 21
Convergence in distribution
Convergence in distribution does not imply that X̃(i) and X are close as i → ∞! Convergence in probability does imply convergence in distribution.
SLIDE 22
Binomial tends to Poisson
- X̃(i) is binomial with parameters i and p := λ/i
- X is a Poisson random variable with parameter λ
- X̃(i) converges to X in distribution:

lim_{i→∞} p_{X̃(i)}(x) = lim_{i→∞} (i choose x) p^x (1 − p)^(i−x) = λ^x e^(−λ) / x! = p_X(x)
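This limit can be checked numerically. A small sketch (standard library only; λ = 3 and the 20-term cutoff are arbitrary illustrative choices, not from the slides) compares the two pmfs as i grows:

```python
from math import comb, exp, factorial

lam = 3.0  # Poisson parameter (illustrative value)

def binom_pmf(x, i, p):
    # pmf of a Binomial(i, p) random variable at x
    return comb(i, x) * p**x * (1.0 - p) ** (i - x)

def poisson_pmf(x):
    # pmf of a Poisson(lam) random variable at x
    return lam**x * exp(-lam) / factorial(x)

def max_gap(i):
    # largest pointwise difference between the Binomial(i, lam/i) pmf
    # and the Poisson(lam) pmf over a range covering most of the mass
    return max(abs(binom_pmf(x, i, lam / i) - poisson_pmf(x)) for x in range(20))

gaps = [max_gap(10), max_gap(100), max_gap(1000)]
```

The gap shrinks steadily with i, which is the convergence in distribution stated above (for pmfs, pointwise convergence of the pmf implies convergence of the cdf).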
SLIDE 23
Probability mass function of X̃(40)
[Figure: pmf of X̃(40) over k = 1, …, 40]
SLIDE 24
Probability mass function of X̃(80)
[Figure: pmf of X̃(80) over k = 1, …, 40]
SLIDE 25
Probability mass function of X̃(400)
[Figure: pmf of X̃(400) over k = 1, …, 40]
SLIDE 26
Probability mass function of X
[Figure: pmf of the Poisson limit X over k = 1, …, 40]
SLIDE 27
Types of convergence · Law of Large Numbers · Central Limit Theorem · Monte Carlo simulation
SLIDE 28
Moving average
The moving average à of a discrete random process X̃ is

Ã(i) := (1/i) Σ_{j=1}^{i} X̃(j).
SLIDE 29
Weak law of large numbers
Let X̃ be an iid discrete random process with mean µ_X̃ := µ and bounded variance σ². The moving average à of X̃ converges in mean square to µ.
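A numerical sketch of the statement (NumPy assumed; the Gaussian process and all parameter values are illustrative choices): simulate many realizations of the moving average and compare the empirical mean-square error with the σ²/i rate.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma2 = 5.0, 4.0                      # mean and variance of each X(j)
n_paths, n_steps = 5000, 100

x = rng.normal(mu, np.sqrt(sigma2), size=(n_paths, n_steps))
i_vals = np.arange(1, n_steps + 1)
averages = np.cumsum(x, axis=1) / i_vals   # A(i) along each realization

# empirical mean-square error E[(A(i) - mu)^2] versus the sigma^2 / i rate
mse = np.mean((averages - mu) ** 2, axis=0)
theory = sigma2 / i_vals
```

The empirical curve hugs σ²/i, which is exactly the quantity driven to zero in the proof that follows.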
SLIDE 30
Proof

The mean of the moving average is

E[Ã(i)] = E[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i) Σ_{j=1}^{i} E[X̃(j)] = µ,

and, because the X̃(j) are independent, its variance is

Var[Ã(i)] = Var[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i²) Σ_{j=1}^{i} Var[X̃(j)] = σ² / i.

Therefore

lim_{i→∞} E[(Ã(i) − µ)²] = lim_{i→∞} E[(Ã(i) − E[Ã(i)])²] = lim_{i→∞} Var[Ã(i)] = lim_{i→∞} σ²/i = 0.
SLIDE 43
Strong law of large numbers
Let X̃ be an iid discrete random process with mean µ_X̃ := µ and bounded variance σ². The moving average à of X̃ converges with probability one to µ.
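A single-path sketch matching the geometric example in the following figures (NumPy assumed; p = 0.4 as in the slides, the seed and path length are illustrative): along one realization the moving average settles near the mean 1/p = 2.5.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 0.4, 5000

x = rng.geometric(p, size=n)                  # one realization of the iid process
running_avg = np.cumsum(x) / np.arange(1, n + 1)

deviation = abs(running_avg[-1] - 1.0 / p)    # distance from the mean 1/p = 2.5
```

Unlike the mean-square check, this looks at a single outcome ω, which is the "with probability one" point of view.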
SLIDE 44
iid standard Gaussian
[Figure: moving average vs. mean of an iid standard Gaussian sequence, 1 ≤ i ≤ 50]
SLIDE 45
iid standard Gaussian
[Figure: moving average vs. mean, 1 ≤ i ≤ 500]
SLIDE 46
iid standard Gaussian
[Figure: moving average vs. mean, 1 ≤ i ≤ 5000]
SLIDE 47
iid geometric with p = 0.4
[Figure: moving average vs. mean of an iid geometric sequence, 1 ≤ i ≤ 50]
SLIDE 48
iid geometric with p = 0.4
[Figure: moving average vs. mean, 1 ≤ i ≤ 500]
SLIDE 49
iid geometric with p = 0.4
[Figure: moving average vs. mean, 1 ≤ i ≤ 5000]
SLIDE 50
iid Cauchy
[Figure: moving average vs. median of an iid Cauchy sequence, 1 ≤ i ≤ 50]
SLIDE 51
iid Cauchy
[Figure: moving average vs. median, 1 ≤ i ≤ 500]
SLIDE 52
iid Cauchy
[Figure: moving average vs. median, 1 ≤ i ≤ 5000]
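The Cauchy case can be reproduced in a few lines (NumPy assumed; seed and sample size illustrative). The Cauchy distribution has no mean, so the law of large numbers does not apply: the moving average never settles down, while the sample median is stable near zero.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

x = rng.standard_cauchy(size=n)
running_avg = np.cumsum(x) / np.arange(1, n + 1)

# the median is a stable summary even though the mean does not exist
sample_median = np.median(x)
```

In fact the average of n iid standard Cauchy variables is again standard Cauchy, so the histogram of the averages never narrows, no matter how large n is.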
SLIDE 53
Types of convergence · Law of Large Numbers · Central Limit Theorem · Monte Carlo simulation
SLIDE 54
Central Limit Theorem
Let X̃ be an iid discrete random process with mean µ_X̃ := µ and bounded variance σ². Then √i (Ã(i) − µ) converges in distribution to a Gaussian random variable with mean 0 and variance σ². Equivalently, the average Ã(i) is approximately Gaussian with mean µ and variance σ²/i.
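A simulation sketch of the theorem (NumPy assumed; the exponential distribution with λ = 2 matches the upcoming figures, the other parameters are illustrative): the standardized averages √i (Ã(i) − µ) should have mean ≈ 0 and variance ≈ σ² = 1/λ² = 0.25.

```python
import numpy as np

rng = np.random.default_rng(5)
lam = 2.0                          # exponential rate: mean 1/lam, variance 1/lam^2
n, trials = 500, 10_000

x = rng.exponential(1.0 / lam, size=(trials, n))
a = x.mean(axis=1)                 # one sample average per trial

z = np.sqrt(n) * (a - 1.0 / lam)   # standardized averages
emp_mean, emp_var = z.mean(), z.var()
```

A histogram of `z` would closely follow a Gaussian pdf with mean 0 and variance 0.25, even though the underlying samples are exponential.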
SLIDE 55
Height data
- Example: data from a population of 25,000 people
- We compare the histogram of the heights with the pdf of a Gaussian random variable fitted to the data
SLIDE 56
Height data
[Figure: histogram of heights in inches, with the fitted Gaussian pdf overlaid]
SLIDE 57
Sketch of proof
The pdf of a sum of two independent random variables is the convolution of their pdfs:

f_{X+Y}(z) = ∫_{−∞}^{∞} f_X(z − y) f_Y(y) dy.

Repeated convolutions of any pdf with bounded variance result in a Gaussian!
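This can be watched numerically by repeatedly convolving a discretized pdf (NumPy assumed; the uniform starting density and grid step are illustrative):

```python
import numpy as np

dx = 0.01
f = np.ones(100)                   # uniform density on [0, 1], sampled on a grid

g = f.copy()
for _ in range(4):
    # pdf of a sum of independent variables = convolution of their pdfs;
    # the factor dx turns the discrete convolution into a Riemann sum
    g = np.convolve(g, f) * dx

total_mass = g.sum() * dx          # the result still integrates to one
peak = g.max()                     # close to the peak of the matching Gaussian
```

After four convolutions, `g` approximates the density of a sum of five independent uniforms, which is already very close to a Gaussian with mean 2.5 and variance 5/12.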
SLIDE 58
Repeated convolutions
[Figure: pdf after repeated convolutions, i = 1, …, 5]
SLIDE 59
Repeated convolutions
[Figure: pdf after repeated convolutions, i = 1, …, 5]
SLIDE 60
iid exponential with λ = 2, i = 10²
[Figure: histogram of the average Ã(i) with the Gaussian approximation]
SLIDE 61
iid exponential with λ = 2, i = 10³
[Figure: histogram of Ã(i) with the Gaussian approximation]
SLIDE 62
iid exponential with λ = 2, i = 10⁴
[Figure: histogram of Ã(i) with the Gaussian approximation]
SLIDE 63
iid geometric with p = 0.4, i = 10²
[Figure: histogram of Ã(i) with the Gaussian approximation]
SLIDE 64
iid geometric with p = 0.4, i = 10³
[Figure: histogram of Ã(i) with the Gaussian approximation]
SLIDE 65
iid geometric with p = 0.4, i = 10⁴
[Figure: histogram of Ã(i) with the Gaussian approximation]
SLIDE 66
iid Cauchy, i = 10²
[Figure: histogram of Ã(i)]
SLIDE 67
iid Cauchy, i = 10³
[Figure: histogram of Ã(i)]
SLIDE 68
iid Cauchy, i = 10⁴
[Figure: histogram of Ã(i)]
SLIDE 69
Gaussian approximation to the binomial
X is binomial with parameters n and p. Computing the probability that X lies in a given interval requires summing its pmf over the interval; the central limit theorem provides a quick approximation. Write

X = Σ_{i=1}^{n} B_i,   E[B_i] = p,   Var[B_i] = p(1 − p).

Then (1/n) X is approximately Gaussian with mean p and variance p(1 − p)/n, so X is approximately Gaussian with mean np and variance np(1 − p).
SLIDE 70
Gaussian approximation to the binomial

A basketball player makes each shot with probability p = 0.4, and shots are iid. What is the probability that she makes at least 420 shots out of 1000?

Exact answer:

P(X ≥ 420) = Σ_{x=420}^{1000} p_X(x) = Σ_{x=420}^{1000} (1000 choose x) 0.4^x 0.6^(1000−x) = 10.4 × 10⁻².

Approximation (U standard Gaussian):

P(X ≥ 420) ≈ P(√(np(1 − p)) U + np ≥ 420) = P(U ≥ 1.29) = 1 − Φ(1.29) = 9.85 × 10⁻².
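Both numbers can be reproduced with the standard library (a sketch; `phi`, a standard normal cdf built from `math.erf`, is a helper introduced here):

```python
from math import comb, erf, sqrt

n, p, k = 1000, 0.4, 420

# exact binomial tail probability
exact = sum(comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(k, n + 1))

def phi(t):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

# CLT approximation: X is roughly Gaussian with mean np, variance np(1 - p)
approx = 1.0 - phi((k - n * p) / sqrt(n * p * (1.0 - p)))
```

The slight gap between the two numbers comes from approximating a discrete tail by a continuous one; a continuity correction (using 419.5 instead of 420) would close most of it.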
SLIDE 74
Types of convergence · Law of Large Numbers · Central Limit Theorem · Monte Carlo simulation
SLIDE 75
Monte Carlo simulation
Simulation is a powerful tool in probability and statistics: many models are too complex to admit closed-form solutions (life is not a homework problem!). Example: the game of solitaire.
SLIDE 76
Game of solitaire
Aim: compute the probability that you win at solitaire. If every permutation of the cards has the same probability,

P(Win) = (number of permutations that lead to a win) / (total number of permutations).

Problem: characterizing the permutations that lead to a win is very difficult without playing out the game, and we can't just check them all, because there are 52! ≈ 8 × 10⁶⁷ permutations! Solution: sample many permutations and compute the fraction of wins.
SLIDE 77
In the words of Stanislaw Ulam
The first thoughts and attempts I made to practice (the Monte Carlo Method) were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than "abstract thinking" might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible to envisage with the beginning of the new era of fast computers.
SLIDE 78
Monte Carlo approximation
Main principle: use simulation to approximate quantities that are challenging to compute exactly. To approximate the probability of an event E:
1. Generate n independent samples of the indicator 1_E: I_1, I_2, . . . , I_n.
2. Compute the average of the n samples,

Ã(n) := (1/n) Σ_{i=1}^{n} I_i.

By the law of large numbers, Ã(n) converges to P(E) as n → ∞, since E[1_E] = P(E).
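A minimal sketch of this principle (NumPy assumed; the event, two dice summing to 7, is an illustrative stand-in with known probability 1/6):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

# samples of the indicator 1_E for the event E = {two dice sum to 7}
dice = rng.integers(1, 7, size=(n, 2))
indicators = (dice.sum(axis=1) == 7).astype(float)

estimate = indicators.mean()       # A(n), which approaches P(E) = 1/6
```

The standard error of the estimate scales like 1/√n, which follows from the σ²/i variance computation in the law of large numbers section.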
SLIDE 79
Basketball league
A basketball league has m teams, and every pair of teams plays once per season. The teams are ordered by skill: team 1 is the best and team m is the worst. Model: for 1 ≤ i < j ≤ m,

P(team j beats team i) := 1 / (j − i + 1),

and all games are independent.
SLIDE 80
Basketball league
Aim: compute the probability of each team's rank at the end of the season. The rank of team i is modeled as a random variable R_i. What is the pmf of R_1, R_2, . . . , R_m?
SLIDE 81
m = 3
(Each game-outcome column lists the winner of that game.)

1-2  1-3  2-3 | R1  R2  R3 | Probability
 1    1    2  |  1   2   3 | 1/6
 1    1    3  |  1   3   2 | 1/6
 1    3    2  |  1   1   1 | 1/12
 1    3    3  |  2   3   1 | 1/12
 2    1    2  |  2   1   3 | 1/6
 2    1    3  |  1   1   1 | 1/6
 2    3    2  |  3   1   2 | 1/12
 2    3    3  |  3   2   1 | 1/12
SLIDE 82
m = 3
Probability mass function:

Rank | R1   | R2  | R3
  1  | 7/12 | 1/2 | 5/12
  2  | 1/4  | 1/4 | 1/4
  3  | 1/6  | 1/4 | 1/3
SLIDE 83
Basketball league
Problem: the number of possible outcomes is 2^(m(m−1)/2)! For m = 10 this is larger than 10¹³. Solution: apply a Monte Carlo approximation.
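A Monte Carlo sketch of the league model (NumPy assumed; `simulate_season` and the rank convention, one plus the number of teams with strictly more wins so that ties share a rank, are reconstructions consistent with the m = 3 table):

```python
import numpy as np

def simulate_season(m, rng):
    """Play every pair once; return each team's rank (1 = most wins)."""
    wins = np.zeros(m, dtype=int)
    for i in range(m):                 # 0-indexed teams; team 0 is the best
        for j in range(i + 1, m):
            # model: the weaker team j beats team i with prob 1/(j - i + 1)
            if rng.random() < 1.0 / (j - i + 1):
                wins[j] += 1
            else:
                wins[i] += 1
    # rank = 1 + number of teams with strictly more wins
    return np.array([1 + np.sum(wins > w) for w in wins])

rng = np.random.default_rng(7)
m, n_seasons = 3, 10_000
rank_counts = np.zeros((m, m))
for _ in range(n_seasons):
    for team, r in enumerate(simulate_season(m, rng)):
        rank_counts[r - 1, team] += 1

est_pmf = rank_counts / n_seasons      # rows: rank, columns: team
```

With m = 3 the estimates land near the exact pmf of the earlier table (7/12, 1/2, 5/12 for rank 1), and the same code scales to values of m where exact enumeration is hopeless.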
SLIDE 84
m = 3
Ten simulated seasons (winner of each game, and the resulting ranks):

1-2  1-3  2-3 | R1  R2  R3
 1    3    2  |  1   1   1
 1    1    3  |  1   3   2
 2    1    2  |  2   1   3
 2    3    2  |  3   1   2
 2    1    3  |  1   1   1
 1    1    2  |  1   2   3
 2    1    3  |  1   1   1
 2    3    2  |  3   1   2
 1    1    2  |  1   2   3
 2    3    2  |  3   1   2
SLIDE 85
m = 3
Estimated pmf (n = 10), true values in parentheses:

Rank | R1          | R2         | R3
  1  | 0.6 (0.583) | 0.7 (0.5)  | 0.3 (0.417)
  2  | 0.1 (0.25)  | 0.2 (0.25) | 0.4 (0.25)
  3  | 0.3 (0.167) | 0.1 (0.25) | 0.3 (0.333)
SLIDE 86
m = 3
Estimated pmf (n = 2,000), true values in parentheses:

Rank | R1            | R2            | R3
  1  | 0.582 (0.583) | 0.496 (0.5)   | 0.417 (0.417)
  2  | 0.248 (0.25)  | 0.261 (0.25)  | 0.244 (0.25)
  3  | 0.171 (0.167) | 0.245 (0.25)  | 0.339 (0.333)
SLIDE 87
Running times
[Figure: running time in seconds (log scale) vs. number of teams m, exact computation vs. Monte Carlo approximation]
SLIDE 88
Error
m | Average error
3 | 9.28 × 10⁻³
4 | 12.7 × 10⁻³
5 | 7.95 × 10⁻³
6 | 7.12 × 10⁻³
7 | 7.12 × 10⁻³
SLIDE 89
m = 5
SLIDE 90
m = 20
SLIDE 91