Review of Conditional Probability and Independence (PowerPoint PPT Presentation)

SLIDE 1

Review of Conditional Probability and Independence

Definition L7.3 (Def 1.3.2 on p.20): If A, B ∈ S and P(B) > 0, then

P(A|B) = P(A ∩ B) / P(B).

Bayes' Rule Theorem L7.2 (Thm 1.3.5 on p.23): Let A1, A2, . . . be a partition of the sample space S and B ⊂ S. If P(B) > 0 and P(Ai) > 0, then

P(Ai|B) = P(B|Ai)P(Ai) / Σ_{j: P(Aj)>0} P(B|Aj)P(Aj).

19 / 25 Lecture 7: Methods of Estimation
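Bayes' Rule above can be checked numerically. The following sketch uses a made-up three-supplier example (the partition A1, A2, A3 and the defect rates are illustrative assumptions, not from the lecture):

```python
# Numerical check of Bayes' Rule on a made-up example: a part comes from
# one of three suppliers (a partition A1, A2, A3 of the sample space) and
# B is the event "part is defective".  All numbers are illustrative.
prior = {"A1": 0.5, "A2": 0.3, "A3": 0.2}           # P(Ai), sums to 1
defect_rate = {"A1": 0.01, "A2": 0.02, "A3": 0.05}  # P(B | Ai)

# Denominator: P(B) = sum_j P(B|Aj) P(Aj)  (law of total probability)
p_b = sum(defect_rate[a] * prior[a] for a in prior)

# Bayes' Rule: P(Ai | B) = P(B|Ai) P(Ai) / P(B)
posterior = {a: defect_rate[a] * prior[a] / p_b for a in prior}

print(p_b)                                  # 0.021
print({a: round(v, 4) for a, v in posterior.items()})
```

Note that the posterior probabilities automatically sum to 1, since the denominator is exactly the sum of the numerators over the partition.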

SLIDE 2

Review of Conditional Probability and Independence

Definition L7.4 (Def 4.2.1 on p.148): Let (X, Y) be a discrete bivariate random vector with joint pmf f(x, y) and marginal pmfs fX(x) and fY(y). For any x such that P(X = x) = fX(x) > 0, the conditional pmf of Y given that X = x is the function of y defined by

f(y|x) = P(Y = y|X = x) = f(x, y) / fX(x).

For any y such that P(Y = y) = fY(y) > 0, the conditional pmf of X given that Y = y is the function of x defined by

f(x|y) = P(X = x|Y = y) = f(x, y) / fY(y).

If g(Y) is a function of a discrete random variable Y, then the conditional expected value of g(Y) given that X = x is

E(g(Y)|x) = Σ_y g(y) f(y|x).

20 / 25 Lecture 7: Methods of Estimation
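Definition L7.4 can be illustrated on a small joint pmf. The table below is a made-up example (not from the lecture), stored as a dict keyed by (x, y):

```python
# A made-up joint pmf f(x, y) for a discrete bivariate vector (X, Y);
# the probabilities sum to 1.
f = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def f_X(x):
    # marginal pmf of X: sum the joint pmf over y
    return sum(p for (xi, _), p in f.items() if xi == x)

def f_cond(y, x):
    # conditional pmf f(y|x) = f(x, y) / fX(x), defined when fX(x) > 0
    return f.get((x, y), 0.0) / f_X(x)

def cond_expect(g, x):
    # E[g(Y) | X = x] = sum_y g(y) f(y|x)
    ys = {y for (_, y) in f}
    return sum(g(y) * f_cond(y, x) for y in ys)

# For x = 1: f(0|1) = 0.3/0.7 and f(1|1) = 0.4/0.7, so E[Y | X = 1] = 4/7
print(cond_expect(lambda y: y, 1))
```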

SLIDE 3

Review of Conditional Probability and Independence

Definition L7.5 (Def 4.2.3 on p.150): Let (X, Y) be a continuous bivariate random vector with joint pdf f(x, y) and marginal pdfs fX(x) and fY(y). For any x such that fX(x) > 0, the conditional pdf of Y given that X = x is the function of y defined by

f(y|x) = f(x, y) / fX(x).

For any y such that fY(y) > 0, the conditional pdf of X given that Y = y is the function of x defined by

f(x|y) = f(x, y) / fY(y).

If g(Y) is a function of a continuous random variable Y, then the conditional expected value of g(Y) given that X = x is

E(g(Y)|x) = ∫_{−∞}^{∞} g(y) f(y|x) dy.

21 / 25 Lecture 7: Methods of Estimation

SLIDE 4

Bayesian Estimation

The Bayesian approach differs greatly from the classical approach that we have been discussing. In the Bayesian approach, the parameter θ is assumed to be a random variable/vector with prior distribution π(θ). We can then update the pdf/pmf of the distribution of θ given data X = x using Bayes' Rule:

π(θ|x) = f(x, θ) / m(x) = f(x|θ)π(θ) / m(x),

where m(x) is the pdf/pmf of the marginal distribution of X. The updated prior is referred to as the posterior distribution. The Bayes estimator of θ is the mean of the posterior distribution; that is, θ̂B = E[θ|X].

22 / 25 Lecture 7: Methods of Estimation
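When no conjugate form is available, the update π(θ|x) ∝ f(x|θ)π(θ) can be carried out on a grid: discretize θ, multiply prior by likelihood, and normalize by the grid stand-in for m(x). The model and data below (binomial likelihood, flat prior) are illustrative assumptions:

```python
# Grid sketch of the Bayesian update: posterior ∝ likelihood x prior.
from math import comb

thetas = [i / 1000 for i in range(1, 1000)]   # grid over (0, 1)
prior = [1.0 for _ in thetas]                  # Uniform(0, 1) prior

n, x = 10, 7                                   # made-up data: 7 of 10 successes
likelihood = [comb(n, x) * t**x * (1 - t)**(n - x) for t in thetas]

unnorm = [l * p for l, p in zip(likelihood, prior)]
m_x = sum(unnorm)                              # grid stand-in for m(x)
posterior = [u / m_x for u in unnorm]

# The grid posterior mean approximates the exact Bayes estimator; with a
# flat prior the posterior here is beta(8, 4), whose mean is (x+1)/(n+2) = 2/3
post_mean = sum(t * w for t, w in zip(thetas, posterior))
print(post_mean)
```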

SLIDE 5

Bayesian Estimation

Example L7.7: Let X1, . . . , Xn be a random sample from a Bernoulli(p) distribution. Find the Bayes estimator of p, assuming that the prior distribution on p is beta(α, β).

Answer to Example L7.7: Since X1, . . . , Xn are iid Bernoulli(p) random variables, Σ_{i=1}^n Xi is binomial(n, p). The posterior distribution of p | Σ_{i=1}^n Xi = x is

π(p|x) = f(x|p)π(p) / m(x)

= [(n choose x) p^x (1 − p)^{n−x} · Γ(α+β)/(Γ(α)Γ(β)) · p^{α−1}(1 − p)^{β−1}] / [∫_0^1 (n choose x) p^x (1 − p)^{n−x} · Γ(α+β)/(Γ(α)Γ(β)) · p^{α−1}(1 − p)^{β−1} dp]

= p^{x+α−1}(1 − p)^{n−x+β−1} / ∫_0^1 p^{x+α−1}(1 − p)^{n−x+β−1} dp

= [Γ(n + α + β) / (Γ(x + α)Γ(n − x + β))] p^{x+α−1}(1 − p)^{n−x+β−1} I_{(0,1)}(p).

23 / 25 Lecture 7: Methods of Estimation

SLIDE 6

Bayesian Estimation

Answer to Example L7.7 continued: Thus, p | Σ_{i=1}^n Xi = x follows a beta(Σ_{i=1}^n xi + α, n − Σ_{i=1}^n xi + β) distribution. The Bayes estimator (posterior mean) is

p̂B = (Σ_{i=1}^n Xi + α) / (α + β + n)

= [n / (α + β + n)] · (Σ_{i=1}^n Xi / n) + [(α + β) / (α + β + n)] · (α / (α + β)).

The Bayes estimator is a weighted average of X̄ (the sample mean based on the data) and E[p] = α/(α+β) (the mean of the prior distribution).

24 / 25 Lecture 7: Methods of Estimation
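The weighted-average identity above can be verified numerically. The sample size, success count, and prior hyperparameters below are made-up illustrative values:

```python
# Check that the Bayes estimator from Example L7.7 equals the weighted
# average of the sample mean and the prior mean.  Numbers are made up.
n, sum_x = 20, 13          # 13 successes in 20 Bernoulli trials
alpha, beta = 3.0, 2.0     # beta(alpha, beta) prior

# Posterior mean of beta(sum_x + alpha, n - sum_x + beta)
p_hat = (sum_x + alpha) / (alpha + beta + n)

# Weighted-average form: weight w on the sample mean, 1 - w on the prior mean
w = n / (alpha + beta + n)
weighted = w * (sum_x / n) + (1 - w) * (alpha / (alpha + beta))

print(p_hat, weighted)     # both equal 16/25 = 0.64
```

As n grows, the weight w tends to 1, so the data dominate the prior.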

SLIDE 7

Bayesian Estimation

Definition L7.6 (Def 7.2.15 on p.325): Let F denote the class of pdfs or pmfs f(x|θ) (indexed by θ). A class Π of prior distributions is a conjugate family for F if the posterior distribution is in the class Π for all f ∈ F, all priors in Π, and all x ∈ X. As seen in Example L7.7, the beta family is conjugate for the binomial family.

25 / 25 Lecture 7: Methods of Estimation

SLIDE 8

Bayesian Tests

Hypothesis testing is much different from a Bayesian perspective, where the parameter is considered random. From the Bayesian perspective, the natural approach is to compute

P(H0 is true|x) = P(θ ∈ Θ0|x) = ∫_{Θ0} π(θ|x) dθ

and

P(H1 is true|x) = P(θ ∈ Θ0^c|x) = ∫_{Θ0^c} π(θ|x) dθ

based on the posterior distribution π(θ|x).

7 / 14 Lecture 14: More Hypothesis Testing Examples

SLIDE 9

Bayesian Tests

Example L14.2: Suppose we toss a coin 5 times and count the total number of heads which occur. We assume each toss is independent and the probability of heads (denoted by p) is the same on each toss. Consider a Bayesian model which assumes that p follows a Uniform(0, 1) prior. What is the probability of the null hypothesis H0 : p ≤ .5 if Σ_{i=1}^5 Xi = 5?

Answer to Example L14.2: Since p | Σ_{i=1}^5 Xi = x ∼ beta(Σ_{i=1}^5 xi + 1, 5 − Σ_{i=1}^5 xi + 1) from slide 7.24, the posterior here is beta(6, 1) with density 6p^5, and the probability is

P(p ≤ .5 | Σ_{i=1}^5 xi = 5) = ∫_0^{.5} 6p^5 dp = 1/64 = .015625.

8 / 14 Lecture 14: More Hypothesis Testing Examples
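The integral in Example L14.2 is easy to confirm in code, both in closed form and by quadrature:

```python
# Reproduce the computation in Example L14.2: with 5 heads in 5 tosses and
# a Uniform(0, 1) prior, the posterior is beta(6, 1), i.e. density 6 p^5,
# so P(H0 | x) = integral of 6 p^5 over (0, .5) = (.5)^6.
exact = 0.5 ** 6                 # antiderivative p^6 evaluated at .5

# Numerical check: midpoint-rule integral of 6 p^5 over (0, .5)
n = 100_000
h = 0.5 / n
numeric = sum(6 * ((i + 0.5) * h) ** 5 * h for i in range(n))

print(exact, numeric)            # both ≈ 1/64 = 0.015625
```

Since this posterior probability is so small, the Bayesian test would reject H0 in favor of p > .5.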

SLIDE 10

Finding a Bayesian Credible Interval

Interval estimators are much different from a Bayesian perspective, where the parameter is considered random.

Definition L16.5 (p.436): If π(θ|x) is the posterior distribution of θ given X = x, then for any set A ⊂ Θ, the credible probability of A is

P(θ ∈ A|x) = ∫_A π(θ|x) dθ

(assuming θ|x is continuous), and A is a credible set for θ.

23 / 24 Lecture 16: Confidence Intervals

SLIDE 11

Finding a Bayesian Credible Interval

Example L16.5: Suppose X1, . . . , Xn are iid Bernoulli(p) random variables and suppose we consider a Bayesian model which assumes that p follows a Uniform(0, 1) prior. Find a 90% credible set for p for the data set with 4 successes and 14 failures.

Answer to Example L16.5: From slide 7.23, p | Σ_{i=1}^n Xi = y ∼ beta(y + α, n − y + β), so we have p|X = x ∼ beta(4 + 1 = 5, 14 + 1 = 15). We can then find pL such that ∫_0^{pL} π(p|x) dp = .05 and pU such that ∫_{pU}^1 π(p|x) dp = .05, where π(p|x) = 58140 p^4 (1 − p)^{14}.

Using the R commands qbeta(.05,5,15) and qbeta(.95,5,15), we obtain the 90% credible set (.1099, .4191). The shortest 90% credible set (.0953, .3991) can be obtained with the R commands alpha=.02931685; qbeta(c(alpha,.9+alpha),5,15), since the posterior density is equal at the two endpoints:

> dbeta(qbeta(c(alpha,.9+alpha),5,15),5,15)
[1] 1.180588 1.180588

24 / 24 Lecture 16: Confidence Intervals
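The equal-tail endpoints from qbeta can also be recovered without R, by numerically inverting the posterior CDF. The sketch below uses a midpoint-rule CDF and bisection; the grid size and iteration count are arbitrary choices that happen to give enough accuracy here:

```python
# Stdlib-only check of the equal-tail 90% credible set in Example L16.5.
# The posterior is beta(5, 15) with density 58140 p^4 (1-p)^14 on (0, 1).
def density(p):
    return 58140 * p**4 * (1 - p) ** 14

def cdf(p, n=4000):
    # midpoint-rule integral of the posterior density from 0 to p
    h = p / n
    return sum(density((i + 0.5) * h) * h for i in range(n))

def quantile(q):
    # bisection: find p with cdf(p) = q (cdf is increasing on (0, 1))
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_L, p_U = quantile(0.05), quantile(0.95)
print(p_L, p_U)      # ≈ .1099 and .4191, matching qbeta(c(.05,.95), 5, 15)
```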