SLIDE 1

The effect of prior information on frequentist properties of Bayes test decisions

Annette Kopp-Schneider, Silvia Calderazzo and Manuel Wiesenfarth

Division of Biostatistics, German Cancer Research Center (DKFZ) Heidelberg, Germany

GMDS-CEN conference Satellite Webinar “Long-run behaviour of Bayesian procedures”, 16 September 2020

SLIDE 2

2 16 Sept 2020 Annette Kopp-Schneider CEN Satellite Webinar

Motivation

Trial in adults with solid tumors harboring DNA repair deficiencies, treated by targeted therapy; evaluation of response.

DNA repair deficiencies also occur in pediatric tumors
→ investigate targeted therapy in a pediatric arm.

Question: Should this pediatric arm be designed as a stand-alone arm,
or
can a power gain be expected when borrowing information from the adult trial?

SLIDE 3

Number of responders in children: $R_{ped} \sim \mathrm{Bin}(n_{ped} = 40, p)$
One-sided test $H_0: p \le p_0$ vs. $H_1: p > p_0$, $p_0 = 0.2$
Type I error rate $\alpha = 0.05$

Planning the pediatric arm with stand-alone evaluation: Bayesian approach (1)

Use beta-binomial model

$R_{ped} \mid p \sim \mathrm{Bin}(n_{ped}, p)$, $\pi(p) = \mathrm{Beta}(0.5, 0.5)$

Evaluate efficacy based on Bayesian posterior probability:

Reject $H_0 \iff P(p > p_0 = 0.2 \mid r_{ped}) \ge c$, e.g., $c = 0.95$.

SLIDE 4

[Figure: posterior probability $P(p > p_0 \mid r_{ped})$ as a function of $r_{ped}$]

$P(p > p_0 \mid r_{ped}) \ge 0.95 \iff r_{ped} \ge 13$

Planning the pediatric arm with stand-alone evaluation: Bayesian approach (2)

SLIDE 6

Uniformly most powerful (UMP) level $\alpha$ test is given by:

reject $H_0 \iff r_{ped} \ge b_{UMP}(\alpha)$

Here: $b_{UMP}(0.05) = 13$

All possible power curves for $n_{ped} = 40$ for varying threshold $b$ (and type I error probability):

Planning the pediatric arm with stand-alone evaluation: Frequentist approach

[Figure: power $P(R_{ped} \ge b \mid p_{true})$ plotted against $p_{true}$, with the curve for $b = 13$ marked]
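
The boundary and the power curve can be reproduced with exact binomial tail probabilities. This is a minimal sketch, assuming a non-randomized test (the largest rejection region whose type I error stays at or below the level); `binom_tail` and `ump_boundary` are illustrative names, not taken from the slides.

```python
from math import comb

def binom_tail(b, n, p):
    """P(R >= b) for R ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(b, n + 1))

def ump_boundary(alpha=0.05, n=40, p0=0.2):
    """Smallest b with P(R >= b | p0) <= alpha (non-randomized test)."""
    for b in range(n + 2):  # b = n + 1 means "never reject", tail = 0
        if binom_tail(b, n, p0) <= alpha:
            return b

n, p0 = 40, 0.2
b = ump_boundary(0.05, n, p0)
print("boundary:", b, "type I error:", round(binom_tail(b, n, p0), 4))

# Power P(R >= b | p_true) at a few illustrative true response rates
for p_true in (0.2, 0.3, 0.4, 0.5):
    print(p_true, round(binom_tail(b, n, p_true), 3))
```

Changing `alpha` moves the boundary `b` and selects a different curve from the family of power curves described on this slide.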

SLIDE 7

Use results from the adult trial to inform the prior for the pediatric arm.
Hope: if treatment is successful in adults, then power is increased for the pediatric arm.

Borrowing from adult information for the pediatric arm

[Figure: power as a function of $p_{true}$, "Pediatric only" vs. "Pediatric with borrowing from adult" (?)]

SLIDE 8

Power prior approach with power parameter $\delta \in [0, 1]$:

$\pi(p \mid r_{adu}, \delta) \propto L(p; r_{adu})^{\delta} \, \pi(p)$

Adapt $\delta = \delta(r_{ped}, r_{adu})$ such that information is only borrowed for similar adult and pediatric data:
→ $\delta(r_{ped}, r_{adu})$ large when adult and children data are similar
→ $\delta(r_{ped}, r_{adu})$ small in case of prior-data conflict.

Adaptive power parameter (1)

SLIDE 9

Result from adult trial: $r_{adu} = 12$ among $n_{adu} = 40$ ($\hat{p}_{adu} = 0.3$)

Use an Empirical Bayes power prior approach where $\hat{\delta}(r_{ped}; r_{adu} = 12)$ maximizes the marginal likelihood of $\delta$ (Gravestock, Held et al. 2017):

Adaptive power parameter (2)

[Figure: $\hat{\delta}(r_{ped}; r_{adu} = 12)$]
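
A minimal sketch of the empirical Bayes step, assuming the conjugate beta-binomial form: with a Beta(0.5, 0.5) initial prior, the power prior is Beta(0.5 + δ·r_adu, 0.5 + δ·(n_adu − r_adu)), and the marginal likelihood of δ is the beta-binomial probability of the observed pediatric count. The grid search and all function names are illustrative assumptions; the cited paper may use a different optimizer.

```python
from math import lgamma

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(delta, r_ped, n_ped=40, r_adu=12, n_adu=40, a0=0.5, b0=0.5):
    """Log marginal likelihood of delta, up to the constant log C(n_ped, r_ped)."""
    a = a0 + delta * r_adu              # power prior is Beta(a, b)
    b = b0 + delta * (n_adu - r_adu)
    return log_beta(a + r_ped, b + n_ped - r_ped) - log_beta(a, b)

def delta_hat(r_ped, grid_size=1001, **kwargs):
    """Grid approximation of the delta in [0, 1] maximizing the marginal likelihood."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda d: log_marginal(d, r_ped, **kwargs))

for r_ped in (4, 8, 12, 16, 20, 30):
    print(r_ped, delta_hat(r_ped))
```

With these inputs the estimate is large when the pediatric response rate is close to the adult rate of 0.3 and shrinks toward zero under prior-data conflict, which is the adaptive behaviour the slide describes.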

SLIDE 10

$P(p > p_0 \mid r_{ped}, r_{adu}, \hat{\delta}(r_{ped}; r_{adu})) > c = 0.95$ corresponds to $r_{ped} \ge b = 11$

Adaptive power parameter (3)

[Figure label: Without adults]

SLIDE 11

$P(p > p_0 \mid r_{ped}, r_{adu}, \hat{\delta}(r_{ped}; r_{adu})) > c = 0.95$ corresponds to $r_{ped} \ge b = 11$

→ power gain but type I error inflation

Adaptive power parameter (4)

[Figure labels: $b = 13$, $b = 11$]

SLIDE 12

$P(p > p_0 \mid r_{ped}, r_{adu}, \hat{\delta}(r_{ped}; r_{adu}))$ is monotonically increasing in $r_{ped}$

→ $P(p > p_0 \mid r_{ped}, r_{adu}, \hat{\delta}) > c' = 0.99$ corresponds to $r_{ped} \ge b = 13$

→ type I error controlled but no power gained

Adaptive power parameter (5)

[Figure label: Without adults]

SLIDE 13

“Extreme borrowing” (1)

Artificial method for illustration of a not monotonically increasing $P(p > p_0 \mid r_{ped}, r_{adu})$:

borrow adult information $\iff \hat{p}_{adu} = \hat{p}_{ped}$

Assume $n_{adu} = 100$, $r_{adu} = 30 \Rightarrow \hat{p}_{adu} = 0.3$

Here:

borrow all adult information if $\hat{p}_{ped} = 0.3$ ($r_{ped} = 12$ for $n_{ped} = 40$)

don't borrow for $r_{ped} \ne 12$

SLIDE 14

“Extreme borrowing” (2)

Borrow all adult information iff $r_{ped} = 12$

For $c = 0.95 \Rightarrow b = 12 \Rightarrow$ type I error rate = 0.088

[Figure label: Without adults]

SLIDE 15

“Extreme borrowing” (3)

Borrow all adult information iff $r_{ped} = 12$

For $c = 0.95 \Rightarrow b = 12 \Rightarrow$ type I error rate = 0.088

For $c = 0.9976 \Rightarrow$ reject $H_0$ if $r_{ped} = 12$ or $r_{ped} \ge 16$ $\Rightarrow$ type I error rate = 0.047

[Figure label: Without adults]

SLIDE 16

“Extreme borrowing” (4)

Reject $H_0$ if $r_{ped} \in \{12\} \cup \{16, 17, \dots, 40\}$

Compare to: Reject $H_0$ if $r_{ped} \in \{13, 14, \dots, 40\}$

→ type I error controlled but power decreased
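
The two rejection regions can be compared directly with exact binomial probabilities. A minimal sketch; the helper names and the list of illustrative true response rates are assumptions added for illustration.

```python
from math import comb

def pmf(k, n, p):
    """P(R = k) for R ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def reject_prob(region, p, n=40):
    """Probability that the responder count falls into the rejection region."""
    return sum(pmf(k, n, p) for k in region)

extreme = [12] + list(range(16, 41))   # "extreme borrowing" region, c = 0.9976
ump = list(range(13, 41))              # UMP region without borrowing

# Columns: true response rate, extreme-borrowing region, UMP region
for p_true in (0.2, 0.3, 0.4, 0.5):
    print(p_true,
          round(reject_prob(extreme, p_true), 3),
          round(reject_prob(ump, p_true), 3))
```

At the null value 0.2 both regions stay below 0.05, while at the alternatives printed here (0.3, 0.4, 0.5) the non-monotone region rejects less often than the UMP region.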

SLIDE 17

If $P(p > p_0 \mid r_{ped}, r_{adu})$ is monotonically increasing in $r_{ped}$,

then there exists $c'$ with $P(p > p_0 \mid r_{ped}, r_{adu}) \ge c' \iff r_{ped} \ge b_{UMP}(\alpha)$

and $b_{UMP}(\alpha)$ is the level $\alpha$ UMP test boundary.

Borrowing from adult information (1)

SLIDE 18

If $P(p > p_0 \mid r_{ped}, r_{adu})$ is not monotonically increasing in $r_{ped}$, then either:

(1) a threshold $c'$ can still be identified with

$P(p > p_0 \mid r_{ped}, r_{adu}) \ge c' \iff r_{ped} \ge b_{UMP}(\alpha)$ (∗)

or

(2) if no $c'$ with (∗) can be identified, then either the

test does not control type I error
or
test controls type I error but is not UMP.

Borrowing from adult information (2)

SLIDE 19

View decision rule in Bayesian approach as test function $\varphi(r_{ped}) = 1_{\{P(p > p_0 \mid r_{ped}, r_{adu}) \ge c\}}$

→ There is nothing better than the UMP test!

This holds for all situations in which UMP tests exist:

exponential family distribution

one-sided tests, two-sided tests (equivalence situation)
one-sided comparison of two means of normal variables …

This also holds in situations in which UMP unbiased tests exist:

two-sided comparisons
comparison of two proportions …

True for any (adaptive) borrowing mechanism (power prior, mixture prior, hierarchical model, test-then-pool, …) (see Viele et al. (2014))

Proven by Psioda and Ibrahim (2018) for one-sample one-sided normal test with borrowing using a fixed power prior.

Borrowing from adult information: Summary

SLIDE 20

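$d_C$: realizations of current data $D_C$ collected to test: $\vartheta_C \in H_0$ vs. $\vartheta_C \in H_1$

Without historical data:

Lehmann (1986) notation: the UMP hypothesis test is ($T$ sufficient test statistic)

$\varphi(d_C) = \begin{cases} 1 & \text{if } T(d_C) \in \text{RejectionRegion (reject } H_0\text{)} \\ 0 & \text{if } T(d_C) \in \text{AcceptanceRegion (accept } H_0\text{)} \end{cases}$

→ power function $E_{\vartheta_C}[\varphi(D_C)]$
→ type I error control: $E_{\vartheta_C}[\varphi(D_C)] \le \alpha$ for all $\vartheta_C \in H_0$

With historical data:

Borrow from observed historical data $d_H$ (from $D_H$) by:

$\varphi_B(d_C; d_H) = \begin{cases} 1 & \text{if } T(d_C) \in \text{RejectionRegion}(d_H) \\ 0 & \text{if } T(d_C) \in \text{AcceptanceRegion}(d_H) \end{cases}$

→ power function $E_{\vartheta_C}[\varphi_B(D_C; d_H)] = E_{\vartheta_C, \vartheta_H}[\varphi_B(D_C; D_H) \mid D_H = d_H]$
→ type I error: $\max_{\vartheta_C \in H_0} E_{\vartheta_C}[\varphi_B(D_C; d_H)]$

(note: $\vartheta_C$ may be multidimensional)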

In general

SLIDE 21

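For frequentist characteristics: interest in power function

$E_{\vartheta_C}[\varphi_B(D_C; d_H)] = E_{\vartheta_C, \vartheta_H}[\varphi_B(D_C; D_H) \mid D_H = d_H]$

But: fixing $d_H$ may be perceived as not objective enough, since it is an individual case study.

Caution:

Simulating $d_C, d_H$ (according to $\vartheta_C, \vartheta_H$) and evaluating $\varphi_B(d_C; d_H)$ → $E_{\vartheta_C, \vartheta_H}[\varphi_B(D_C; D_H)]$

but $E_{\vartheta_C, \vartheta_H}[\varphi_B(D_C; D_H)] \ne E_{\vartheta_C, \vartheta_H}[\varphi_B(D_C; D_H) \mid D_H = d_H]$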

Simulating operating characteristics of borrowing methods (1)

SLIDE 22

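Proposals

A
(1) Simulate $d_H$ (according to $\vartheta_H$)
(2) Repeatedly simulate $d_C$ (according to $\vartheta_C$) → evaluate $E_{\vartheta_C}[\varphi_B(D_C; d_H)]$
(3) Calculate type I error: $\max_{\vartheta_C \in H_0} E_{\vartheta_C}[\varphi_B(D_C; d_H)] = \alpha_{d_H}$
(4) Compare to power function of level $\alpha_{d_H}$ test w/o borrowing ($E_{\vartheta_C}[\varphi_{d_H}(D_C)]$): $E_{\vartheta_C}[\varphi_B(D_C; d_H)] - E_{\vartheta_C}[\varphi_{d_H}(D_C)]$
(5) Repeat (1) - (4)
(6) Report $E_{\vartheta_H}\big[E_{\vartheta_C}[\varphi_B(D_C; d_H)] - E_{\vartheta_C}[\varphi_{d_H}(D_C)]\big]$

B
Show relationship: $d_H \leftrightarrow \alpha_{d_H}$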

Simulating operating characteristics of borrowing methods (2)
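
The difference between conditioning on the observed historical data and averaging over it can be illustrated with a toy example. This is a sketch under loudly stated assumptions: `boundary_toy` is a hypothetical borrowing rule invented for illustration (it is not one of the methods in the talk), both trials are binomial with 40 patients, and exact binomial sums replace simulation so that no Monte Carlo noise is involved.

```python
from math import comb

def pmf(k, n, p):
    """P(R = k) for R ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tail(b, n, p):
    """P(R >= b) for R ~ Bin(n, p)."""
    return sum(pmf(k, n, p) for k in range(b, n + 1))

def boundary_toy(r_h):
    """Hypothetical rule: lower the rejection boundary if the historical arm responded well."""
    return 11 if r_h >= 12 else 13

n_c, n_h, p0 = 40, 40, 0.2  # assumed sample sizes and null response rate

def conditional_type1(r_h):
    """alpha_{d_H}: type I error given the observed historical count r_h.

    For a one-sided threshold rule the maximum over H0 is attained at p0.
    """
    return tail(boundary_toy(r_h), n_c, p0)

def unconditional_type1(p_h):
    """Type I error averaged over the distribution of the historical count."""
    return sum(pmf(r_h, n_h, p_h) * conditional_type1(r_h) for r_h in range(n_h + 1))

# Proposal B: relationship between d_H and alpha_{d_H}
for r_h in (5, 12, 20):
    print("d_H =", r_h, "alpha_dH =", round(conditional_type1(r_h), 4))
print("averaged over D_H (p_H = 0.2):", round(unconditional_type1(0.2), 4))
```

In this toy setting the averaged type I error lies strictly between the conditional values, so it describes neither a trial that observed favourable historical data nor one that did not.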

SLIDE 23

If type I error control is desired in a situation where a UMP (unbiased) test exists, external information is effectively discarded.

For a given historical data setting, choose from the available power functions for current data.

If prior information is reliable and consistent with the current information, the final operating characteristics of the trial can be improved: increased power or lower type I error, depending on where prior information is placed (but at the expense of the other characteristic).

→ Incorporation of prior information can give a rationale for type I error inflation with the benefit of a power gain; the amount of type I error inflation reflects the degree of reliance on prior information.

Conclusion

SLIDE 24

Gravestock I, Held L; COMBACTE-Net consortium (2017). Adaptive power priors with empirical Bayes for clinical trials. Pharmaceutical Statistics 16(5): 349-360.

Kopp-Schneider A, Calderazzo S, Wiesenfarth M (2020). Power gains by using external information in clinical trials are typically not possible when requiring strict type I error control. Biometrical Journal 62(2): 361-374.

Lehmann E (1986). Testing statistical hypotheses (2nd ed.). Wiley Series in Probability and Statistics. New York: John Wiley & Sons.

Psioda MA, Ibrahim JG (2018). Bayesian clinical trial design using historical data that inform the treatment effect. Biostatistics 20(3): 400-415.

Viele K, Berry S, Neuenschwander B, et al. (2014) Use of historical control data for