Hypotheses with two variates: two sample hypotheses
R.W. Oldford
Common hypotheses

Recall some common circumstances and hypotheses:

◮ Hypotheses about the distribution of a single random variable Y for which
  we have sample values y1, y2, . . . , yn. Common hypotheses include:
    ◮ H0 : µ = E(Y ) = µ0 for some specified value µ0, or
    ◮ H0 : Y ∼ FY (y) for some specified (possibly only up to unknown
      parameters) distribution FY (y).
◮ Hypotheses about the similarity of the distributions of two random variables
  X and Y for which we have a sample from each, namely x1, x2, . . . , xm and
  y1, y2, . . . , yn. Common hypotheses include:
    ◮ H0 : µY = µX ; the distributions match on some key features (e.g. they
      share a specified distribution family, like N(µ, σ2), parameterized by
      these features).
    ◮ H0 : FY (y) = FX (x) without specifying either distribution.
◮ Hypotheses about the joint distribution of X and Y for which we have
  sample pairs of values (x1, y1), (x2, y2), . . . , (xn, yn). Common hypotheses
  include:
    ◮ H0 : ρXY = 0; that is, X and Y are uncorrelated.
    ◮ H0 : FX,Y (x, y) = FX (x) × FY (y); that is, X and Y are independent
      (written X ⊥⊥ Y ).
Common hypotheses - X and Y

Suppose we have two univariate samples: x1, x2, . . . , xm and y1, y2, . . . , yn.
We model these as independent realizations of two univariate random variables,
say X and Y respectively. Suppose we are interested in the hypothesis
H0 : E(X) = E(Y ).

◮ We need a discrepancy measure; how about d = |x̄ − ȳ|? This may or may
  not be the best discrepancy measure, but unusually large values would be
  evidence against the hypothesis.
◮ We also need to be able to generate new realizations xi and yi when H0
  holds. This requires a little more thinking.

How to generate the data under H0 is not obvious. Suppose we assume that
both X and Y are continuous random variables.

◮ If we had the quantile function QX (u), say for all u ∈ (0, 1), then we could
  generate new xis by generating uniform U(0, 1) realizations ui and letting
  xi = QX (ui); similarly for Y . We could use the observed values to produce
  estimates of QX (u) and QY (u) and use these instead.
◮ But we still need to ensure that E(X) = E(Y ) for the generated data!
Common hypotheses - X and Y

We might generate the new xis from an estimated quantile function QX (u), but
one whose mean we have adjusted to be the same as that of the estimated
quantile function QY (u). How about

generateFromMeanShiftData <- function(data, size = length(data), mu = 0) {
  # shift the data to have mean mu
  dataWithMeanMu <- data - mean(data) + mu
  # sample from the estimated quantile function of the shifted data
  quantile(dataWithMeanMu, probs = runif(size))
}

Together with

discrepancyMeans <- function(data) {
  # assume data is a list of named components x and y
  x <- data$x
  y <- data$y
  abs(mean(x) - mean(y))
}

we should be able to get a test of the hypothesis that could at least guide our
exploration.
Common hypotheses - X and Y
First a function to get new x and y realizations simultaneously:
# First we need a function to generate new x and y at once
generateXYcommonMean <- function(data) {
  # assume data is a list containing x and y
  x <- data$x
  y <- data$y
  # return new data as a list of x and y
  newx <- generateFromMeanShiftData(x)
  newy <- generateFromMeanShiftData(y)
  list(x = newx, y = newy)
}
Common hypotheses - X and Y

We put these pieces together to test our hypothesis H0 : E(X) = E(Y ).
Assuming all of the previous functions, we can write a test function more
generally as

numericalTestXY <- function(x, y = NULL, nReps = 2000,
                            discrepancyFn = discrepancyMeans,
                            generateXY = generateXYcommonMean,
                            returnStats = FALSE) {
  # x may be a list of the two samples, or x and y may be given separately
  data <- if (is.null(y)) {
    list(x = x[[1]], y = x[[2]])
  } else {
    list(x = x, y = y)
  }
  dObserved <- discrepancyFn(data)
  discrepancies <- sapply(1:nReps,
                          FUN = function(i) {
                            newData <- generateXY(data)
                            discrepancyFn(newData)
                          })
  # proportion of generated discrepancies at least as large as that observed
  result <- mean(discrepancies >= dObserved)
  if (returnStats) {
    result <- list(p = result, dobs = dObserved, d = discrepancies)
  }
  result
}

Note that, as written, this is not peculiar to the hypothesis
H0 : E(X) = E(Y ). It should work for other numerical tests provided it is
handed the correct functions.
Common hypotheses - X and Y
We can try it out on a few examples.

x <- rnorm(35)
y <- rnorm(45, mean = 0.1)
numericalTestXY(x, y)
## [1] 0.1775
y <- rnorm(45, mean = 0.5)
numericalTestXY(x, y)
## [1] 0.002
y <- y + 0.01
numericalTestXY(x, y)
## [1] 5e-04
x <- rnorm(35, sd = 2)
y <- rnorm(45, mean = 0.1)
numericalTestXY(x, y)
## [1] 0.616
Common hypotheses - X and Y
We can try it out on some real data. First on comparing average weights of cars.
# Compare average weights
boxplot(wt ~ am, data = mtcars, col = "grey")

[Figure: boxplots of wt by am]

automatic <- mtcars$am == 0
manual <- !automatic
numericalTestXY(mtcars$wt[automatic], mtcars$wt[manual])
## [1] 0
Common hypotheses - X and Y
Now on comparing average time to reach a quarter mile.
# Compare average time to a quarter mile
boxplot(qsec ~ am, data = mtcars, col = "grey")

[Figure: boxplots of qsec by am]

result <- numericalTestXY(mtcars$qsec[automatic],
                          mtcars$qsec[manual],
                          returnStats = TRUE)
result$p
## [1] 0.1575
Common hypotheses - X and Y
We can get a histogram of all the discrepancies:

# Histogram of the generated discrepancies
hist(result$d, col = "grey", xlab = "discrepancy",
     main = "Testing mean qsec by am")
abline(v = result$dobs, lty = 2, lwd = 2, col = "red")
legend("topright",
       legend = c(paste0("discrepancy is ", round(result$dobs, 3)),
                  paste0("p-value is ", round(result$p, 3))))

[Figure: histogram "Testing mean qsec by am"; observed discrepancy 0.823, p-value 0.158]
Same with pictures
Could the same hypothesis be tested using a graphical discrepancy measure and
a lineup test? Here is one graphical discrepancy measure:

compareBoxplots <- function(data, subjectNo) {
  # assume data is a list of x and y
  y <- data$y
  x <- data$x
  group <- factor(c(rep("x", length(x)), rep("y", length(y))))
  # create data for boxplots
  bp_data <- data.frame(vals = c(x, y), group = group)
  boxplot(vals ~ group, data = bp_data,
          main = paste(subjectNo), cex.main = 4,
          xlab = "", ylab = "",
          col = adjustcolor(c("firebrick", "steelblue"), 0.5),
          xaxt = "n", yaxt = "n"  # turn off axes
  )
}

Note:
◮ data is now a list of two named vectors x and y, and
◮ boxplots are to be compared only by difference in locations, nothing else.
The lineup tests with boxplots
To test H0 : E(X) = E(Y ) with boxplots, we need only generate the new xs
and ys as before.

data <- list(x = mtcars$wt[automatic], y = mtcars$wt[manual])
lineup(data,
       generateSubject = generateXYcommonMean,
       showSubject = compareBoxplots,
       layout = c(4, 5))

Notes:
◮ The lineup() function was not changed.
◮ The lineup test looks very much like its numerical counterpart; instead of a
  discrepancyFn function it has a showSubject function.
The logic is identical.
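Since lineup() itself is not shown in these slides, here is a minimal sketch of what such a function might look like. This is an assumption for illustration, not the slides' definition, which may differ (for instance, there the true location is reported as an obfuscated logarithm rather than returned directly).

# A minimal sketch of a lineup() function; illustrative only, not the
# definition from the earlier slides.
lineup <- function(data, generateSubject, showSubject, layout = c(4, 5)) {
  nSubjects <- layout[1] * layout[2]
  # put the true data at a randomly chosen position among the subjects
  trueLoc <- sample(1:nSubjects, 1)
  savedPar <- par(mfrow = layout, mar = c(1, 1, 3, 1))
  for (i in 1:nSubjects) {
    # every subject but the true one is generated under H0
    subjectData <- if (i == trueLoc) data else generateSubject(data)
    showSubject(subjectData, i)
  }
  par(savedPar)
  invisible(trueLoc)  # returned invisibly so the viewer is not tipped off
}

The essential design point is that the viewer sees all layout[1] × layout[2] panels at once and must pick out the real data; under H0 they succeed with probability 1/nSubjects.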
A lineup test

[Figure: 20 lineup panels. True location: log(2.7810636828402e+77, base=33) - 39 = 12]
The lineup tests with boxplots - qsec
Test H0 : E(X) = E(Y ) with qsec.

data <- list(x = mtcars$qsec[automatic], y = mtcars$qsec[manual])
lineup(data,
       generateSubject = generateXYcommonMean,
       showSubject = compareBoxplots,
       layout = c(4, 5))
A lineup test

[Figure: 20 lineup panels. True location: log(1.27173474825649e+90, base=30) - 43 = 18]
Overlaid quantile plots as discrepancy
Or, we could test H0 : E(X) = E(Y ) using a different graphical discrepancy measure, say overlaid quantile plots.
compareQuantiles <- function(data, subjectNo) {
  # assume data is a list of x and y
  y <- sort(data$y)
  x <- sort(data$x)
  ylim <- extendrange(c(x, y))
  n <- length(y)
  m <- length(x)
  py <- ppoints(n)
  px <- ppoints(m)
  plot(px, x, ylim = ylim, xlim = c(0, 1),
       main = paste(subjectNo), cex.main = 4,
       xlab = "", ylab = "",
       col = adjustcolor("firebrick", 0.75),
       pch = 17, cex = 4,
       xaxt = "n", yaxt = "n"  # turn off axes
  )
  points(py, y, col = adjustcolor("steelblue", 0.75), pch = 19, cex = 4)
}

Note:
◮ quantiles are to be compared only by difference in locations, that is, height.
A lineup test

Test H0 : E(X) = E(Y ), for wt:

data <- list(x = mtcars$wt[automatic], y = mtcars$wt[manual])
lineup(data,
       generateSubject = generateXYcommonMean,
       showSubject = compareQuantiles,
       layout = c(4, 5))
A lineup test

[Figure: 20 lineup panels. True location: log(8.59504455717143e+63, base=9) - 64 = 3]
A lineup test

Test H0 : E(X) = E(Y ), for qsec:

data <- list(x = mtcars$qsec[automatic], y = mtcars$qsec[manual])
lineup(data,
       generateSubject = generateXYcommonMean,
       showSubject = compareQuantiles,
       layout = c(4, 5))
A lineup test

[Figure: 20 lineup panels. True location: log(4.74284397516047e+80, base=16) - 60 = 7]
A more general hypothesis

What about the more general hypothesis involving two samples? Namely,
H0 : FY (y) = FX (x) for unknown FX (x) and FY (y).

The graphical discrepancies previously used would still work. But what about
generating xs and ys when H0 holds?

If H0 holds, then FY (y) = FX (x) and the xs and ys are simply two different
samples from the same distribution. That means we could combine the values
x1, . . . , xm and y1, . . . , yn into a single sample of size m + n. Label the new
sample as z1, . . . , zm, zm+1, . . . , zm+n. The observed xs are simply a sample
of m values (without replacement) from the zs; the ys are the remaining
unsampled zs. We could generate new xs and new ys in exactly the same way.

Note: there are many other ways that we might use the zs to generate new xs
and ys as well.
A more general hypothesis - generating new data
Following the proposed method, we write a new function to generate the data
under H0:

# First we need a function to generate new x and y at once
mixRandomly <- function(data) {
  # assume data is a list containing x and y
  x <- data$x
  y <- data$y
  m <- length(x)
  n <- length(y)
  # pool the samples and draw m of them (without replacement) as the new xs
  z <- c(x, y)
  xIndices <- sample(1:(m + n), m, replace = FALSE)
  newx <- z[xIndices]
  newy <- z[-xIndices]  # the remainder
  list(x = newx, y = newy)
}
A more general hypothesis - lineup test
H0 : FX (x) = FY (y) can now be tested using either graphical discrepancy.
With boxplots as

lineup(data,
       generateSubject = mixRandomly,
       showSubject = compareBoxplots,
       layout = c(4, 5))

or with quantile plots as

lineup(data,
       generateSubject = mixRandomly,
       showSubject = compareQuantiles,
       layout = c(4, 5))

Only the graphical discrepancy changes.
A lineup test H0 : FX (x) = FY (y) - wt

[Figure: 20 lineup panels. True location: log(7.00340620250571e+124, base=22) - 83 = 10]
A lineup test H0 : FX (x) = FY (y) - wt

[Figure: 20 lineup panels. True location: log(4.05362155971444e+67, base=7) - 63 = 17]
A lineup test H0 : FX (x) = FY (y) - qsec

[Figure: 20 lineup panels. True location: log(2.13598703592091e+96, base=32) - 44 = 20]
A lineup test H0 : FX (x) = FY (y) - wt

[Figure: 20 lineup panels. True location: log(2.15356351058832e+79, base=21) - 49 = 11]
A numerical test H0 : FX(x) = FY (y)
A popular numerical test of this hypothesis is the Kolmogorov-Smirnov test, which uses the maximum absolute difference between the empirical distribution functions of the two samples over all t. In R, the function ks.test() computes the test, providing the discrepancy measure as its “statistic” together with a p-value usually based on an asymptotic approximation. We could write a corresponding discrepancy measure for numericalTestXY() as

ks.discrep <- function(data) { ks.test(data$x, data$y)$statistic }

and use this to estimate significance levels by simulation:

numericalTestXY(x = mtcars$qsec[automatic],
                y = mtcars$qsec[manual],
                discrepancyFn = ks.discrep,
                generateXY = mixRandomly)
## [1] 0.2855
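The generator mixRandomly is defined earlier in these notes. Under H0 : FX(x) = FY (y) the group labels carry no information, so a natural generator pools the two samples and randomly reassigns observations to the groups. A minimal sketch of what such a generator might look like (the name and the data$x/data$y structure follow the notes; the exact implementation is an assumption):

```r
# Sketch of a generator consistent with H0 : FX(x) = FY(y):
# pool both samples, permute, and split back into groups of the
# original sizes m and n.
mixRandomly <- function(data) {
  m <- length(data$x)
  n <- length(data$y)
  pool <- sample(c(data$x, data$y))    # random permutation of the pooled sample
  list(x = pool[1:m], y = pool[(m + 1):(m + n)])
}
```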
Different tests for H0 : E(X) = E(Y )
How about using the following?

discrepancyMedians <- function(data) { abs(median(data$x) - median(data$y)) }

Not the best discrepancy measure for testing differences in means! Or maybe the usual t-statistic? The discrepancy measure is d = | t | with

t = (x̄ − ȳ) / sqrt( σ̂²x/m + σ̂²y/n )

where x̄ and ȳ are the sample means and σ̂²x and σ̂²y the sample variances based on m and n observations. When σx = σy and both X and Y are normally distributed, we even know the distribution of t when H0 holds! Even if σx ≠ σy, it can be approximated. So we could just use the built-in t.test() function:

t.test(x, y)                    # for different variances, or
t.test(x, y, var.equal = TRUE)  # for equal variances

Alternatively, if assuming normality seems unjustified, perhaps we might still simulate using the t-statistic?

t.stat <- function(data) { abs(t.test(x = data$x, y = data$y)$statistic) }

Note that this statistic was constructed based on normal theory, though.
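The slides also use a measure called discrepancyMeans, defined earlier in the notes. By analogy with discrepancyMedians above, it is presumably something like the following (the body is an assumption; only the name appears in these slides):

```r
# Presumed definition, mirroring discrepancyMedians:
# the absolute difference in sample means.
discrepancyMeans <- function(data) { abs(mean(data$x) - mean(data$y)) }
```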
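The index vectors automatic and manual are defined earlier in the notes. In mtcars, the transmission indicator am is 0 for automatic and 1 for manual, so the split presumably looks like this (the exact form used in the notes is an assumption):

```r
# Split mtcars by transmission type via the am column
# (0 = automatic, 1 = manual in the built-in mtcars data).
automatic <- mtcars$am == 0
manual    <- mtcars$am == 1
```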
Different tests for H0 : E(X) = E(Y )
We could try these out! Here on the qsec variable of mtcars:

x <- mtcars$qsec[automatic] ; y <- mtcars$qsec[manual]

In order of increasing assumptions about X and Y:

# Using the absolute difference in means
numericalTestXY(x, y, discrepancyFn = discrepancyMeans)
## [1] 0.1505

# Using the two-sample t-statistic (unequal variances)
numericalTestXY(x, y, discrepancyFn = t.stat)
## [1] 0.213

# Using the normal theory two-sample t-statistic (unequal variances)
t.test(x, y)$p.value
## [1] 0.2093498

# Using the normal theory two-sample t-statistic (pooled variances)
t.test(x, y, var.equal = TRUE)$p.value
## [1] 0.2056621

# And finally, using the absolute difference in MEDIANS
numericalTestXY(x, y, discrepancyFn = discrepancyMedians)
## [1] 0.4255
Be careful! The recommendation here is discrepancyMeans, since it rests on the fewest assumptions.
On numerical tests
When we constructed the function numericalTestXY() we were imagining the two-sample problem. As with the lineup() test function, it might be better to write a more general numericalTest() function that would work more broadly. For example,

numericalTest <- function(data, nReps = 2000, discrepancyFn, generateFn,
                          returnStats = FALSE) {
  dObserved <- discrepancyFn(data)
  discrepancies <- sapply(1:nReps,
                          FUN = function(i) {
                            newData <- generateFn(data)
                            discrepancyFn(newData)
                          })
  result <- mean(discrepancies >= dObserved)
  if (returnStats) {
    result <- list(p = result, dobs = dObserved, d = discrepancies)
  }
  result
}
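The two-sample tests above can then be recovered by packaging both samples into a single data object. A sketch, reusing the ks.discrep measure and the mixRandomly generator from the earlier slides (the list structure for data is an assumption consistent with how those functions access data$x and data$y):

```r
# Package the two samples into one object and test H0 : FX(x) = FY(y)
# with the general numericalTest() function.
data <- list(x = mtcars$qsec[mtcars$am == 0],   # automatic (mtcars coding)
             y = mtcars$qsec[mtcars$am == 1])   # manual
numericalTest(data,
              discrepancyFn = ks.discrep,
              generateFn = mixRandomly)
```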
On numerical tests
The advantage of writing the general numericalTest() function is that the code exactly mirrors the structure of a test of significance. Notes:
◮ the structure of numericalTest() makes no assumption about
  ◮ the nature of the data,
  ◮ the hypothesis being tested,
  ◮ or the discrepancy measure being used;
◮ providing the correct functions for the arguments discrepancyFn and generateFn is the responsibility of the user;
◮ as with lineup’s generateSubject, the most important of these is generateFn;
◮ as with lineup’s showSubject, some discrepancyFn values will be better than others for a given H0;
◮ get either wrong and the test can become nonsensical.
WARNING: in most cases getting generateFn right is non-trivial and generally requires considerable statistical expertise. Much statistical research is focused on this question!
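When checking whether a chosen generateFn behaves sensibly, the returnStats option makes the simulated null distribution available for inspection. For example, reusing ks.discrep and mixRandomly from the earlier slides (the automatic/manual split of mtcars$qsec is assumed as before):

```r
# Inspect the simulated null distribution of the discrepancy measure.
data <- list(x = mtcars$qsec[mtcars$am == 0],  # automatic (mtcars coding)
             y = mtcars$qsec[mtcars$am == 1])  # manual
res <- numericalTest(data,
                     discrepancyFn = ks.discrep,
                     generateFn = mixRandomly,
                     returnStats = TRUE)
hist(res$d, main = "Simulated null discrepancies")
abline(v = res$dobs, col = "red")   # observed discrepancy
res$p                               # simulated significance level
```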