Pre-process the data (John Blischak, Instructor, DataCamp)


SLIDE 1

DataCamp Differential Expression Analysis with limma in R

Pre-process the data


John Blischak

Instructor

SLIDE 2

Mechanism of doxorubicin-induced cardiotoxicity

2x2 design to study mechanism of doxorubicin:
- 2 genotypes: wild type (wt), Top2b null (top2b)
- 2 treatments: PBS (pbs), doxorubicin (dox)
Zhang et al. 2012

dim(eset)
Features  Samples
   29532       12

table(pData(eset)[, c("genotype", "treatment")])
        treatment
genotype dox pbs
   top2b   3   3
   wt      3   3

SLIDE 3

Inspect the features

plotDensities(eset, group = pData(eset)[, "genotype"], legend = "topright")

SLIDE 4

Pre-processing steps

1. Log transform
2. Quantile normalize
3. Filter
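The three steps can be sketched on simulated data. This is only an illustration under stated assumptions: the matrix `raw`, its dimensions, and the filtering cutoff of 5 are hypothetical, and with an ExpressionSet the same operations would be applied to `exprs(eset)`.

```r
library(limma)

set.seed(1)
# Hypothetical raw intensities: 1000 features x 12 samples, right-skewed
raw <- matrix(rlnorm(1000 * 12, meanlog = 5, sdlog = 1.5), nrow = 1000,
              dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:12)))

# 1. Log transform to reduce the right skew
x <- log(raw)

# 2. Quantile normalize so every sample shares the same distribution
x_norm <- normalizeBetweenArrays(x, method = "quantile")

# 3. Filter: keep features whose mean expression exceeds an illustrative cutoff
keep <- rowMeans(x_norm) > 5
x_filt <- x_norm[keep, ]
dim(x_filt)
```

In practice the cutoff is usually chosen by looking at the density plot after normalization rather than fixed in advance.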

SLIDE 5

Sanity check: Boxplot of Top2b

boxplot(<y-axis> ~ <x-axis>, main = "<title>")
boxplot(<gene expression> ~ <phenotype>, main = "<feature>")
boxplot(<Top2b expression> ~ <genotype>, main = "<Top2b info>")
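A minimal sketch of the template with the placeholders filled in, using simulated values rather than the study data. The vectors `expression` and `genotype` and the chosen means are hypothetical; they only mimic a gene that is much lower in null mice than in wild type.

```r
set.seed(1)
# Hypothetical log-expression of a single gene in 12 samples
genotype <- factor(rep(c("top2b", "wt"), each = 6))
expression <- c(rnorm(6, mean = 4, sd = 0.3),  # null mice: low expression
                rnorm(6, mean = 8, sd = 0.3))  # wild type: higher expression

# boxplot(<y-axis> ~ <x-axis>, main = "<title>")
bp <- boxplot(expression ~ genotype, main = "Top2b (simulated)")
bp$names
```

If the knocked-out gene does not show lower expression in the null group, the sample labels or the pre-processing deserve a second look.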

SLIDE 6

Check sources of variation

- Principal components analysis
- limma function plotMDS
- How many clusters do you anticipate?

SLIDE 7

Let's practice!

SLIDE 8

Model the data

SLIDE 9

Ready for analysis

# Plot principal components labeled by genotype
plotMDS(eset, labels = pData(eset)[, "genotype"], gene.selection = "common")

# Plot principal components labeled by treatment
plotMDS(eset, labels = pData(eset)[, "treatment"], gene.selection = "common")

SLIDE 10

Steps for differential expression analysis

1. Build the design matrix with model.matrix
2. Construct the contrasts matrix with makeContrasts
3. Test the contrasts with lmFit, contrasts.fit, and eBayes

SLIDE 11

Group-means model for doxorubicin study

$Y = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \epsilon$

- $\beta_1$: Mean expression level in top2b mice treated with dox
- $\beta_2$: Mean expression level in top2b mice treated with pbs
- $\beta_3$: Mean expression level in wt mice treated with dox
- $\beta_4$: Mean expression level in wt mice treated with pbs
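To make the group-means parameterization concrete, here is a small base R sketch. The response `y` is simulated noise standing in for one gene, so the numbers carry no biological meaning; the point is only that, with no intercept, each column of the design matrix is an indicator for one group and each fitted coefficient equals that group's mean.

```r
# Group labels for the 2x2 design, 3 replicates per group (as in the study)
genotype  <- rep(c("top2b", "wt"), each = 6)
treatment <- rep(rep(c("dox", "pbs"), each = 3), times = 2)
group <- factor(paste(genotype, treatment, sep = "."))

# No intercept: one indicator column X_i per group
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
colSums(design)  # number of samples in each group

# Simulated expression values for a single hypothetical gene
set.seed(1)
y <- rnorm(12)

# Least-squares coefficients of the group-means model
beta <- lm.fit(design, y)$coefficients

# They coincide with the per-group sample means
group_means <- tapply(y, group, mean)
all.equal(as.vector(beta), as.vector(group_means))
```

Because the coefficients are plain group means, any comparison between groups can later be written as a simple difference of coefficients.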

SLIDE 12

Contrasts for doxorubicin study

            $\beta_1$   $\beta_2$   $\beta_3$   $\beta_4$
genotype    top2b       top2b       wt          wt
treatment   dox         pbs         dox         pbs

- Response of wild type mice to dox treatment: $\beta_3 - \beta_4 = 0$
- Response of Top2b null mice to dox treatment: $\beta_1 - \beta_2 = 0$
- Differences between Top2b null and wild type mice in response to dox treatment: $(\beta_1 - \beta_2) - (\beta_3 - \beta_4) = 0$
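The three null hypotheses map directly onto columns of a contrasts matrix. The sketch below builds that matrix with limma's makeContrasts, passing the four group names as a character vector instead of a design matrix so that it runs without any data; the group and contrast names follow those used elsewhere in these slides.

```r
library(limma)

cm <- makeContrasts(
  dox_wt      = wt.dox - wt.pbs,                                  # beta_3 - beta_4
  dox_top2b   = top2b.dox - top2b.pbs,                            # beta_1 - beta_2
  interaction = (top2b.dox - top2b.pbs) - (wt.dox - wt.pbs),      # (beta_1 - beta_2) - (beta_3 - beta_4)
  levels = c("top2b.dox", "top2b.pbs", "wt.dox", "wt.pbs")
)

# Rows are the model coefficients, columns are the contrasts
cm
```

Each column holds the weights applied to the four coefficients, so every column sums to zero.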

SLIDE 13

Testing the doxorubicin study

1. Fit the model coefficients with lmFit
2. Fit the contrasts with contrasts.fit
3. Calculate the t-statistics with eBayes

# Summarize results
results <- decideTests(fit2)
summary(results)

# Create a Venn diagram
vennDiagram(results)

SLIDE 14

Let's practice!

SLIDE 15

Inspect the results

SLIDE 16

Inspect the results

# Create a Venn diagram
vennDiagram(results)

SLIDE 17

Histograms of p-values

- limma function topTable
- Specify contrast:

coef = "dox_wt"
coef = "dox_top2b"
coef = "interaction"

hist(runif(10000))
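The call hist(runif(10000)) shows the flat histogram expected when no gene is differentially expressed, because p-values are uniformly distributed under the null hypothesis. The base R sketch below contrasts that case with one where some genes truly differ. Everything in it is simulated and hypothetical: the two-group layout, the helper `p_for_shift`, the effect size of 2, and the 30% share of affected genes.

```r
set.seed(1)
group <- factor(rep(c("a", "b"), each = 6))  # hypothetical two-group layout

# p-value of a two-sample t-test for one simulated gene with a given mean shift
p_for_shift <- function(shift) {
  y <- rnorm(12) + shift * (group == "b")
  t.test(y ~ group)$p.value
}

# 2000 genes, none differentially expressed
p_null <- replicate(2000, p_for_shift(0))

# 2000 genes, 600 of which have a true shift between groups
p_signal <- c(replicate(1400, p_for_shift(0)),
              replicate(600, p_for_shift(2)))

par(mfrow = c(1, 2))
hist(p_null, main = "No signal", xlab = "p-value")
hist(p_signal, main = "Some genes differ", xlab = "p-value")
```

A roughly flat histogram suggests little or no signal for that contrast, while a spike near zero on top of a flat background suggests that a subset of genes responds.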

SLIDE 18

Volcano plot

- limma function volcanoplot
- x-axis: log-fold change in expression between contrasted groups
- y-axis: log odds of differential expression
- Specify contrast, e.g. coef = "dox_wt"

SLIDE 19

Testing for KEGG enrichment

- limma functions kegga and topKEGG
- Specify contrast, e.g. coef = "dox_wt"
- Specify mouse species with species = "Mm" (Mm == Mus musculus)

SLIDE 20

Let's practice!

SLIDE 21

Conclusion

SLIDE 22

Pre-processing features

Visualize with plotDensities

3 steps:
1. Log transform
2. Normalize
3. Filter

SLIDE 23

Visualize single genes with boxplot

top2b <- which(fData(eset)[, "symbol"] == "Top2b")
boxplot(exprs(eset)[top2b, ] ~ pData(eset)[, "genotype"],
        main = fData(eset)[top2b, ])

SLIDE 24

Check sources of variation

# Plot principal components labeled by genotype
plotMDS(eset, labels = pData(eset)[, "genotype"], gene.selection = "common")

# Plot principal components labeled by treatment
plotMDS(eset, labels = pData(eset)[, "treatment"], gene.selection = "common")

SLIDE 25

Flexibly testing various study designs

# Create single variable
group <- with(pData(eset), paste(genotype, treatment, sep = "."))
group <- factor(group)

# Create design matrix with no intercept
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

# Create a contrasts matrix
cm <- makeContrasts(dox_wt = wt.dox - wt.pbs,
                    dox_top2b = top2b.dox - top2b.pbs,
                    interaction = (top2b.dox - top2b.pbs) - (wt.dox - wt.pbs),
                    levels = design)

# Fit the model
fit <- lmFit(eset, design)

# Fit the contrasts
fit2 <- contrasts.fit(fit, contrasts = cm)

# Calculate the t-statistics for the contrasts
fit2 <- eBayes(fit2)

SLIDE 26

Histograms of p-values

topTable and hist
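A hedged sketch of how topTable and hist fit together, run on simulated data so that it is self-contained. The expression matrix `y`, the 50 responding genes, and the effect size of 3 are hypothetical; with the real data, `fit2` would come from the model fitted to `eset`.

```r
library(limma)

set.seed(1)
# Simulated stand-in for the study: 500 genes x 12 samples, 4 groups of 3
group <- factor(rep(c("top2b.dox", "top2b.pbs", "wt.dox", "wt.pbs"), each = 3))
y <- matrix(rnorm(500 * 12), nrow = 500,
            dimnames = list(paste0("gene", 1:500), NULL))
# Make the first 50 genes respond to dox in wild type samples only
y[1:50, group == "wt.dox"] <- y[1:50, group == "wt.dox"] + 3

# Group-means design and a single contrast of interest
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
cm <- makeContrasts(dox_wt = wt.dox - wt.pbs, levels = design)

# Fit the model, fit the contrast, and compute moderated t-statistics
fit2 <- eBayes(contrasts.fit(lmFit(y, design), contrasts = cm))

# Request every gene (topTable returns only the top 10 by default)
stats_dox_wt <- topTable(fit2, coef = "dox_wt", number = nrow(fit2),
                         sort.by = "none")

# Histogram of p-values for this contrast
hist(stats_dox_wt[, "P.Value"], main = "dox_wt", xlab = "p-value")
```

Setting number = nrow(fit2) matters here: a histogram of only the top-ranked genes would be concentrated near zero regardless of how much signal the contrast contains.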

SLIDE 27

Volcano plots

# Extract the gene symbols
gene_symbols <- fit2$genes[, "symbol"]

# Create a volcano plot for the contrast dox_wt
volcanoplot(fit2, coef = "dox_wt", highlight = 5, names = gene_symbols)

SLIDE 28

Test for enrichment of gene sets

# Extract the entrez gene IDs
entrez <- fit2$genes[, "entrez"]

# Test for enriched KEGG Pathways for contrast dox_wt
enrich_dox_wt <- kegga(fit2, coef = "dox_wt", geneid = entrez, species = "Mm")

# View the top 5 enriched KEGG pathways
topKEGG(enrich_dox_wt, number = 5)

                                                             Pathway
path:mmu05322                           Systemic lupus erythematosus
path:mmu03008                      Ribosome biogenesis in eukaryotes
path:mmu05034                                             Alcoholism
path:mmu05412 Arrhythmogenic right ventricular cardiomyopathy (ARVC)
path:mmu05330                                    Allograft rejection
                N Up Down         P.Up       P.Down
path:mmu05322  76 37    1 3.657708e-10 9.999999e-01
path:mmu03008  71 34    4 3.320811e-09 9.997410e-01
path:mmu05034 130 47   17 2.358456e-07 9.733029e-01
path:mmu05412  52  2   26 9.995025e-01 4.140466e-07
path:mmu05330  26 16    0 6.720834e-07 1.000000e+00

SLIDE 29

Congrats on completing the course!