Analysis of gene copy number changes in tumor phylogenetics



SLIDE 1

Analysis of gene copy number changes in tumor phylogenetics

Jun Zhou, Yu Lin, Vaibhav Rajan, Willia Hoskins, Bing Feng and Jijun Tang

SLIDE 2

Background

• Cancer can be modeled by molecular evolutionary processes (specifically deletion, insertion, etc.)

• A mutational phylogenetic tree can be built with nodes as clones and subclones and directed edges as mutation processes.

Image from Davis A. and Navin N., 2016

SLIDE 3

In this paper…

• Genetic marker: copy number of an array of genes, detected by fluorescence in situ hybridization (FISH) at the single-cell level
  • Caused by insertion/deletion of genes
  • Caused by chromosomal aberrations

• Data structure:
  • A clone is represented as a tuple of copy numbers
  • A patient is represented as a matrix of copy numbers -> main (and only) input to the phylogenetic problem
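
The data structure above can be sketched in a few lines of Python. This is an illustration only: the gene names and copy numbers are made up, and treating each matrix row as one cell's FISH readout is an assumption about the layout, not something the slides specify.

```python
from collections import Counter

# Hypothetical probe panel; the names are placeholders for illustration only.
GENES = ("geneA", "geneB", "geneC")

# Assumed layout: one row per cell, one column per gene.
patient = [
    (2, 2, 2),
    (2, 3, 2),
    (2, 3, 2),
    (4, 3, 1),
]

def observed_clones(matrix):
    """Return the distinct copy-number tuples together with their cell counts."""
    return Counter(tuple(row) for row in matrix)

clones = observed_clones(patient)
```

The distinct tuples in `clones` are the observed nodes that the tree-building problems on the following slides take as input.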

SLIDE 4

Problem formulation

• Distance-based Minimum Tree

• NP-hard

SLIDE 5

Minimum Spanning Tree

• Input: a set of vertices 𝑊 and a |𝑊| × |𝑊| distance matrix, or a metric on 𝑊.

• Output: a 1-connected tree T = (𝑊, 𝐹) with minimum weight (the sum of the distances between vertices joined by an edge)

• Prim’s or Kruskal’s greedy algorithm solves this in polynomial time
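
As a minimal sketch of the greedy approach, here is Prim's algorithm on a full distance matrix. The function name `prim_mst` and the example matrix are illustrative choices, not taken from the slides.

```python
def prim_mst(dist):
    """Prim's algorithm on a full symmetric distance matrix.

    Returns (edges, weight); edges are (parent, child) index pairs.
    """
    n = len(dist)
    if n == 0:
        return [], 0
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link from the tree to each vertex
    parent = [None] * n
    best[0] = 0
    edges, weight = [], 0
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        weight += best[u]
        if parent[u] is not None:
            edges.append((parent[u], u))
        # Relax links from the newly added vertex.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges, weight

dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
edges, weight = prim_mst(dist)
```

This version runs in O(n²) time, which is a reasonable fit when the input is already a dense distance matrix.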

SLIDE 6

Steiner Minimum Tree

• Steiner nodes: unobserved nodes (absent from the dataset)

• Input: a set of vertices 𝑊 and a metric.

• Output: a 1-connected tree T = (𝑊′, 𝐹), where 𝑊′ ⊇ 𝑊, with minimum weight (the sum of the distances between vertices joined by an edge)

• NP-hard

• For the case |𝑊| = 3, this reduces to the Median problem.
  • Sankoff’s algorithm in linear time

SLIDE 7

Steiner Minimum Tree – Sankoff’s algorithm

Image from Zhou J. et al., 2016

SLIDE 8

Rectilinear Steiner Minimum Tree (RSMT)

• Rectilinear (Manhattan) metric: the sum of the absolute differences between corresponding positions of two tuples

• Input: a set of vertices 𝑊.

• Output: a 1-connected tree T = (𝑊′, 𝐹), where 𝑊′ ⊇ 𝑊, with minimum weight (the sum of the distances between vertices joined by an edge) under the rectilinear metric.
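
A small worked example may help show why an unobserved node can lower the tree weight. It relies on a standard property of the rectilinear metric: the coordinate-wise median of three tuples minimizes the summed distance to them. The function names and the three tuples are illustrative.

```python
def rectilinear(a, b):
    """Manhattan distance between two equal-length copy-number tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def median_of_three(a, b, c):
    """Coordinate-wise median of three tuples."""
    return tuple(sorted(t)[1] for t in zip(a, b, c))

a, b, c = (2, 2, 4), (4, 2, 2), (2, 4, 2)
m = median_of_three(a, b, c)

# Spanning tree a-b-c on the observed nodes only
# (all pairwise distances are 4 here, so any spanning tree weighs the same).
path = rectilinear(a, b) + rectilinear(b, c)

# Star through the Steiner node m.
star = sum(rectilinear(m, x) for x in (a, b, c))
```

Here `m` is `(2, 2, 2)`, the spanning tree weighs 8, and the star through `m` weighs 6.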

SLIDE 9

RSMT Exact Algorithm

• Hanan’s theorem for the 2-D problem: there exists an RSMT whose Steiner points all lie on the Hanan grid

• Solution space is bounded

• Generalized to n-D

• An exact (and naïve) algorithm would enumerate all possible sets of Steiner nodes, compute a Minimum Spanning Tree on each augmented vertex set, and compare the weights
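
The naïve exact procedure can be sketched as follows. This is a brute-force illustration under stated assumptions, not the authors' implementation: all function names are made up, and the cap of n − 2 Steiner nodes uses the standard bound for a Steiner tree on n terminals. It is only practical for a handful of points.

```python
from itertools import combinations, product

def rectilinear(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def mst_weight(points):
    """Weight of a minimum spanning tree (Prim) under the rectilinear metric."""
    points = list(points)
    best = {p: rectilinear(points[0], p) for p in points[1:]}
    total = 0
    while best:
        p = min(best, key=best.get)
        total += best.pop(p)
        for q in best:
            best[q] = min(best[q], rectilinear(p, q))
    return total

def hanan_grid(points):
    """All tuples whose i-th value occurs as the i-th value of some input point."""
    axes = [sorted(set(col)) for col in zip(*points)]
    return set(product(*axes))

def exact_rsmt_weight(points):
    """Try every set of up to n-2 Steiner candidates from the Hanan grid."""
    points = list(dict.fromkeys(points))  # drop duplicates, keep order
    candidates = sorted(hanan_grid(points) - set(points))
    best = mst_weight(points)
    for k in range(1, max(len(points) - 2, 0) + 1):
        for steiner in combinations(candidates, k):
            best = min(best, mst_weight(points + list(steiner)))
    return best

weight_3d = exact_rsmt_weight([(2, 2, 4), (4, 2, 2), (2, 4, 2)])
weight_2d = exact_rsmt_weight([(0, 0), (2, 0), (1, 2)])
```

The number of candidate sets grows combinatorially with both the number of points and the number of genes, which is why heuristics are needed in practice.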

SLIDE 10

RSMT Heuristics

• Inspiration from the Median problem
  • Sankoff’s algorithm in linear time

• Inspiration from the Maximum Parsimony problem
  • Maximum parsimony tree: MP heuristics borrowed from the TNT package

SLIDE 11

RSMT Heuristics from Minimum Spanning Tree – Sankoff’s Algorithm revisited

Image from Zhou J. et al., 2016

SLIDE 12

Image from Zhou J. et al., 2016

RSMT Heuristics from Minimum Spanning Tree – iterative Sankoff’s Algorithm

SLIDE 13

RSMT Heuristics from Minimum Spanning Tree (MST)

• Minimize the number of Steiner nodes added by carefully selecting which nodes to add first.

• Steiner count for an observed node A: the number of triplets containing A that require a Steiner node to optimize tree weight

• Inference score for a Steiner node: the sum of the Steiner counts in the triplet defining it
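
The two scores can be illustrated with a rough sketch. Two assumptions are made here that the slides do not confirm: triplets are taken over all 3-subsets of observed nodes (the paper may restrict them, for example to neighbors in the MST), and a triplet "requires" a Steiner node when its coordinate-wise median is not one of its three members, which under the rectilinear metric is exactly when adding the median lowers the weight. All function names are hypothetical.

```python
from itertools import combinations

def median_of_three(a, b, c):
    return tuple(sorted(t)[1] for t in zip(a, b, c))

def steiner_counts(nodes):
    """Return (counts, medians).

    counts:  observed node -> number of triplets containing it that need a Steiner node
    medians: triplet -> the Steiner node (median) it defines
    """
    counts = {n: 0 for n in nodes}
    medians = {}
    for trip in combinations(nodes, 3):
        m = median_of_three(*trip)
        if m not in trip:  # the median is a new point, so adding it lowers the weight
            medians[trip] = m
            for n in trip:
                counts[n] += 1
    return counts, medians

def inference_scores(counts, medians):
    """triplet -> (Steiner node, sum of the Steiner counts of the triplet's members)."""
    return {trip: (m, sum(counts[n] for n in trip)) for trip, m in medians.items()}

nodes = [(2, 2, 4), (4, 2, 2), (2, 4, 2), (2, 2, 2)]
counts, medians = steiner_counts(nodes)
scores = inference_scores(counts, medians)
```

In this example only the triplet of the first three nodes needs a Steiner node, because every triplet containing `(2, 2, 2)` already has its median among its members.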

SLIDE 14

Image from Zhou J. et al., 2016

SLIDE 15

RSMT heuristics from Maximum Parsimony (MP)

• MP heuristics (TNT package) derive a tree whose leaves contain the dataset

• Dynamic programming assigns states to internal nodes

• Contract trivial edges (edges with weight 0 under the rectilinear metric)
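
The dynamic-programming and contraction steps can be sketched as below, assuming a rooted tree is already available (the TNT search itself is not reproduced). The dictionary-based tree encoding, the function names, and the `max_copy` bound on copy numbers are all hypothetical choices for illustration. The labeling is a Sankoff-style dynamic program with cost |s − t| per gene, solved independently for each gene.

```python
def sankoff_assign(children, leaves, root, max_copy):
    """Label internal nodes so that the total rectilinear edge weight is minimal.

    children: dict internal node -> list of child nodes
    leaves:   dict leaf node -> observed copy-number tuple
    """
    n_genes = len(next(iter(leaves.values())))
    states = range(max_copy + 1)
    labels = {node: list(t) for node, t in leaves.items()}
    for node in children:
        labels[node] = [None] * n_genes
    total = 0
    for g in range(n_genes):
        cost = {}  # cost[node][s]: best subtree cost if node has state s at gene g

        def up(node):
            if node in leaves:
                cost[node] = [0 if s == leaves[node][g] else float("inf") for s in states]
                return
            for ch in children[node]:
                up(ch)
            cost[node] = [
                sum(min(cost[ch][t] + abs(s - t) for t in states) for ch in children[node])
                for s in states
            ]

        def down(node, s):
            if node in leaves:
                return
            labels[node][g] = s
            for ch in children[node]:
                down(ch, min(states, key=lambda t: cost[ch][t] + abs(s - t)))

        up(root)
        s_root = min(states, key=lambda s: cost[root][s])
        total += cost[root][s_root]
        down(root, s_root)
    return {n: tuple(v) for n, v in labels.items()}, total

def contract_trivial_edges(children, labels, root):
    """Merge a child into its parent when both carry the same tuple (weight-0 edge)."""
    rep = {root: root}
    edges = []
    stack = [root]
    while stack:
        p = stack.pop()
        for c in children.get(p, []):
            if labels[c] == labels[p]:
                rep[c] = rep[p]
            else:
                rep[c] = c
                edges.append((rep[p], c))
            stack.append(c)
    return edges

children = {"root": ["x", "C"], "x": ["A", "B"]}
leaves = {"A": (2, 3), "B": (2, 1), "C": (2, 2)}
labels, total = sankoff_assign(children, leaves, "root", max_copy=4)
edges = contract_trivial_edges(children, labels, "root")
```

In this example both internal nodes receive the tuple `(2, 2)`, so they collapse onto the observed leaf `C`, leaving a tree with two edges of weight 1 each.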

SLIDE 16

RSMT heuristics from Maximum Parsimony (MP)

Image from Zhou J. et al., 2016

SLIDE 17

RSMT heuristics from Maximum Parsimony (MP)

Image from Zhou J. et al., 2016

SLIDE 18

Results for Breast Cancer real data

Image from Zhou J. et al., 2016

SLIDE 19

RSMT Results for simulated data

Image from Zhou J. et al., 2016

SLIDE 20

Thank you

Q&A

SLIDE 21

Duplication Steiner Minimum Tree (DSMT) (Chowdhury et al.)

• Input: a set of vertices 𝑊.

• Output: a 1-connected tree T = (𝑊′, 𝐹), where 𝑊′ ⊇ 𝑊, with minimum weight (the sum of the distances between vertices joined by an edge) under a generalized metric that incorporates large-scale duplication events.

SLIDE 22

DSMT Results for simulated data

Image from Zhou J. et al., 2016