4/24/09

CSCI1950-Z Computational Methods for Biology
Lecture 22

Ben Raphael
April 22, 2009

http://cs.brown.edu/courses/csci1950-z/

Outline

Network alignment and querying

  • PathBLAST
  • Color coding and randomized algorithms.

PathBLAST

  • Goal: identify conserved pathways (chains)
  • Idea: can be done efficiently by dynamic programming if networks are DAGs

Kelley et al (2003)

[Figure: path A, B, C, D aligned to path A’, X’, D’. Score: match (A, A’) + gap (B) + mismatch (C, X’) + match (D, D’)]

Why paths?


PathBLAST

(Kelley, et al. PNAS 2003)

  • Find conserved pathways in protein interaction maps of two species
  • Model & Scoring: (Whiteboard)

PathBLAST Scoring


PathBLAST

  • Problem: Networks are neither acyclic nor directed
  • Solution: Randomize
    – Impose random ordering on nodes, perform DP; repeat many times
  • On average, highest scoring path preserved in 2/L! subgraphs
  • Finds conserved paths of length L within networks of size n in O(L!n) expected time
  • Drawbacks
    – Computationally expensive
    – Restricts search to specific topology

Kelley et al (2003)

[Figure: node labels 1 to 5 shown in three random orderings: 1 4 2 3 5; 2 1 4 5 3; 5 2 1 3 4]
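The randomization step above can be sketched in Python. This is a minimal illustration on a single undirected scored graph, not the PathBLAST implementation, which searches an alignment of two species' networks. The function names, the edge-dictionary format, and the toy scores are assumptions made for the example.

```python
import random

def best_path_one_trial(nodes, edges, length, rng):
    """One trial: random node order, then DP on the resulting DAG.

    edges maps (u, v) -> score for an undirected edge; higher is better.
    Returns (score, path) for the best path with `length` vertices, or None.
    """
    order = list(nodes)
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    # Orient every edge from the lower-ranked endpoint to the higher-ranked one.
    incoming = {v: [] for v in nodes}
    for (u, v), score in edges.items():
        lo, hi = (u, v) if rank[u] < rank[v] else (v, u)
        incoming[hi].append((lo, score))
    # layer[v] = (score, path) of the best path with j vertices ending at v.
    # Ranks strictly increase along a path, so every path is simple.
    layer = {v: (0.0, (v,)) for v in nodes}
    for _ in range(length - 1):
        nxt = {}
        for v in nodes:
            cands = [(layer[u][0] + s, layer[u][1] + (v,))
                     for u, s in incoming[v] if u in layer]
            if cands:
                nxt[v] = max(cands)
        layer = nxt
    return max(layer.values(), default=None)


def random_order_search(nodes, edges, length, trials, seed=0):
    """Repeat the trial many times and keep the best path seen."""
    rng = random.Random(seed)
    found = [best_path_one_trial(nodes, edges, length, rng) for _ in range(trials)]
    found = [f for f in found if f is not None]
    return max(found, default=None)


# Toy graph (hypothetical scores): the best 4-vertex path is a-b-c-d.
nodes = ["a", "b", "c", "d", "e"]
edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("c", "d"): 3.0,
         ("a", "c"): 0.5, ("d", "e"): 0.1}
score, path = random_order_search(nodes, edges, length=4, trials=200)
print(score, path)
```

A given path of L vertices survives one random ordering only if its vertices appear in increasing or decreasing rank, which is where the 2/L! figure comes from; with L = 4 that is 1 in 12, so 200 trials make a miss very unlikely.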

PathBLAST


PathBLAST: Computational Formulation

  • I = {start vertices}, e.g. receptors.
  • Goal: Find highest scoring paths I → v for all v in G.

Scott, et al. JCB 2006

PathBLAST: Computational Formulation

  • Given:
    – Undirected weighted graph G = (V, E, w)
    – Set of start vertices I, and end vertex v
  • Find: a minimum-weight simple path P = (v_1, e_1, v_2, e_2, …, e_{k-1}, v_k) starting in I and ending at v:
    – v_1 in I and v_k = v.
  • Recall: in a simple path, v_i ≠ v_j if i ≠ j
  • NP-hard in general (reduction from TSP)
  • Let w_k(v) = weight of the minimum-weight path above (k vertices, ending at v).
    – Dynamic programming solution (whiteboard)

Scott, et al. JCB 2006


Color‐coding (Alon, Yuster, & Zwick)

  • Assign each vertex a random color between 1 and k.
  • Colorful path: path with distinct colors.
  • Colorful path ⇒ simple path.
  • Goal: find colorful paths
    – Dynamic programming solution (whiteboard)
  • High-scoring path not discovered when two vertices have the same color.
  • Repeat for many random colorings. (How many?)
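The slides leave the dynamic program to the whiteboard, so the following Python sketch uses the standard color-coding recurrence over (end vertex, set of colors used) as an assumption about what was presented. Colors run from 0 to k-1 here rather than 1 to k, color sets are bitmasks, and the adjacency format, function names, and toy weights are hypothetical. The helper `trials_needed` addresses "How many?" using the fact that k fixed vertices receive k distinct colors with probability k!/k^k.

```python
import math
import random

def colorful_min_path(adj, starts, end, k, colors):
    """Min weight of a colorful k-vertex path from some start vertex to `end`.

    adj: dict vertex -> dict neighbor -> weight (both directions stored).
    colors: dict vertex -> int in range(k).
    Returns math.inf if this coloring admits no such path.
    """
    # W[(v, S)] = min weight of a path that starts in `starts`, ends at v,
    # and uses each color in bitmask S exactly once.
    W = {(v, 1 << colors[v]): 0.0 for v in starts}
    for _ in range(k - 1):
        nxt = {}
        for (u, S), w in W.items():
            for v, w_uv in adj[u].items():
                bit = 1 << colors[v]
                if S & bit:
                    continue  # color already on the path, so skip v
                key = (v, S | bit)
                if w + w_uv < nxt.get(key, math.inf):
                    nxt[key] = w + w_uv
        W = nxt
    return W.get((end, (1 << k) - 1), math.inf)


def color_coding_min_path(adj, starts, end, k, trials, seed=0):
    """Best result over many independent random colorings."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        best = min(best, colorful_min_path(adj, starts, end, k, colors))
    return best


def trials_needed(k, eps):
    """Colorings needed so a fixed k-vertex path is missed with prob. <= eps."""
    p = math.factorial(k) / k**k  # chance the path is colorful in one coloring
    if p >= 1.0:
        return 1
    return math.ceil(math.log(eps) / math.log(1.0 - p))


# Toy network (hypothetical weights).
adj = {"s": {"a": 1.0, "b": 5.0},
       "a": {"s": 1.0, "t": 1.0, "b": 1.0},
       "b": {"s": 5.0, "t": 1.0, "a": 1.0},
       "t": {"a": 1.0, "b": 1.0}}
print(color_coding_min_path(adj, {"s"}, "t", k=4, trials=200))
print(trials_needed(4, 0.01))
```

Since k!/k^k is at least e^(-k), on the order of e^k colorings suffice for a constant success probability, and each coloring costs time proportional to 2^k times the number of edges in this sketch.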

Adding extra constraints

  • Require a protein: assign it a unique color.
  • Require a specific number of proteins from a set T: W(v, S, c) = min. weight of path … (same as above) and contains exactly c vertices in T.
  • Order constraint on proteins in path
    – Membrane proteins → transcription factors.
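The counting constraint W(v, S, c) can be illustrated by adding one more coordinate to a color-coding table. The sketch below is an assumption about how such a table might be filled for one fixed coloring, not the formulation from the lecture; the function name, the bitmask encoding of S, and the toy data are hypothetical.

```python
import math

def colorful_min_path_with_count(adj, starts, end, k, colors, T, c):
    """Min weight of a colorful k-vertex path from `starts` to `end`
    that contains exactly c vertices of the set T (math.inf if none).

    adj: dict vertex -> dict neighbor -> weight (both directions stored).
    colors: dict vertex -> int in range(k); one fixed coloring.
    """
    # W[(v, S, cnt)] = min weight of a path ending at v that uses exactly
    # the colors in bitmask S and contains cnt vertices from T.
    W = {}
    for v in starts:
        cnt = int(v in T)
        if cnt <= c:
            W[(v, 1 << colors[v], cnt)] = 0.0
    for _ in range(k - 1):
        nxt = {}
        for (u, S, cnt), w in W.items():
            for v, w_uv in adj[u].items():
                bit = 1 << colors[v]
                if S & bit:
                    continue  # color already used on this path
                new_cnt = cnt + int(v in T)
                if new_cnt > c:
                    continue  # would exceed the required count
                key = (v, S | bit, new_cnt)
                if w + w_uv < nxt.get(key, math.inf):
                    nxt[key] = w + w_uv
        W = nxt
    return W.get((end, (1 << k) - 1, c), math.inf)


# Toy network (hypothetical weights and coloring); T = {"b"}.
adj = {"s": {"a": 1.0, "b": 5.0},
       "a": {"s": 1.0, "t": 1.0, "b": 1.0},
       "b": {"s": 5.0, "t": 1.0, "a": 1.0},
       "t": {"a": 1.0, "b": 1.0}}
colors = {"s": 0, "a": 1, "b": 1, "t": 2}
for c in (0, 1, 2):
    print(c, colorful_min_path_with_count(adj, {"s"}, "t", 3, colors, {"b"}, c))
```

The extra coordinate multiplies the table size by at most c + 1, so the constraint is cheap relative to the 2^k factor already paid for the color sets.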


Adding extra constraints

Rooted trees: Rooted at v. Every leaf is in I.


Color‐coding (Alon, Yuster, & Zwick)

  • Extends to many other cases of the subgraph isomorphism problem:
    – Does a graph G have a subgraph isomorphic to graph H?
      • H = simple path of length k.
      • H = simple cycle of length k.
      • H = tree.
      • H = graph of fixed (bounded) tree-width.

Additional Problems

  • 1. Efficient querying of a network (e.g. QNET)
  • 2. Find conserved subgraphs
    – Heavy subgraphs in product graph
  • 3. Multiple network alignment

Sources

  • Kelley BP, Sharan R, Karp RM, Sittler T, Root DE, Stockwell BR, Ideker T. (2003) Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc Natl Acad Sci U S A. 100(20):11394-9.
  • Scott J, Ideker T, Karp RM, Sharan R. (2006) Efficient algorithms for detecting signaling pathways in protein interaction networks. J Comput Biol. 13(2):133-44.
  • Alon N, Yuster R, Zwick U. (1995) Color-coding. J. ACM 42, 4.