1
Applying Motion Planning Techniques to Molecular Docking
Mark Moll
Physical and Biological Computing Group Computer Science Department Rice University
2
Goal: quickly determine a ‘good fit’ between a ligand (shown in green) and a receptor.
Figure: HIV-1 protease
3
Problem: the receptor is flexible and has thousands of degrees of freedom.
Figure: molecular dynamics of HIV-1 protease, simulated in water (9360 DOF); 2 ns of simulation takes about 1 week on a cluster of 16 machines.
4
Important problem in computational drug design.
Used in virtual screening of ligand databases.
Current docking methods only allow for limited flexibility.
5
Soft potentials: modify energy function, so that ligand fits more easily in binding site
(Jiang & Kim 1991; Schnecke et al. 1998; Apostolakis et al. 1998)
Selection of a few critical degrees of freedom in receptor binding site
(Leach 1994; Leach & Lemon 1998)
Use multiple receptor conformations
(Pang & Kozikowski 1994; Knegtel et al. 1997; Sudbeck et al. 1998)
Modified molecular dynamics methods
(Di Nola et al. 1994; Mangoni et al. 1999; Nakajima et al. 1997)
Collective degrees of freedom
(Levy & Karplus 1979; Levitt et al. 1983; Garcia 1992; Teodoro et al. 2003)
6
Capturing the essential degrees of freedom
Biased expansive search using molecular energy
Evaluating search results
Simulation results
Discussion
7
Motions of atoms are not independent
Find the main modes of motion by performing Principal Component Analysis (PCA) on a molecular dynamics trajectory
Use the first few principal components to represent the essential degrees of freedom
Potential drawbacks:
Molecular motions are mainly rotations, whereas PCA is a linear decomposition
Overfitting: we do not want to model random loop fluctuations
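As a sketch of the PCA step: assume the trajectory is given as T snapshots x_t, each a vector of the 3N atomic coordinates, already superimposed on a common reference frame. The symbols x_t, C, v_j, λ_j, a_j and m are notation introduced here for illustration; they do not appear on the slides.

```latex
\bar{x} = \frac{1}{T}\sum_{t=1}^{T} x_t, \qquad
C = \frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})(x_t - \bar{x})^{\mathsf{T}}, \qquad
C\,v_j = \lambda_j v_j, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge 0

x \approx \bar{x} + \sum_{j=1}^{m} a_j v_j, \qquad
a_j = v_j^{\mathsf{T}}(x - \bar{x}), \qquad
\text{fraction of variance captured by } m \text{ components} =
\frac{\sum_{j=1}^{m}\lambda_j}{\sum_{j=1}^{3N}\lambda_j}
```

Keeping only the first m eigenvectors v_j reduces a conformation to the m coefficients a_j, which then play the role of the essential degrees of freedom. Because the reconstruction is linear in a_j, large displacements along v_j can distort bond geometry, which is the first drawback listed above.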
8
9
Figure: first essential degree of freedom
Over 60% of the variability in the data is captured by 5 degrees of freedom!
10
For systems with approximate symmetry (like HIV-1 protease), we can impose constraints to extract more robust modes of motion.
Sorensen & Shah, our collaborators in the Comp. & Applied Math. Dept., have developed a symmetry-preserving version of SVD.
Using the symmetry constraint averages out some random fluctuations, but preserves essential motions.
Robustness is verified by looking at how well other known crystal structures are approximated by the symmetry-preserving major modes.
(Very recent results; not yet used for the results presented today!)
11
(based on Hsu & Latombe’s algorithm)
Search only the conformational space of the receptor
Sample conformations along principal components ⇒ use local energy minimization to compensate for distortion
Create new conformations with an expansive search:
Randomly select a previously generated conformation
Perturb it to generate a new conformation
Search strategy should have the following properties:
Biased towards low-energy conformations
Biased towards unexplored parts of the search space
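The select-and-perturb loop can be sketched as follows. This is a minimal illustration, not the presented implementation: a conformation is assumed to be a list of coefficients along the principal components, the parent is chosen uniformly at random (the biases towards low energy and unexplored regions are left out), and `energy`, `minimize`, `step_size` and `n_iterations` are hypothetical names supplied by the caller.

```python
import random


def expansive_search(start, energy, minimize, n_iterations=100,
                     step_size=0.5, rng=random):
    """Grow a set of conformations by repeatedly perturbing earlier ones.

    start    -- initial conformation (list of principal-component coefficients)
    energy   -- hypothetical callable: conformation -> float
    minimize -- hypothetical callable: conformation -> locally minimized conformation
    """
    conformations = [list(start)]
    energies = [energy(start)]
    for _ in range(n_iterations):
        # Assumption: uniform choice; a biased choice would replace this line.
        parent = rng.choice(conformations)
        # Perturb the parent along every principal component.
        child = [a + rng.gauss(0.0, step_size) for a in parent]
        # Local minimization compensates for distortion from the linear model.
        child = minimize(child)
        conformations.append(child)
        energies.append(energy(child))
    return conformations, energies
```

With a toy quadratic energy and an identity `minimize`, `expansive_search([0.0] * 5, energy, lambda c: c)` returns the start conformation plus one new conformation per iteration.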
12
Pick conformation i with probability inversely proportional to its weight $w_i$, where
$\gamma$ = a constant controlling the sensitivity to energy,
$E_i$ = the energy of conformation i,
$E_{\min} = \min_{i=1,\dots,n} E_i$,
$c_i$ = number of times conformation i has been selected, and
$d_i$ = sum of distances to the k nearest neighbors.
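The selection rule can be illustrated with a small sketch. The weight used below is an assumption made only for this example: it is one simple form that grows with energy and with the selection count and shrinks where neighbors are far away, so that picking with probability proportional to 1/w favors low-energy, rarely selected conformations in sparse regions. It should not be read as the weight formula from the slide. The function and parameter names are hypothetical.

```python
import math
import random


def assumed_weight(energy, e_min, times_selected, knn_distance_sum, gamma=1.0):
    """Hypothetical weight (NOT the slide's formula).

    Large for high energy and often-selected conformations, small in
    sparsely sampled regions. knn_distance_sum must be positive; very large
    gamma * (energy - e_min) would overflow math.exp.
    """
    return (math.exp(gamma * (energy - e_min))
            * (times_selected + 1) / knn_distance_sum)


def selection_probabilities(energies, counts, knn_sums, gamma=1.0):
    """Probability of picking each conformation, proportional to 1 / weight."""
    e_min = min(energies)
    inverse = [1.0 / assumed_weight(e, e_min, c, d, gamma)
               for e, c, d in zip(energies, counts, knn_sums)]
    total = sum(inverse)
    return [v / total for v in inverse]


def pick_conformation(energies, counts, knn_sums, gamma=1.0, rng=random):
    """Return the index of the conformation to expand next."""
    probs = selection_probabilities(energies, counts, knn_sums, gamma)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

After each pick, the caller would increment the count of the chosen conformation so that repeated selection of the same conformation becomes progressively less likely.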
13
Instead of simply perturbing conformation q to generate $q_{new}$, we can potentially do better with a random walk:
Energy barrier at $\min(E_q + E_{rel}, E_{abs})$
Figure: energy landscape with a low-energy region between two high-energy regions; q and $q_{new}$ marked.
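A random walk under this energy barrier can be sketched as below. The slide gives only the barrier; the bounce rule here is an assumption: a step that would land above the barrier is rejected and a fresh random direction is tried on the next step. `energy`, `step_size` and `n_steps` are hypothetical names, and a conformation is again a plain list of coordinates.

```python
import random


def random_bounce_walk(q, energy, e_rel, e_abs, n_steps=10,
                       step_size=0.1, rng=random):
    """Walk from q for n_steps without crossing the energy barrier.

    The barrier is min(E_q + e_rel, e_abs), as stated on the slide.
    Assumption: a step above the barrier is rejected ("bounce") and the
    walk stays where it is for that step.
    """
    barrier = min(energy(q) + e_rel, e_abs)
    current = list(q)
    for _ in range(n_steps):
        candidate = [x + rng.gauss(0.0, step_size) for x in current]
        if energy(candidate) <= barrier:
            current = candidate  # accepted step
        # else: bounced; keep current and draw a new direction next time
    return current
```

The relative threshold keeps the walk near the energy of its starting point, while the absolute threshold caps how high it may climb regardless of where it started.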
14
Evaluation of a search is non-trivial. Need to consider several criteria:
Number of distinct low-energy conformations (low energy ≡ energy of crystal structures + small tolerance, or lower): how well are we doing within the model?
Distance to other crystal structures: how does it compare to experimental data?
Diameter of the set of conformations: is the search expansive?
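The three criteria can be computed roughly as follows. This sketch assumes each conformation is a list of (x, y, z) tuples in the same atom order and already superimposed, so RMSD is taken without an alignment step. Counting distinct conformations is done greedily, which gives a lower bound on the largest mutually separated set. All function names and the `energy_cutoff` parameter are hypothetical.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned conformations."""
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(a, b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / len(a))


def count_distinct_low_energy(confs, energies, energy_cutoff, min_rmsd=1.0):
    """Greedy count of low-energy conformations at least min_rmsd apart."""
    kept = []
    for conf, e in zip(confs, energies):
        if e <= energy_cutoff and all(rmsd(conf, k) >= min_rmsd for k in kept):
            kept.append(conf)
    return len(kept)


def distance_to_crystal(confs, crystal):
    """Smallest RMSD from any generated conformation to one crystal structure."""
    return min(rmsd(conf, crystal) for conf in confs)


def diameter(confs):
    """Largest pairwise RMSD within the set (0.0 for fewer than two members)."""
    return max((rmsd(a, b) for a, b in combinations(confs, 2)), default=0.0)
```

The diameter is quadratic in the number of conformations, so for large result sets a subsample or an approximate diameter may be needed.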
15
Tested docking program on two systems:
HIV-1 protease, 3120 atoms, ~110 crystal structures available, rotationally symmetric
FK506 binding protein, 1663 atoms, ~70 crystal structures available, very flexible
Experimental setup:
Use 5 principal components, computed from molecular dynamics simulation
Compare regular neighbor selection with random bounce walk
Vary parameters for random bounce walk: energy thresholds, number of steps
Take average over 5 runs
16
Low-energy conformations that are at least 1 Å RMSD apart:
Figure: bar chart for HIV-1 protease (4hvp) and FK binding protein (1fkr17); series: regular perturbation, 10 step random walk, 20 step random walk; vertical axis from 100 to 400. Bar labels in extracted order: 172, 361, 166, 385, 25, 40.
Results measured after 20 hours.
17
Low-energy conformations that are at least 1 Å RMSD apart, measured as a percentage of the total number of conformations generated:
Figure: bar chart for HIV-1 protease (4hvp) and FK binding protein (1fkr17); series: regular perturbation, 10 step random walk, 20 step random walk; vertical axis from 0% to 8%. Bar labels in extracted order: 6.9%, 7.2%, 6.6%, 7.7%, 0.3%, 0.4%.
Results measured after 20 hours.
18
Table: energy (cal/mol) with columns average, crystal structs., min; rows 4hvp (regular, 10 steps, 20 steps) and 1fkr17 (regular, 10 steps, 20 steps).
Energy is higher on average with random bounce walk, which is to be expected.
19
Energy-guided expansive search is effective at finding low-energy conformations
But the search is not expanding much towards crystal structures
Can potentially improve results by
using free energy (i.e., include entropic effects)
using more (weighted) principal components
20
@ Rice University
Physical & Biological Computing Group: Lydia Kavraki, David Schwartz, Allison Heath
Comp. & Applied Math. Dept.: Danny Sorensen, Mili Shah
Chemistry Department: Cecilia Clementi
Funding: NSF, Whitaker Foundation