[PPT] - Class Averaging in Cryo-Electron Microscopy Zhizhen Jane Zhao PowerPoint Presentation

SLIDE 1

Class Averaging in Cryo-Electron Microscopy

Zhizhen Jane Zhao Courant Institute of Mathematical Sciences, NYU Mathematics in Data Sciences ICERM, Brown University July 30 2015

SLIDE 2

Single Particle Reconstruction (SPR) using cryo-EM

SLIDE 3

Challenges in Cryo-EM SPR

◮ Extremely low signal to noise

ratio because of the low electron dose.

◮ Need to process a large

number of images for desired resolution.

◮ With higher resolution

camera, the number of pixels in each image becomes larger.

◮ Requires efficient and

accurate algorithms.

SLIDE 4

Geometry of the Projection Image

◮ To each projection image I there

corresponds a 3 × 3 unknown rotation matrix R describing its

rientation.

R =   | | | R1 R2 R3 | | |   satisfying RRT = RTR = I, det R = 1.

◮ The projection image lies on a

tangent plane to the two dimensional unit sphere S2 at the viewing angle v = v(R) = R3.

SLIDE 5

Algorithmic Procedure for SPR

◮ Particle selection: manual, experimental image segmentation. ◮ Class averaging: classify images with similar viewing directions,

register and average to improve their signal-to-noise ratio (SNR). Clean Ii Ij Average

◮ Orientation estimation: use common lines to estimate the

rotation matrix R associated with each projection image (or class average).

◮ 3D Reconstruction: a 3D volume is generated by a tomographic

inversion algorithm.

◮ Iterative refinement

SLIDE 6

Clustering Method (Penczek, Zhu, Frank 1996)

◮ Suppose we have n images I1, . . . , In. ◮ Rotationally Invariant Distances (RID) for each pair of images is

dRID(i, j) = min

α∈[0,2π) Ii − Rα(Ij). ◮ Cluster the images using K-means. ◮ Problems:

1. Computationally expensive to compute all pairs of dRID. Na¨

ıve implementation requires O(n2) rotational alignment of images. Rotational invariant image representation, fast algorithms for dimensionality reduction and nearest neighbor search.

2. Noise and outliers: At low SNR images with completely different

views have relatively small dRID (noise aligns well, instead of underlying signal). Fast steerable PCA, Wiener filtering, and vector diffusion maps.

SLIDE 7

Hairy Ball Theorem and Global Alignment

◮ There is no non-vanishing continuous tangent vector field on the

sphere.

◮ Cannot find αi such that αij = αi − αj. ◮ No global in-plane rotational alignment of all images.

SLIDE 8

Our Class Averaging Procedure

(Z., Singer 2014)

SLIDE 9

Comparison with Existing Methods

104 simulated projection images of 70S ribosome at different signal to noise ratio. 50 nearest neighbors. Proportion of the nearest neighbors within a small spherical cap of 18.2◦.

RFA/K-means IMAGIC Relion EMAN2 Xmipp ASPIRE SNR= 1/50 0.45 0.97 0.79 0.74 0.83 1.00 SNR= 1/100 0.09 0.87 0.70 0.45 0.68 0.99 SNR= 1/150 0.07 0.67 0.52 0.13 0.48 0.90 Timing (hrs) 1.5 7.5 16 12 42 0.5

(a) IMAGIC (b) Relion (c) EMAN2 (d) Xmipp (e) ASPIRE (f) Reference

SLIDE 10

Ab initio Reconstruction of Experimental Data Sets

40, 778 images of 70S ribosome (Courtesy Dr Joachim Frank) Image size: 250 × 250 pixels. Pixel size: 1.5 ˚ A * Movie of 70S ribosome * Ab initio model resolution: 11.5 ˚ A.

SLIDE 11

Rotational Invariant Representation: Bispectrum

◮ Rotational invariant representation of images: “bispectrum”. ◮ For a 1D periodic discrete signal f(x), bispectrum is defined as

b(k1, k2) = ˆ f(k1)ˆ f(k2)ˆ f(−(k1 + k2)).

◮ Bispectrum is shift invariant, complete and unbiased.

SLIDE 12

Rotational Invariant Representation: Bispectrum

◮ Steerable basis: uk,q(r, φ) = gq(r)eıkφ. I(r, φ) = k,q ak,quk,q(r, φ). ◮ Image rotation by α accounts for phase shift ak,q → ak,qe−ıkα. ◮ Rotational invariant image representation:

bk1,k2,q1,q2,q3 = ak1,q1ak2,,q2a−(k1+k2),q3.

◮ The dimensionality of the feature vectors b is reduced by a

randomized algorithm for low rank matrix approximation (Rokhlin, Liberty, Tygert, Martinsson, Halko, Tropp, Szlam, . . . ).

◮ Find k nearest neighbors in nearly linear time using a

randomized algorithm for approximated nearest neighbors search (Jones, Osipov, Rokhlin 2011).

◮ Find the associated optimal alignment.

SLIDE 13

Small World Graph on S2

(Singer, Z., Shkolnisky, Hadani 2011, Singer, Wu 2012)

◮ Define graph G = (V, E) by {i, j} ∈ E if i, j are nearest neighbors. ◮ Optimal rotation angles αij gives dRID. ◮ Triplet consistency: αij + αjk + αki = 0 mod 2π. ◮ Vector diffusion maps explores the consistency in optimal

rotations systematically.

◮ Non-local means with rotations.

SLIDE 14

Affinity Based on Consistency of Transformations

◮ In the VDM framework, we define the affinity between i and l by

considering all paths of length t connecting them, but instead of just summing the weights of all paths, we sum the transformations. a) Higher affinity b) lower affinity

◮ Every path from i to l may result in a different transformation

(like parallel transport due to curvature).

◮ When adding transformations of different paths, cancellations

may happen.

SLIDE 15

Initial vs VDM Classification

Noisy (SNR = 1/50) centered 104 simulated projection images of 70S

ribosome. k = 200

60 120 180 1 2 3 4 5 6 x 10

5

acosvi, vj N 5 10 15 20 25 30 2 4 6 8 10 x 10

4

acosvi, vj N

Initial Classification VDM

SLIDE 16

Steerable PCA

(Z., Singer 2013, Z., Shkolnisky, Singer 2015)

◮ Inclusion of the rotated and reflected images for PCA in some

cases is advantageous.

◮ Many efficient algorithms transform images onto polar grid.

However, non-unitary transformation from Cartesian to polar grid changes the noise statistics.

◮ A digital image I is obtained by sampling a squared-integrable

bandlimited function f on a Cartesian grid of size L × L with bandlimit c. The underlying clean image is essentially compactly supported in a disk of radius R.

◮ Suppose we have n images, we introduce an algorithm with

computational complexity O(nL3 + L4) (O(nL4 + L6) for PCA).

SLIDE 17

Steerable Principal Components in Real Domain

λ = 163.8 λ = 163.8 λ = 160.0 λ = 160.0 λ = 158.1 λ = 158.1 λ = 153.9 λ = 153.9 λ = 153.4 λ = 153.4 λ = 153.2 λ = 153.2 λ = 150.3 λ = 150.3 λ = 144.7 λ = 144.7

SLIDE 18

Running Time

R PCA FFBsPCA 30 41.7 18.9 60 1460.6 23.8 90 1.1 × 104 34.5 120 5.0 × 104 48.2 150 1.2 × 105 96.7 180 – 111.0 210 – 139.2 240 – 158.6

Table: Running time for different PCA methods (in seconds), n = 103, c = 1/2, and R varies from 30 to 240. L = 2R.

SLIDE 19

Application: Denoising Images

(a) clean (b) noisy (c) PCA (d) FFBsPCA

Figure: Denoising 105 simulated projection images from human mitochondrial large ribosomal subunit. 240 × 240 pixels. The columns show (a) clean projection images, (b) noisy projection images with SNR= 1/30, (c) denoised projection images using traditional PCA, and (d) denoised images using FFBsPCA.

SLIDE 20

Outline of the Approach

◮ Truncated Fourier-Bessel expansion by a sampling criterion

(asymptotics of Bessel functions).

◮ Efficient and accurate evaluation of the expansion coefficients

(NUFFT and a Gaussian quadrature rule).

◮ Averaging over all rotations gives block diagonal structure in the

sample covariance matrix and reduces computational complexity.

SLIDE 21

Truncated Fourier-Bessel Expansion

◮ The scaled Fourier-Bessel functions are the eigenfunctions of the

Laplacian in a disk of radius c with Dirichlet boundary condition and they are given by ψk,q

c (ξ, θ) =

Nk,qJk
Rk,q

ξ c

eıkθ,

ξ ≤ c, 0, ξ > c,

◮ To avoid spurious information from noise outside of the compact

support R in real space, we select k’s and q’s that satisfy Rk,(q+1) ≤ 2πcR.

◮ F(f) is approximated by the finite expansion

Pc,RF(f)(ξ, θ) =

kmax

k=−kmax

pk

q=1

ak,qψk,q

c (ξ, θ). ◮ The Fourier-Bessel expansion coefficients are given by

ak,q = c Nk,qJk

Rk,q

ξ c

ξ dξ

2π F(f)(ξ, θ)e−ıkθdθ.

SLIDE 22

Evaluation of Fourier-Bessel Expansion Coefficients

◮ nξ = ⌈4cR⌉ and nθ = ⌈16cR⌉. ◮ Evaluate Fourier coefficients on a

polar grid using NUFFT. O(L2 log L + nξnθ).

◮ The angular integration is sped

up by 1D FFT on the concentric

circles. O(nξnθ log nθ).

◮ It is followed by a numerical

evaluation of the radial integral with a Gaussian quadrature rule. O(nξnθ). ak,q ≈

nξ

j=1

Nk,qJk

Rk,q

ξj c

F(I)(ξj, k)ξjw(ξj).

◮ Total computational cost is O(L2 log L).

SLIDE 23

Spectrum of the T∗T

The expansion coefficients vector a = T∗I. T∗T is almost unitary.

1000 2000 3000 4000 0.9 0.92 0.94 0.96 0.98 1 1.02 i λi 3000 6000 9000 0.88 0.92 0.96 1 i λi

R = 60, c = 1/3, L = 121 R = 90, c = 1/3, L = 181 These are also the spectra of the population covariance matrices of transformed white noise images.

SLIDE 24

Sample Covariance Matrix

In the Fourier-Bessel basis the covariance matrix C of the data with all rotations and reflection is given by C(k,q),(k′,q′) = 1 4πn

n

i=1

2π

ai

k,qai k′,q′ + ai −k,qai −k′,q′

e−ı(k−k′)αdα

= δk,k′ 1 2n

n

i=1

(ai

k,qai k,q′ + ai −k,qai −k,q′).

               

◮ We project the images onto kmax orthogonal subspaces with

different angular Fourier modes, C = kmax

k=0 C(k). ◮ The block size pk decreases as the angular frequency k increases. ◮ Computational complexity: O(nL3 + L4).

SLIDE 25

Selecting Steerable PCA components

◮ Steerable PCA basis and the associated expansion coefficients ck,l

are linear combinations of the Fourier-Bessel basis and the associated expansion coefficients using the eigenvectors of C(k)’s.

◮ For images corrupted by additive white Gaussian noise, given a

good estimation of the noise variance ˆ σ2, we select the components by the following criteria: for the (k)th block of the sample covariance matrix eigenvalues λ(k)

l ,

λ(k)

l

> ˆ σ2(1 + √γk)2 where γk =

pk (2−δk,0)n. ◮ The steerable PCA expansion coefficients are denoised by

Wiener-type filtering (Singer, Wu 2013).

SLIDE 26

Summary

◮ Our class averaging procedure is an efficient and accurate

computational framework for large cryo-EM image data set. The whole pipeline is almost linear with the number of images.

◮ Fast steerable PCA is used to efficiently compress and denoise

images.

◮ Rotationally invariant image representation and efficient

dimensionality reduction method generate features for fast nearest neighbor search.

◮ VDM uses the consistency of linear transformations to get better

estimation of nearest neighbors and alignment parameters.

◮ Methods described here can be applied to many other

applications, such as X-ray free electron Laser (XFEL) single particle reconstruction.

SLIDE 27

References

◮ Z. Zhao, Y. Shkolnisky, A. Singer, “Fast Steerable Principal

Component Analysis”, Submitted, Available at http://arxiv.org/abs/1412.0781.

◮ Z. Zhao, A. Singer, “Rotationally Invariant Image Representation

for Viewing Direction Classification in Cryo-EM”, Journal of Structural Biology, 186 (1), pp. 153-166 (2014).

◮ Z. Zhao, A. Singer, “Fourier-Bessel Rotational Invariant

Eigenimages”, The Journal of the Optical Society of America A, 30 (5), pp. 871-877 (2013).

◮ A. Singer, H.-T. Wu, “Vector Diffusion Maps and the Connection

Laplacian”, Communications on Pure and Applied Mathematics, 65 (8), pp. 1067-1144 (2012).

◮ A. Singer, Z. Zhao, Y. Shkolnisky, R. Hadani, “Viewing Angle

Classification of Cryo-Electron Microscopy Images using Eigenvectors”, SIAM Journal on Imaging Sciences, 4 (2), pp. 723-759 (2011).

SLIDE 28

ASPIRE: Algorithms for Single Particle Reconstruction

Open source toolbox: spr.math.princeton.edu

SLIDE 29

Class Averaging in Cryo-Electron Microscopy

Zhizhen Jane Zhao Courant Institute of Mathematical Sciences, NYU Mathematics in Data Sciences ICERM, Brown University July 30 2015

Single Particle Reconstruction (SPR) using cryo-EM

Challenges in Cryo-EM SPR

ratio because of the low electron dose.

number of images for desired resolution.

camera, the number of pixels in each image becomes larger.

accurate algorithms.

Geometry of the Projection Image

corresponds a 3 × 3 unknown rotation matrix R describing its

R =   | | | R1 R2 R3 | | |   satisfying RRT = RTR = I, det R = 1.

tangent plane to the two dimensional unit sphere S2 at the viewing angle v = v(R) = R3.

Algorithmic Procedure for SPR

register and average to improve their signal-to-noise ratio (SNR). Clean Ii Ij Average

rotation matrix R associated with each projection image (or class average).

inversion algorithm.

Clustering Method (Penczek, Zhu, Frank 1996)

dRID(i, j) = min

ıve implementation requires O(n2) rotational alignment of images. Rotational invariant image representation, fast algorithms for dimensionality reduction and nearest neighbor search.

views have relatively small dRID (noise aligns well, instead of underlying signal). Fast steerable PCA, Wiener filtering, and vector diffusion maps.

Hairy Ball Theorem and Global Alignment

sphere.

Our Class Averaging Procedure

(Z., Singer 2014)

Comparison with Existing Methods

104 simulated projection images of 70S ribosome at different signal to noise ratio. 50 nearest neighbors. Proportion of the nearest neighbors within a small spherical cap of 18.2◦.

(a) IMAGIC (b) Relion (c) EMAN2 (d) Xmipp (e) ASPIRE (f) Reference

Ab initio Reconstruction of Experimental Data Sets

40, 778 images of 70S ribosome (Courtesy Dr Joachim Frank) Image size: 250 × 250 pixels. Pixel size: 1.5 ˚ A ******************************************* Movie of 70S ribosome ******************************************* Ab initio model resolution: 11.5 ˚ A.

Rotational Invariant Representation: Bispectrum

b(k1, k2) = ˆ f(k1)ˆ f(k2)ˆ f(−(k1 + k2)).

Rotational Invariant Representation: Bispectrum

bk1,k2,q1,q2,q3 = ak1,q1ak2,,q2a−(k1+k2),q3.

randomized algorithm for low rank matrix approximation (Rokhlin, Liberty, Tygert, Martinsson, Halko, Tropp, Szlam, . . . ).

randomized algorithm for approximated nearest neighbors search (Jones, Osipov, Rokhlin 2011).

Small World Graph on S2

(Singer, Z., Shkolnisky, Hadani 2011, Singer, Wu 2012)

rotations systematically.

Affinity Based on Consistency of Transformations

considering all paths of length t connecting them, but instead of just summing the weights of all paths, we sum the transformations. a) Higher affinity b) lower affinity

(like parallel transport due to curvature).

may happen.

Initial vs VDM Classification

Noisy (SNR = 1/50) centered 104 simulated projection images of 70S

Initial Classification VDM

Steerable PCA

(Z., Singer 2013, Z., Shkolnisky, Singer 2015)

cases is advantageous.

However, non-unitary transformation from Cartesian to polar grid changes the noise statistics.

bandlimited function f on a Cartesian grid of size L × L with bandlimit c. The underlying clean image is essentially compactly supported in a disk of radius R.

computational complexity O(nL3 + L4) (O(nL4 + L6) for PCA).

Steerable Principal Components in Real Domain

λ = 163.8 λ = 163.8 λ = 160.0 λ = 160.0 λ = 158.1 λ = 158.1 λ = 153.9 λ = 153.9 λ = 153.4 λ = 153.4 λ = 153.2 λ = 153.2 λ = 150.3 λ = 150.3 λ = 144.7 λ = 144.7

Running Time

R PCA FFBsPCA 30 41.7 18.9 60 1460.6 23.8 90 1.1 × 104 34.5 120 5.0 × 104 48.2 150 1.2 × 105 96.7 180 – 111.0 210 – 139.2 240 – 158.6

Table: Running time for different PCA methods (in seconds), n = 103, c = 1/2, and R varies from 30 to 240. L = 2R.

Application: Denoising Images

(a) clean (b) noisy (c) PCA (d) FFBsPCA

Outline of the Approach

(asymptotics of Bessel functions).

(NUFFT and a Gaussian quadrature rule).

sample covariance matrix and reduces computational complexity.

Truncated Fourier-Bessel Expansion

Laplacian in a disk of radius c with Dirichlet boundary condition and they are given by ψk,q

ξ ≤ c, 0, ξ > c,

support R in real space, we select k’s and q’s that satisfy Rk,(q+1) ≤ 2πcR.

Pc,RF(f)(ξ, θ) =

ak,qψk,q

ak,q = c Nk,qJk

ξ c

2π F(f)(ξ, θ)e−ıkθdθ.

Evaluation of Fourier-Bessel Expansion Coefficients

polar grid using NUFFT. O(L2 log L + nξnθ).

up by 1D FFT on the concentric

evaluation of the radial integral with a Gaussian quadrature rule. O(nξnθ). ak,q ≈

Nk,qJk

ξj c

Spectrum of the T∗T

The expansion coefficients vector a = T∗I. T∗T is almost unitary.

R = 60, c = 1/3, L = 121 R = 90, c = 1/3, L = 181 These are also the spectra of the population covariance matrices of transformed white noise images.

40, 778 images of 70S ribosome (Courtesy Dr Joachim Frank) Image size: 250 × 250 pixels. Pixel size: 1.5 ˚ A * Movie of 70S ribosome * Ab initio model resolution: 11.5 ˚ A.