A compact Arnoldi algorithm for polynomial eigenvalue problems - - PowerPoint PPT Presentation

a compact arnoldi algorithm for polynomial eigenvalue
SMART_READER_LITE
LIVE PREVIEW

A compact Arnoldi algorithm for polynomial eigenvalue problems - - PowerPoint PPT Presentation

A compact Arnoldi algorithm for polynomial eigenvalue problems Yangfeng Su, Junyi Zhang, Zhaojun Bai School of Mathematical Sciences Fudan University January, 2008, Taiwan RANMEP2008 Goals Polynomial Eigenvalue Problems (PEPs): A 0


slide-1
SLIDE 1

A compact Arnoldi algorithm for polynomial eigenvalue problems

Yangfeng Su, Junyi Zhang, Zhaojun Bai

School of Mathematical Sciences Fudan University

January, 2008, Taiwan RANMEP2008

slide-2
SLIDE 2

Goals

Polynomial Eigenvalue Problems (PEPs):

  • A0 + λA1 + ... + λdAd
  • x = 0

(λ, x) is called an eigenpair. d=1, SEP,GEP d=2, QEP ( Quadratic Eigenvalue Problem) Goals Solving large scale polynomial eigenvalue problems with Implicitly Restarted Arnoldi (IRA) algorithm with less memory.

slide-3
SLIDE 3

Goals

Polynomial Eigenvalue Problems (PEPs):

  • A0 + λA1 + ... + λdAd
  • x = 0

(λ, x) is called an eigenpair. d=1, SEP,GEP d=2, QEP ( Quadratic Eigenvalue Problem) Goals Solving large scale polynomial eigenvalue problems with Implicitly Restarted Arnoldi (IRA) algorithm with less memory.

slide-4
SLIDE 4

Outline

1 Background 2 Algorithm 3 Numerical Comparison

slide-5
SLIDE 5

Outline

1 Background 2 Algorithm 3 Numerical Comparison

slide-6
SLIDE 6

Linearization

Linearization ⇒ A larger GEP:           A0 I ... I      − λ      −A1 −A2 . . . −Ad I ... I                x λx . . . λd−1x      = 0 Sleijpen, van der Vorst, and van Gijzen. Quadratic eigenproblems are no problem, SIAM News, 1996 Can be solved with ARPACK Not true: structure is neither used nor preserved. size and therefore memory are d times! This work: explore the structure to save memory in ARPACK. It is a byproduct of SOAR [Bai & S. SIMAX 2005].

slide-7
SLIDE 7

Linearization

Linearization ⇒ A larger GEP:           A0 I ... I      − λ      −A1 −A2 . . . −Ad I ... I                x λx . . . λd−1x      = 0 Sleijpen, van der Vorst, and van Gijzen. Quadratic eigenproblems are no problem, SIAM News, 1996 Can be solved with ARPACK Not true: structure is neither used nor preserved. size and therefore memory are d times! This work: explore the structure to save memory in ARPACK. It is a byproduct of SOAR [Bai & S. SIMAX 2005].

slide-8
SLIDE 8

Linearization

Linearization ⇒ A larger GEP:           A0 I ... I      − λ      −A1 −A2 . . . −Ad I ... I                x λx . . . λd−1x      = 0 Sleijpen, van der Vorst, and van Gijzen. Quadratic eigenproblems are no problem, SIAM News, 1996 Can be solved with ARPACK Not true: structure is neither used nor preserved. size and therefore memory are d times! This work: explore the structure to save memory in ARPACK. It is a byproduct of SOAR [Bai & S. SIMAX 2005].

slide-9
SLIDE 9

Linearization

Linearization ⇒ A larger GEP:           A0 I ... I      − λ      −A1 −A2 . . . −Ad I ... I                x λx . . . λd−1x      = 0 Sleijpen, van der Vorst, and van Gijzen. Quadratic eigenproblems are no problem, SIAM News, 1996 Can be solved with ARPACK Not true: structure is neither used nor preserved. size and therefore memory are d times! This work: explore the structure to save memory in ARPACK. It is a byproduct of SOAR [Bai & S. SIMAX 2005].

slide-10
SLIDE 10

Arnoldi decomposition

An Arnoldi decomposition of order-j: OPQj = QjHj + hj+1,jqj+1eT

j

where OP: a matrix Qj = [q1, q2, . . . , qj]: orthonormal Hj: upper Hessenberg Arnoldi process is used to compute the Arnoldi decomposition with numerical stability

slide-11
SLIDE 11

Implicitly Restarted Arnoldi (IRA) for SEP

Sorensen [SIMAX92] Given an Arnoldi decomposition of order-p, OPQp = QpHp + βpqp+1eT

p .

1 Extend Arnoldi decomposition from order p to order k:

AQk = QkHk + βkqk+1eT

k .

2 Divide the eigenvalues of Hk as “good” ones µ1, . . . , µp and

“bad” ones µp+1, . . . , µk.

3 For Hk do implicitly QR steps with shifts µk+1, . . . , µk, get

Hk = U ˜ HkUH

4 Take first p columns of

AQkU = QkU ˜ Hk + βkqk+1eT

k U

as a restarted Arnoldi decomposition of order p.

slide-12
SLIDE 12

ARPACK

ARPACK is an implementation of IRA algorithm a well-coded, well-documented package produced by Lehoucq, Sorensen and Yang during 1992-1997 used in MATLAB as eigs and arpackc

slide-13
SLIDE 13

IRA for QEP

For simplicity, we only discuss QEPs. For QEP: (λ2M + λD + K)x = 0

1 shift-and-invert: for shift σ, let λ = σ + 1/µ

(µ2I − µA − B)x = 0 where A = −(σ2M + σD + K)−1(2σM + D) B = −(σ2M + σD + K)−1M

2 linearize

A B I µx x

  • = µ

µx x

  • 3 apply IRA
slide-14
SLIDE 14

IRA for QEP

Easy use of ARPACK How to utilize the Frobenius structure to save memory?

slide-15
SLIDE 15

Outline

1 Background 2 Algorithm 3 Numerical Comparison

slide-16
SLIDE 16

Arnoldi decomposition for QEP

An Arnoldi decomposition with order-j OPQj = Qj+1 Hj For QEP: A B I Qj,1 Qj,2

  • =

Qj+1,1 Qj+1,2

  • Hj

Since Qj,1 = Qj+1,2 Hj we have Theorem rank ([Qj,1, Qj,2]) ≡ rj ≤ j + 1 Observed by many people, e.g. Meerbergen [SIMAX06], Bai & S. [SIMAX05]. The key is how to use it with numerical stability.

slide-17
SLIDE 17

Arnoldi Decomposition for QEP

Theorem rank ([Qj,1, Qj,2]) ≡ rj ≤ j + 1 Let Vj ∈ C n×rj = orth[Qj,1, Qj,2] then Qj = Qj,1 Qj,2

  • =

VjRj,1 VjRj,2

  • =

Vj Vj Rj,1 Rj,2

  • Two levels of orthonormality:

Vj is orthonormal Rj,1 Rj,2

  • is orthonormal
slide-18
SLIDE 18

Compact ARnoldi Decomposition (CARD)

Compact ARnoldi Decomposition (CARD) OP Vj Vj Rj,1 Rj,2

  • =

Vj+1 Vj+1 Rj+1,1 Rj+1,2

  • Hj

Vj ∈ C n×rj Rj ∈ C rj×j j ≤ rj+1 ≤ j + 1 Memory cost: Arnoldi: 2n(j + 1) ( for PEPs: dn(j + 1)) CARD: nrj+1 ≤ n(j + 2) (for PEPs: ≤ n(j + d + 1))

slide-19
SLIDE 19

CARD process

CARD process is to compute the CARD with numerical stability! CARD of order j: OP VR1 VR2

  • =

VR1 VR2

  • H + βqe1 =

ˆ V ˆ R1 ˆ V ˆ R2

  • ˆ

H Expand it to a CARD of order j + 1 ⇒ next two pages

slide-20
SLIDE 20

Expand CARD process

1 compute q1 = Aq1 + Bq2; 2 decompose q1 = ˆ

V x + vα with MGS x = ˆ V Tq1, v = q1 − ˆ V x, α = v, v = v/α

3 update

ˆ V =

  • ˆ

V , v

  • , rj+1 = rj + 1,

ˆ R1 = ˆ R1 x α

  • , ˆ

R2 = ˆ R2 ˆ R2 (:, j + 1)

slide-21
SLIDE 21

Expand CARD process

ˆ R1 ˆ R2

  • :=

    ˆ R1 x α ˆ R2 ˆ R1 (:, j + 1)    

4 decompose with MGS:

ˆ R1 (:, j + 2) ˆ R2 (:, j + 2)

  • ld

= ˆ R1 (:, 1 : j + 1) ˆ R2 (:, 1 : j + 1)

  • H1:j+1,j+1

+ ˆ R1 (:, j + 2) ˆ R2 (:, j + 2)

  • new

Hj+2,j+1

slide-22
SLIDE 22

5 update the current Arnoldi vector q:

q = q1 q2

  • q1 = ˆ

V ˆ Rj+1,1 [:, j + 2] q2 = ˆ V ˆ Rj+1,2 [:, j + 2] Only GMS (with re-orthogonalization), no inversion CARD is numerically stable!

slide-23
SLIDE 23

IRA with CARD

Given a CARD of k-order: OP VR1 VR2

  • =

ˆ V ˆ R1 ˆ V ˆ R2 H βeT

k

  • IRA does (m − p) QR steps on H with shifts µp+1, ..., µm, i.e.

H = U HUH Then OP VR1 VR2

  • U =

ˆ V ˆ R1 ˆ V ˆ R2 U ˜ H βeT

k U

  • Denote

ˆ U = U 1

slide-24
SLIDE 24

IRA with CARD

Then OP VR1U VR2U

  • =

ˆ V ˆ R1 ˆ U ˆ V ˆ R2 ˆ U ˜ H βeT

k U

  • Its first p columns , denoted by

OP VkR1,p VkR2,p

  • =

Vk+1R1,p+1 Vk+1R2,p+1 Hp ˜ βeT

p

  • still form an Arnoldi decomposition of order p

However, the Vk has rk (instead of rp) columns, it is not a CARD!

slide-25
SLIDE 25

IRA with CARD

Since Vk+1R1,p+1 Vk+1R2,p+1

  • is the orthonormal factor of an Arnoldi decomposition, from

previous theorem, rank[Vk+1R1,p+1, Vk+1R2,p+1] = rank[R1,p+1, R2,p+1] = rp+1 ≤ p + 2, we have a compact SVD: [R1,p+1, R2,p+1] = Prk+1×rp+1Σrp+1×rp+1[G rp+1×(p+1)

1

, G rp+1×(p+1)

2

] ≡ P[R1, R2]

slide-26
SLIDE 26

IRA with CARD

Therefore,

  • V n×rk+1

k+1

R1,p+1 Vk+1R2,p+1

  • =

(Vk+1P)n×rp+1R1 (Vk+1P)R2

  • The Arnoldi decomposition is expressed in CARD again!

This process can also be implemented by a compact “QR” decomposition, which is similar with the compact Arnoldi decomposition (CARD). Details omitted.

slide-27
SLIDE 27

POLYAR

POLYAR: modified ARPARK for polynomial eigenvalue problems (not only QEPs)

1 znaitr p: compute CARD 2 znapps p: IRA with CARD;

use LAPACK routine zgesdd to compute SVD decomposition

3 znaupd p, znaupd2 p, zgetv0 p:

slightly revised (arguments, storage)

4 zgemip(added): compute inner product in compact form

slide-28
SLIDE 28

Outline

1 Background 2 Algorithm 3 Numerical Comparison

slide-29
SLIDE 29

Example 1: A random QEP

Problem: QEP (pdeg=2) Size: n=500 Environment: PC (EMS memory: 512M) Randomized Matrix M,D,K (each matrix have about 24,000 non-zero elements.) We choose shift σ = 1 and use shift-invert mode to compute eigenvalues close to 1 LU factorization of

  • Mσ2 + Dσ + K
  • , L and U contain about

120,000 non-zero elements. To compute 8 eigenvalues close to 1; Use 30 Arnoldi base vectors, says, 31 CARD base vectors.

slide-30
SLIDE 30

A random QEP

Computed eigenvalues: ARPACK POLYAR Real Imag Real Imag 1

1.02817D+00 1.38768D−01 1.02817D+00 1.38768D-01

2

1.11582D+00 3.61818D-02 1.11582D+00 3.61818D-02

3

1.11582D+00

  • 3.61818D-02

1.11582D+00

  • 3.61818D-02

4

1.05613D+00

  • 1.05380D-01

1.05613D+00

  • 1.05380D-01

5

1.05613D+00 1.05380D-01 1.05613D+00 1.05380D-01

6

9.34692D-01

  • 2.73028D-15

9.34692D-01 4.39496D-15

7

1.00023D+00 5.87804D-02 1.00023D+00 5.87804D-02

8

1.00023D+00

  • 5.87804D-02

1.00023D+00

  • 5.87804D-02
slide-31
SLIDE 31

A random QEP

Storage comparison: ARPACK POLYAR V 500 × 2 × 30 500 × 31 workd 3 × 2 × 500 (2 + 2) × 500 Resid 500 × 2 500 × 2

slide-32
SLIDE 32

A random QEP

Iteration and time comparison: ARPACK POLYAR update iteration 4 4 OP × x operations 86 86 reorthogonalization of V 84 84 reorthogonalization of R 84 user’s OP × x operations 1.375000 1.250000 naupd2 1.609375 1.515625 basic Arnoldi iteration loop 1.531250 1.437500 reorthogonalization phrase 0.093750 0.046875 Hessenberg eig subproblem 0.031250 0.031250 applying the shifts 0.046875 0.046875 calling gesdd 0.000000

slide-33
SLIDE 33

Example 2: A QEP from SLAC

Problem: QEP (pdeg=2) Size: n=5384 Environment: PC (EMS memory: 512M) Genuine Matrix M,D,K (M,D,K have 61425,1183,61425 non-zero elements respectively.) We choose shift σ = −10i and use shift-invert mode to compute eigenvalues close to −10i LU factorization of

  • Mσ2 + Dσ + K
  • , L and U have 749610

and 780229 non-zero elements. To compute 8 eigenvalues close to −10i; Use 30 Arnoldi base vectors, says, 31 CARD base vectors.

slide-34
SLIDE 34

A QEP from SLAC

ARPACK POLYAR Real Imag Real Imag 1

  • 8.17408D-13
  • 1.11248D+01

1.78859D-13

  • 1.11248D+01

2

  • 1.80351D-13
  • 1.10381D+01

1.48935D-13

  • 1.10381D+01

3

3.43204D-12

  • 8.96999D+00

1.85159D-12

  • 8.96999D+00

4

8.92409D-12

  • 1.09810D+01

2.12829D-12

  • 1.09810D+01

5

1.19634D-12

  • 1.07600D+01

2.63738D-13

  • 1.07600D+01

6

  • 1.30082D-13
  • 9.73674D+00
  • 4.44236D-14
  • 9.73674D+00

7

3.83685D-15

  • 9.79517D+00

1.48931D-15

  • 9.79517D+00

8

2.27178D-15

  • 1.00606D+01
  • 3.76020D-15
  • 1.00606D+01
slide-35
SLIDE 35

A QEP from SLAC

Storage comparison: ARPACK POLYAR V 5000 × 2 × 30 5000 × 31 workd 3 × 2 × 5000 (2 + 2) × 5000 Resid 5000 × 2 5000 × 2

slide-36
SLIDE 36

A QEP from SLAC

Time comparison: ARPACK POLYAR update iteration 2 2 OP × x operations 48 48 reorthogonalization of V 44 44 reorthogonalization of R 47 user’s OP × x operations 4.078125 4.093750 naupd2 5.343750 5.171875 basic Arnoldi iteration loop 5.203125 5.062500 reorthogonalization phrase 0.593750 0.250000 Hessenberg eig subproblem 0.015625 0.015625 applying the shifts 0.109375 0.078125 calling gesdd 0.000000

slide-37
SLIDE 37

Thank you for your attention!