SLIDE 1

LEAST SQUARES

Lesson 16

SLIDES 2–4

  • Let
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$$
be an $n \times n$ matrix with complex entries (i.e., $A \in \mathbb{C}^{n \times n}$). It is helpful to view $A$ as a row vector whose columns are in $\mathbb{C}^n$: $A = (a_1 \mid \cdots \mid a_n)$
  • Recall that if $A$ is nonsingular, then we can always solve the linear system $Ac = b$, for
$$b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \quad c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \in \mathbb{C}^n$$
  • What if $A$ is singular? Can we find $c$ so that $Ac$ is "close" to $b$?
  • In other words, for $c = (c_1, \ldots, c_n)^\top$, we want
$$c_1 a_1 + \cdots + c_n a_n \approx b$$

SLIDES 5–6

  • More generally, let $A \in \mathbb{C}^{m \times n}$: $A = (a_1 \mid \cdots \mid a_n)$ for $a_k \in \mathbb{C}^m$
  • Can we numerically compute $c_1, \ldots, c_n$ so that
$$c_1 a_1 + \cdots + c_n a_n \approx b$$
  • More precisely, we find $c$ such that $\|Ac - b\|_2$ takes its minimal value
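A minimal numerical sketch of this minimization (the matrix and right-hand side here are hypothetical; NumPy's `np.linalg.lstsq` is one standard routine that computes the minimizer):

```python
import numpy as np

# Hypothetical overdetermined system: m = 4 equations, n = 2 unknowns,
# so an exact solution of A c = b need not exist.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([1.1, 1.9, 3.2, 3.9])

# lstsq returns the c minimizing ||A c - b||_2.
c, *_ = np.linalg.lstsq(A, b, rcond=None)
print(c, np.linalg.norm(A @ c - b))
```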

SLIDES 7–11

  • Let's review the real case: $A \in \mathbb{R}^{m \times n}$, $c \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$
  • Minimizing $\|Ac - b\|$ is equivalent to minimizing $\|Ac - b\|^2$
  • We simplify
$$\|Ac - b\|^2 = (Ac - b)^\top (Ac - b) = \|Ac\|^2 - (Ac)^\top b - b^\top Ac + \|b\|^2 = c^\top A^\top A c - 2 c^\top A^\top b + \|b\|^2$$
  • We can heuristically assume that the minimum is a stationary point of this expression; i.e., we want
$$0 = \nabla_c \|Ac - b\|^2$$
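Setting this gradient to zero gives the normal equations $A^\top A c = A^\top b$. A quick sketch with hypothetical random data, checking that solving them reproduces the `lstsq` minimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # full column rank with probability 1
b = rng.standard_normal(6)

# Normal equations: solve (A^T A) c = A^T b (no explicit inverse needed).
c_normal = np.linalg.solve(A.T @ A, A.T @ b)
c_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(c_normal, c_lstsq))  # True
```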

SLIDES 12–21

Theorem: Suppose $A$ has linearly independent columns. The vector $c = (A^\top A)^{-1} A^\top b$ is the unique minimizer of $\|Ac - b\|$.

Proof:
  • We first remark that $A^\top A$ is positive definite, i.e., $x^\top A^\top A x > 0$ for all (real) $x \neq 0$. (Why?)
  • Minimizing $\|Ac - b\|$ is equivalent to minimizing $\|Ac - b\|^2$.
  • For all $x$, we have
$$\begin{aligned}
\|A(c + x) - b\|^2 &= (c + x)^\top A^\top A (c + x) - 2 (c + x)^\top A^\top b + \|b\|^2 \\
&= c^\top A^\top A (c + x) + x^\top A^\top A (c + x) - 2 (c + x)^\top A^\top b + \|b\|^2 \\
&= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top A c - 2 (c + x)^\top A^\top b + \|b\|^2 \\
&= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top A (A^\top A)^{-1} A^\top b - 2 (c + x)^\top A^\top b + \|b\|^2 \\
&= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top b - 2 (c + x)^\top A^\top b + \|b\|^2 \\
&= x^\top A^\top A x + c^\top A^\top A c - 2 c^\top A^\top b + \|b\|^2
\end{aligned}$$

The last three terms are independent of $x$! Minimized when $x$ is zero!
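A numerical illustration of this conclusion (hypothetical data): any nonzero perturbation $x$ of $c = (A^\top A)^{-1} A^\top b$ strictly increases the residual norm.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

c = np.linalg.solve(A.T @ A, A.T @ b)   # c = (A^T A)^{-1} A^T b
best = np.linalg.norm(A @ c - b)

# Every nonzero perturbation x strictly increases the residual norm.
for _ in range(5):
    x = 1e-3 * rng.standard_normal(3)
    assert np.linalg.norm(A @ (c + x) - b) > best
print("all perturbed residuals exceed", best)
```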

SLIDE 22

GENERAL INNER PRODUCT SPACES

SLIDE 23

  • Consider a row vector of elements $a_1, \ldots, a_n \in V$:
$$A = (a_1 \mid \cdots \mid a_n)$$
  • We can associate with $A$ a Gram matrix
$$K = \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix}$$
  • In the case where $V = \mathbb{R}^m$, the Gram matrix is precisely the matrix we used in least squares: $K = A^\top A$
  • In the case where $V = \mathbb{C}^m$, we get the similar $K = A^\star A$ (checked numerically in the sketch below)
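A small sketch, assuming the standard Euclidean inner product on $\mathbb{C}^m$ and hypothetical random data, confirming that the entrywise Gram matrix agrees with $A^\star A$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

def inner(u, v):
    # Euclidean inner product on C^m, conjugate-linear in the first slot.
    return np.vdot(u, v)

n = A.shape[1]
K = np.array([[inner(A[:, i], A[:, j]) for j in range(n)] for i in range(n)])
print(np.allclose(K, A.conj().T @ A))  # True: K = A* A
```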

SLIDES 24–25

Theorem: The Gram matrix $K$ is Hermitian: $K^\star = K$.

Proof:
  • Follows from the fact that $\langle u, v \rangle = \overline{\langle v, u \rangle}$:
$$K^\star = \overline{\begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_n, a_1 \rangle \\ \vdots & \ddots & \vdots \\ \langle a_1, a_n \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix}} = \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix} = K$$
SLIDES 26–30

Theorem: Let $A = (a_1 \mid \cdots \mid a_n)$ and $K$ denote the associated Gram matrix. For $x, y \in \mathbb{C}^n$ we have $\langle Ay, Ax \rangle = y^\star K x$.

Proof:
$$\begin{aligned}
y^\star K x &= y^\star \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix} x
= y^\star \begin{pmatrix} \langle a_1,\, a_1 x_1 + \cdots + a_n x_n \rangle \\ \vdots \\ \langle a_n,\, a_1 x_1 + \cdots + a_n x_n \rangle \end{pmatrix} \\
&= \langle a_1 y_1 + \cdots + a_n y_n,\; a_1 x_1 + \cdots + a_n x_n \rangle = \langle Ay, Ax \rangle
\end{aligned}$$
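A numerical spot-check of this identity under the Euclidean inner product (hypothetical random data; `np.vdot` computes $u^\star v$):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
K = A.conj().T @ A   # Gram matrix under the Euclidean inner product

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <Ay, Ax> with <u, v> = u* v, versus y* K x.
print(np.allclose(np.vdot(A @ y, A @ x), y.conj() @ K @ x))  # True
```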

SLIDES 31–33

Theorem: The Gram matrix $K$ is positive semi-definite. If the vectors $a_1, \ldots, a_n$ are linearly independent, then the Gram matrix is positive definite.

Proof:
  • $x^\star K x = \langle Ax, Ax \rangle = \|Ax\|^2 \geq 0$
  • Linear independence of $a_1, \ldots, a_n$ shows that $Ax = a_1 x_1 + \cdots + a_n x_n = 0$ if and only if $x = 0$
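Numerically, positive definiteness of a Gram matrix can be confirmed via its eigenvalues or a Cholesky factorization; a sketch with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))   # columns independent with probability 1
K = A.T @ A                       # real Gram matrix

print(np.linalg.eigvalsh(K).min() > 0)  # all eigenvalues positive
L = np.linalg.cholesky(K)               # succeeds iff K is positive definite
print(np.allclose(L @ L.T, K))
```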

SLIDES 34–35

Two more useful relationships:

$$\langle Ax, b \rangle = \Big\langle \sum_{k=1}^n x_k a_k,\; b \Big\rangle = \sum_{k=1}^n \bar{x}_k \langle a_k, b \rangle = x^\star \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}$$

$$\langle b, Ax \rangle = \Big\langle b,\; \sum_{k=1}^n x_k a_k \Big\rangle = \sum_{k=1}^n x_k \langle b, a_k \rangle = \sum_{k=1}^n x_k \overline{\langle a_k, b \rangle} = \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}^{\star} x$$

SLIDE 36

Theorem: Suppose the columns of $A = (a_1 \mid \cdots \mid a_n)$ are linearly independent in an inner product space $V$. Let $K$ be the associated Gram matrix. The vector
$$c = K^{-1} \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}$$
is the unique minimizer of $\|Ac - b\|$ (where the norm is the norm associated with the inner product).

Proof: Exercise
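To see the theorem outside $\mathbb{C}^m$, here is an illustrative sketch (my own example, not from the slides): approximate $b(t) = e^t$ on $[0, 1]$ by a quadratic, using the $L^2$ inner product $\langle f, g \rangle = \int_0^1 f(t)\,g(t)\,dt$. The Gram matrix of the monomial basis is the Hilbert matrix.

```python
import numpy as np
from scipy.integrate import quad

# Basis a_k(t) = t^k, k = 0, 1, 2; target b(t) = e^t.
basis = [lambda t, k=k: t**k for k in range(3)]
b = np.exp

def inner(f, g):
    # Real L^2 inner product on [0, 1].
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Gram matrix K_{ij} = <a_i, a_j> = 1 / (i + j + 1): the 3x3 Hilbert matrix.
K = np.array([[inner(ai, aj) for aj in basis] for ai in basis])
rhs = np.array([inner(ak, b) for ak in basis])

c = np.linalg.solve(K, rhs)   # c = K^{-1} (<a_k, b>)_k
print(c)  # coefficients of the best quadratic L^2 approximation to e^t
```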

SLIDE 37

ORTHONORMAL VECTORS AND CALCULATING LEAST SQUARES APPROXIMATIONS

SLIDE 38

  • A set of nonzero vectors $v_1, \ldots, v_n$ is called orthogonal if $\langle v_i, v_j \rangle = 0$ whenever $i \neq j$.
  • They are called orthonormal if they are orthogonal and all vectors are of unit norm: $1 = \|v_i\|$, or equivalently, $\langle v_i, v_i \rangle = 1$.
  • The Gram matrix of orthonormal vectors is the identity!
$$K = \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \ddots & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix} = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix} = I$$

SLIDE 39

ORTHOGONAL AND UNITARY MATRICES

SLIDE 40

  • A matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal provided that
$$Q^\top Q = I$$
  • A matrix $Q \in \mathbb{C}^{n \times n}$ is unitary provided that
$$Q^\star Q = I$$
Note that every orthogonal matrix is also unitary, but not vice-versa
  • In other words, $Q^{-1} = Q^\star$.
  • Every unitary matrix also satisfies $Q Q^\star = I$ (a sketch checking these properties follows below)
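A quick check of these properties (one convenient way to obtain a unitary $Q$ is the Q factor of a QR decomposition of a hypothetical random complex matrix):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)   # the Q factor of a QR decomposition is unitary

I = np.eye(4)
print(np.allclose(Q.conj().T @ Q, I))             # Q* Q = I
print(np.allclose(Q @ Q.conj().T, I))             # Q Q* = I
print(np.allclose(np.linalg.inv(Q), Q.conj().T))  # Q^{-1} = Q*
```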

SLIDES 41–42

  • Multiplying a vector by a unitary matrix does not change the 2-norm:
$$\|Qv\|^2 = (Qv)^\star (Qv) = v^\star Q^\star Q v = v^\star v = \|v\|^2$$
  • Thus we view unitary matrices as generalizations of rotations and reflections. Exercise: show that every rotation and reflection in $\mathbb{R}^2$ corresponds to an orthogonal matrix
  • This means that, for least squares, $\|Av - b\|$ is minimal if and only if $\|QAv - Qb\|$ is also minimal
  • Question: can we compute a $Q$ so that the latter is easier?
SLIDE 43

LEAST SQUARES FOR UPPER TRIANGULAR MATRICES

SLIDES 44–47

  • Suppose a rectangular matrix $R \in \mathbb{C}^{m \times n}$ for $m > n$ is upper triangular:
$$R = \begin{pmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ & & r_{nn} \\ & 0 & \end{pmatrix} = \begin{pmatrix} \hat{R} \\ 0 \end{pmatrix}$$
where $\hat{R}$ is $n \times n$
  • Then $\|Rc - b\|$ is minimized by
$$c = (R^\star R)^{-1} R^\star b = (\hat{R}^\star \hat{R})^{-1} R^\star b = \hat{R}^{-1} \hat{R}^{-\star} \hat{R}^\star \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \hat{R}^{-1} \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}$$
  • Thus least squares can be solved by simple back-substitution (a sketch follows below)
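A sketch of this back-substitution approach with hypothetical data; `scipy.linalg.solve_triangular` performs the triangular solve:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(6)
m, n = 6, 3
R = np.triu(rng.standard_normal((m, n)))  # upper triangular; rows n..m-1 are zero
b = rng.standard_normal(m)

R_hat = R[:n, :]                     # the square n x n block
c = solve_triangular(R_hat, b[:n])   # back-substitution: c = R_hat^{-1} (b_1, ..., b_n)

c_ref, *_ = np.linalg.lstsq(R, b, rcond=None)
print(np.allclose(c, c_ref))  # True
```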

SLIDES 48–50

  • Suppose we have a QR decomposition of a rectangular matrix $A \in \mathbb{C}^{m \times n}$:
$$A = QR$$
where $Q \in \mathbb{C}^{m \times m}$ is unitary and $R \in \mathbb{C}^{m \times n}$ is upper triangular
  • Then minimizing $\|Ac - b\|$ is equivalent to minimizing
$$\|Q^\star A c - Q^\star b\| = \|Rc - Q^\star b\|$$
  • Thus we have reduced solving least squares problems to calculating a QR decomposition (see the sketch after this list). We will see this is much more accurate than constructing $(A^\star A)^{-1} A^\star$ directly
  • Next lecture we will discuss computation of QR decompositions.
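Putting the pieces together, a minimal QR-based least-squares sketch (hypothetical data; NumPy's default reduced QR gives $Q$ with orthonormal columns and a square $R$, so the same back-substitution applies):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

# Reduced QR: Q (6 x 3) has orthonormal columns, R (3 x 3) is upper triangular.
Q, R = np.linalg.qr(A)

# Minimize ||R c - Q* b|| by back-substitution.
c = solve_triangular(R, Q.conj().T @ b)

c_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(c, c_ref))  # True
```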