Lesson 16
LEAST SQUARES
- Let
  $$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$$
  be an $n \times n$ matrix with complex entries (i.e., $A \in \mathbb{C}^{n \times n}$).
- It is helpful to view $A$ as a row vector whose columns are in $\mathbb{C}^n$: $A = (a_1 \,|\, \cdots \,|\, a_n)$.
- Recall that if $A$ is nonsingular, then we can always solve the linear system $Ac = b$, for
  $$b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \quad c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \in \mathbb{C}^n$$
- What if $A$ is singular? Can we find $c$ so that $Ac$ is "close" to $b$?
- In other words, for $c = (c_1, \ldots, c_n)^\top$, we want
  $$c_1 a_1 + \cdots + c_n a_n \approx b$$
- More generally, let $A \in \mathbb{C}^{m \times n}$: $A = (a_1 \,|\, \cdots \,|\, a_n)$ for $a_k \in \mathbb{C}^m$.
- Can we numerically compute $c_1, \ldots, c_n$ so that
  $$c_1 a_1 + \cdots + c_n a_n \approx b\,?$$
- More precisely, we find $c$ such that $\|Ac - b\|_2$ takes its minimal value.
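As a concrete illustration (a minimal sketch with made-up data, not from the slides), here is the problem being posed in NumPy; `np.linalg.lstsq` is used only as a black-box reference for the minimizer:

```python
import numpy as np

# A small made-up overdetermined system: m = 5 equations, n = 2 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))   # columns a_1, a_2 in R^5
b = rng.standard_normal(5)

# The quantity to minimize is the 2-norm of the residual Ac - b.
c_guess = np.zeros(2)
print(np.linalg.norm(A @ c_guess - b, 2))   # residual of a (bad) guess

# NumPy's built-in solver returns the minimizing c for comparison.
c_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(A @ c_star - b, 2))    # the minimal residual norm
```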
- Let's review the real case: $A \in \mathbb{R}^{m \times n}$, $c \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$.
- Minimizing $\|Ac - b\|$ is equivalent to minimizing $\|Ac - b\|^2$.
- We simplify:
  $$\|Ac - b\|^2 = (Ac - b)^\top (Ac - b) = \|Ac\|^2 - (Ac)^\top b - b^\top Ac + \|b\|^2 = c^\top A^\top A c - 2 c^\top A^\top b + \|b\|^2$$
- We can heuristically assume that the minimum is a stationary point of this expression; i.e., we want $0 = \nabla_c \|Ac - b\|^2$.
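A quick numerical sanity check of this expansion (a sketch with made-up data): the gradient of the quadratic form above, $2A^\top A c - 2A^\top b$, should match a finite-difference approximation of $\nabla_c \|Ac - b\|^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
c = rng.standard_normal(3)

f = lambda c: np.linalg.norm(A @ c - b) ** 2

# Analytic gradient from the expansion c'A'Ac - 2c'A'b + |b|^2.
grad = 2 * A.T @ A @ c - 2 * A.T @ b

# Central finite differences for each partial derivative.
eps = 1e-6
fd = np.array([(f(c + eps * e) - f(c - eps * e)) / (2 * eps)
               for e in np.eye(3)])
print(np.allclose(grad, fd, atol=1e-4))  # True
```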
Theorem: Suppose $A$ has linearly independent columns. The vector $c = (A^\top A)^{-1} A^\top b$ is the unique minimizer of $\|Ac - b\|$.
Proof:
- We first remark that $A^\top A$ is positive definite, i.e., $x^\top A^\top A x > 0$ for all (real) $x \neq 0$. (Why?)
- Minimizing $\|Ac - b\|$ is equivalent to minimizing $\|Ac - b\|^2$.
- For all $x$, we have
  $$\begin{aligned}
  \|A(c + x) - b\|^2 &= (c + x)^\top A^\top A (c + x) - 2 (c + x)^\top A^\top b + \|b\|^2 \\
  &= c^\top A^\top A (c + x) + x^\top A^\top A (c + x) - 2 (c + x)^\top A^\top b + \|b\|^2 \\
  &= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top A c - 2 (c + x)^\top A^\top b + \|b\|^2 \\
  &= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top A (A^\top A)^{-1} A^\top b - 2 (c + x)^\top A^\top b + \|b\|^2 \\
  &= x^\top A^\top A x + c^\top A^\top A c + 2 x^\top A^\top b - 2 (c + x)^\top A^\top b + \|b\|^2 \\
  &= x^\top A^\top A x + c^\top A^\top A c - 2 c^\top A^\top b + \|b\|^2
  \end{aligned}$$
- The last three terms are independent of $x$, and by positive definiteness the first term is minimized exactly when $x$ is zero!
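The following sketch (made-up data) mirrors the proof numerically: perturbing the normal-equations solution $c$ by any $x$ increases the squared residual by exactly $x^\top A^\top A x$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3))   # generic, hence independent, columns
b = rng.standard_normal(8)

# Normal-equations solution; solve() is preferred over forming an inverse.
c = np.linalg.solve(A.T @ A, A.T @ b)

# Any perturbation x of c increases the squared residual by x'A'Ax,
# mirroring the expansion in the proof.
x = rng.standard_normal(3)
r0 = np.linalg.norm(A @ c - b) ** 2
r1 = np.linalg.norm(A @ (c + x) - b) ** 2
print(r1 - r0, x @ (A.T @ A) @ x)   # the two numbers agree
```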
GENERAL INNER PRODUCT SPACES
- Consider a row vector of elements $a_1, \ldots, a_n \in V$:
  $$A = (a_1 \,|\, \cdots \,|\, a_n)$$
- We can associate with $A$ a Gram matrix
  $$K = \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix}$$
- In the case where $V = \mathbb{R}^m$, the Gram matrix is precisely the matrix we used in least squares: $K = A^\top A$.
- In the case where $V = \mathbb{C}^m$, we get the similar $K = A^* A$.
Proposition: The Gram matrix $K$ is Hermitian: $K^* = K$.
Proof:
- Follows from the fact that $\langle u, v \rangle = \overline{\langle v, u \rangle}$:
  $$K^* = \begin{pmatrix} \overline{\langle a_1, a_1 \rangle} & \cdots & \overline{\langle a_n, a_1 \rangle} \\ \vdots & \ddots & \vdots \\ \overline{\langle a_1, a_n \rangle} & \cdots & \overline{\langle a_n, a_n \rangle} \end{pmatrix} = \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix} = K$$
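A one-line numerical check of this (a sketch with made-up complex data), using $K = A^*A$ for $V = \mathbb{C}^m$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

K = A.conj().T @ A                 # Gram matrix for V = C^m
print(np.allclose(K, K.conj().T))  # True: K is Hermitian
```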
Proposition: Let $A = (a_1 \,|\, \cdots \,|\, a_n)$ and $K$ denote the associated Gram matrix. For $x, y \in \mathbb{C}^n$ we have $\langle Ay, Ax \rangle = y^* K x$.
Proof:
$$y^* K x = y^* \begin{pmatrix} \langle a_1, a_1 \rangle & \cdots & \langle a_1, a_n \rangle \\ \vdots & \ddots & \vdots \\ \langle a_n, a_1 \rangle & \cdots & \langle a_n, a_n \rangle \end{pmatrix} x = y^* \begin{pmatrix} \langle a_1,\, a_1 x_1 + \cdots + a_n x_n \rangle \\ \vdots \\ \langle a_n,\, a_1 x_1 + \cdots + a_n x_n \rangle \end{pmatrix} = \langle a_1 y_1 + \cdots + a_n y_n,\; a_1 x_1 + \cdots + a_n x_n \rangle = \langle Ay, Ax \rangle$$
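A numerical spot-check of this identity in $V = \mathbb{C}^m$ with $\langle u, v \rangle = u^* v$ (a sketch with made-up data):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

K = A.conj().T @ A
lhs = (A @ y).conj() @ (A @ x)   # <Ay, Ax> with <u, v> = u* v
rhs = y.conj() @ K @ x           # y* K x
print(np.isclose(lhs, rhs))      # True
```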
Proposition: The Gram matrix $K$ is positive semi-definite. If the vectors $a_1, \ldots, a_n$ are linearly independent, then the Gram matrix is positive definite.
Proof:
- $x^* K x = \langle Ax, Ax \rangle = \|Ax\|^2 \geq 0$
- Linear independence of $a_1, \ldots, a_n$ shows that $Ax = a_1 x_1 + \cdots + a_n x_n = 0$ if and only if $x = 0$.
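A sketch (made-up data) checking both claims via the eigenvalues of $K$: they are nonnegative in general, and a dependent column produces a zero eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))      # generic, hence independent, columns
print(np.linalg.eigvalsh(A.T @ A))   # all eigenvalues strictly positive

# Make the third column a combination of the first two: K becomes singular.
A2 = np.column_stack([A[:, 0], A[:, 1], A[:, 0] + A[:, 1]])
print(np.linalg.eigvalsh(A2.T @ A2))  # smallest eigenvalue is (numerically) 0
```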
- Two more useful relationships:
  $$\langle Ax, b \rangle = \Big\langle \sum_{k=1}^n x_k a_k,\, b \Big\rangle = \sum_{k=1}^n \bar{x}_k \langle a_k, b \rangle = x^* \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}$$
  $$\langle b, Ax \rangle = \Big\langle b,\, \sum_{k=1}^n x_k a_k \Big\rangle = \sum_{k=1}^n x_k \langle b, a_k \rangle = \sum_{k=1}^n x_k \overline{\langle a_k, b \rangle} = \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}^* x$$
Theorem: Suppose the columns of $A = (a_1 \,|\, \cdots \,|\, a_n)$ are linearly independent in an inner product space $V$. Let $K$ be the associated Gram matrix. The vector
$$c = K^{-1} \begin{pmatrix} \langle a_1, b \rangle \\ \vdots \\ \langle a_n, b \rangle \end{pmatrix}$$
is the unique minimizer of $\|Ac - b\|$ (where the norm is the norm associated with the inner product).
Proof: Exercise
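To see the theorem at work outside of $\mathbb{C}^m$, here is a sketch of continuous least squares on $V = C[0,1]$ with $\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt$, fitting $b(t) = e^t$ by a quadratic. The monomial basis and the use of `scipy.integrate.quad` are illustrative choices, not from the slides:

```python
import numpy as np
from scipy.integrate import quad

# Fit b(t) = exp(t) by span{1, t, t^2} in the L^2(0,1) inner product.
# The a_k are monomials, so the Gram matrix has entries
# <t^i, t^j> = int_0^1 t^(i+j) dt = 1/(i+j+1)  (the 3x3 Hilbert matrix).
n = 3
K = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

# Right-hand side: <a_k, b> = int_0^1 t^k exp(t) dt, via quadrature.
rhs = np.array([quad(lambda t, k=k: t**k * np.exp(t), 0, 1)[0]
                for k in range(n)])

c = np.linalg.solve(K, rhs)   # c = K^{-1} (<a_1,b>, ..., <a_n,b>)^T
print(c)                      # coefficients of the best quadratic fit
```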
ORTHONORMAL VECTORS AND CALCULATING LEAST SQUARES APPROXIMATIONS
- A set of nonzero vectors $v_1, \ldots, v_n$ is called orthogonal if $\langle v_i, v_j \rangle = 0$ whenever $i \neq j$.
- They are called orthonormal if they are orthogonal and all vectors are of unit norm: $1 = \|v_i\|$, or equivalently, $\langle v_i, v_i \rangle = 1$.
- The Gram matrix of orthonormal vectors is the identity!
  $$K = \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \ddots & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix} = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix} = I$$
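A quick check (sketch): `np.linalg.qr` produces orthonormal columns, and their Gram matrix is indeed the identity:

```python
import numpy as np

rng = np.random.default_rng(6)
# Orthonormalize some random columns (reduced QR gives orthonormal q_k).
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))

K = Q.T @ Q                        # Gram matrix of orthonormal vectors
print(np.allclose(K, np.eye(3)))   # True: K is the identity
```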
ORTHOGONAL AND UNITARY MATRICES
- A matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal provided that $Q^\top Q = I$.
- A matrix $Q \in \mathbb{C}^{n \times n}$ is unitary provided that $Q^* Q = I$. Note that every orthogonal matrix is also unitary, but not vice versa.
- In other words, $Q^{-1} = Q^*$.
- Every unitary matrix also satisfies $Q Q^* = I$.
- Multiplying a vector by a unitary matrix does not change the 2-norm:
  $$\|Qv\|^2 = (Qv)^* (Qv) = v^* Q^* Q v = v^* v = \|v\|^2$$
- Thus we view unitary matrices as generalizations of rotations and reflections.
- Exercise: show that every rotation and reflection in $\mathbb{R}^2$ corresponds to an orthogonal matrix.
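For instance (a sketch relating to the exercise above), a rotation of $\mathbb{R}^2$ is orthogonal and preserves the 2-norm:

```python
import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation in R^2

v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(Q @ v))   # both 5.0
print(np.allclose(Q.T @ Q, np.eye(2)))            # Q is orthogonal
```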
- This means that, for least squares, $\|Av - b\|$ is minimal if and only if $\|QAv - Qb\|$ is also minimal.
- Question: can we compute a $Q$ so that the latter is easier?
LEAST SQUARES FOR UPPER TRIANGULAR MATRICES
- Suppose a rectangular matrix $R \in \mathbb{C}^{m \times n}$ for $m > n$ is upper triangular:
  $$R = \begin{pmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ & & r_{nn} \\ & \mathbf{0} & \end{pmatrix} = \begin{pmatrix} \hat{R} \\ \mathbf{0} \end{pmatrix}$$
  where $\hat{R}$ is $n \times n$.
- Then $\|Rc - b\|$ is minimized by
  $$c = (R^* R)^{-1} R^* b = (\hat{R}^* \hat{R})^{-1} \hat{R}^* \hat{b} = \hat{R}^{-1} \hat{R}^{-*} \hat{R}^* \hat{b} = \hat{R}^{-1} \hat{b},$$
  where $\hat{b} = (b_1, \ldots, b_n)^\top$ consists of the first $n$ entries of $b$.
- Thus least squares can be solved by simple back-substitution.
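A sketch of this with made-up data, using `scipy.linalg.solve_triangular` for the back-substitution; note that only $\hat{R}$ and $\hat{b}$ enter the solve:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(7)
m, n = 6, 3
R = np.triu(rng.standard_normal((m, n)))   # upper triangular, m > n
b = rng.standard_normal(m)

# Only the top n x n block R_hat and the first n entries of b matter.
c = solve_triangular(R[:n, :n], b[:n])     # back-substitution

# Agrees with a general least squares solver.
c_ref, *_ = np.linalg.lstsq(R, b, rcond=None)
print(np.allclose(c, c_ref))               # True
```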
- Suppose we have a QR decomposition of a rectangular matrix $A \in \mathbb{C}^{m \times n}$:
  $$A = QR$$
  where $Q \in \mathbb{C}^{m \times m}$ is unitary and $R \in \mathbb{C}^{m \times n}$ is upper triangular.
- Then minimizing $\|Ac - b\|$ is equivalent to minimizing
  $$\|Q^* A c - Q^* b\| = \|Rc - Q^* b\|$$
- Thus we have reduced solving least squares problems to calculating a QR decomposition. We will see this is much more accurate than constructing and applying $(A^* A)^{-1} A^*$.
- Next lecture we will discuss computation of QR decompositions.
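Putting the pieces together, here is a sketch (made-up data) of QR-based least squares using NumPy's built-in `qr`; it agrees with the normal-equations solution while avoiding the explicit formation of $A^* A$:

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

# Full QR: Q is 6x6 unitary (here real orthogonal), R is 6x3 triangular.
Q, R = np.linalg.qr(A, mode='complete')

# Minimize ||Rc - Q*b||: back-substitute on the top n x n block of R.
n = A.shape[1]
Qb = Q.T @ b
c = np.linalg.solve(R[:n, :n], Qb[:n])

# Same answer as the normal equations (but better conditioned in general).
c_ne = np.linalg.solve(A.T @ A, A.T @ b)
print(np.allclose(c, c_ne))   # True
```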