Álgebra Linear e Aplicações — LINEAR EQUATIONS — PowerPoint PPT Presentation
SLIDE 1

Álgebra Linear e Aplicações

SLIDE 2

LINEAR EQUATIONS

SLIDE 3

Start with a few examples

  • One way of designing a reconstruction kernel
  • Cubic B-Spline
  • Leads to linear equations
  • Two ways of computing the sums of squares
  • Both lead to linear equations
  • One simpler than the other
SLIDE 4

Desirable properties

  • A smooth, piecewise polynomial function

The kernel is assembled from four pieces, $P_3(x)$, $P_3(-x)$, $Q_3(x)$, $Q_3(-x)$, on $[-2, 2]$, with

$$P_3(x) = ax^3 + bx^2 + cx + d, \qquad Q_3(x) = ex^3 + fx^2 + gx + h$$

subject to

$$P_3'(0) = 0, \qquad Q_3(2) = Q_3'(2) = Q_3''(2) = 0,$$
$$P_3(1) = Q_3(1), \qquad P_3'(1) = Q_3'(1), \qquad P_3''(1) = Q_3''(1),$$
$$\int_0^1 P_3(x)\,dx + \int_1^2 Q_3(x)\,dx = \frac{1}{2}.$$

SLIDE 5

Resulting linear equations

  • Substituting, we get the linear system

$$\begin{aligned}
c &= 0\\
8e + 4f + 2g + h &= 0\\
12e + 4f + g &= 0\\
12e + 2f &= 0\\
a + b + c + d - e - f - g - h &= 0\\
3a + 2b + c - 3e - 2f - g &= 0\\
6a + 2b - 6e - 2f &= 0\\
\tfrac{1}{4}a + \tfrac{1}{3}b + \tfrac{1}{2}c + d + \tfrac{15}{4}e + \tfrac{7}{3}f + \tfrac{3}{2}g + h &= \tfrac{1}{2}
\end{aligned}$$

SLIDE 6

Result: the Cubic B-Spline function

  • Solving, we get

$$P_3(x) = \frac{x^3}{2} - x^2 + \frac{2}{3}, \qquad Q_3(x) = -\frac{x^3}{6} + x^2 - 2x + \frac{4}{3}$$

SLIDE 7

Sum of squares

  • Find a formula for the sum of squares
  • Seems to be a cubic polynomial
  • Use first 4 results to solve for a, b, c, and d

$$S_2(n) = \sum_{i=1}^{n} i^2, \qquad P_2(n) = a n^3 + b n^2 + c n + d$$

$$\begin{aligned}
P_2(0) &= d = 0 = S_2(0)\\
P_2(1) &= a + b + c + d = 1 = S_2(1)\\
P_2(2) &= 8a + 4b + 2c + d = 5 = S_2(2)\\
P_2(3) &= 27a + 9b + 3c + d = 14 = S_2(3)
\end{aligned}$$

SLIDE 8

Even simpler!

  • On one hand
  • On the other hand
  • Equating coefficient by coefficient

$$P_2(n+1) - P_2(n) = 3a\,n^2 + (3a + 2b)\,n + (a + b + c)$$
$$S_2(n+1) - S_2(n) = (n+1)^2 = n^2 + 2n + 1$$
$$3a = 1, \qquad 3a + 2b = 2, \qquad a + b + c = 1, \qquad d = 0$$
$$P_2(n) = \frac{n(n+1)(2n+1)}{6}$$
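Either route can be verified with exact rational arithmetic; a small sketch using the coefficient equations from this slide:

```python
from fractions import Fraction as F

# Coefficients from equating P2(n+1) - P2(n) with (n+1)^2 term by term
a = F(1, 3)              # 3a = 1
b = (2 - 3*a) / 2        # 3a + 2b = 2
c = 1 - a - b            # a + b + c = 1
d = F(0)                 # d = 0

P2 = lambda n: a*n**3 + b*n**2 + c*n + d
for n in range(50):
    # agrees with the brute-force sum and with n(n+1)(2n+1)/6
    assert P2(n) == sum(i*i for i in range(1, n + 1)) == n*(n + 1)*(2*n + 1)//6
```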

SLIDE 9

Generalizing

  • Formulate general problem
  • Propose a solution strategy
  • Elimination
  • Back-substitution
  • Propose a convenient notation
  • Investigate the cost of solution
SLIDE 10

Systems of linear equations

  • General problem formulation
  • aij are the coefficients
  • bi is the right-hand side
  • xi are the variables or unknowns
  • m equations in n unknowns

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

SLIDE 11

Solutions to linear equations

  • There are only three possibilities
  • System has unique solution
  • System has no solution
  • System has infinitely many solutions
  • Easy to understand geometrically
SLIDE 12

The row-view of linear systems

  • Imagine m = 2 and n = 2
  • Each equation represents a line in R2
  • Planes in 3D, hyper-planes in higher dimensions

$$a_{11}x + a_{12}y = b_1, \qquad a_{21}x + a_{22}y = b_2$$

SLIDE 13

How to solve a linear system?

  • Transform to an equivalent system
  • Same solution set
  • Easier to solve
  • Method known as Gaussian Elimination

$$\begin{aligned} 2x + y + z &= 1\\ 6x + 2y + z &= -1\\ -2x + 2y + z &= 7 \end{aligned}
\qquad\longrightarrow\qquad
\begin{aligned} 2x + y + z &= 1\\ -y - 2z &= -4\\ -4z &= -4 \end{aligned}$$

SLIDE 14

Elementary operations

  • Consider the following row operations:
  • Exchange two rows
  • Multiply a row by a non-zero constant
  • Add a multiple of a row into another
  • Can these operations change the solution set?
  • Are these operations reversible?
SLIDE 15

Elimination step-by-step

  • Let each row i be denoted by Ei
  • Use Ei to eliminate variable i from Ei+1 to En

$$\begin{aligned} 2x + y + z &= 1\\ 6x + 2y + z &= -1\\ -2x + 2y + z &= 7 \end{aligned}
\;\xrightarrow{E_2 - 3E_1}\;
\begin{aligned} 2x + y + z &= 1\\ -y - 2z &= -4\\ -2x + 2y + z &= 7 \end{aligned}
\;\xrightarrow{E_3 + E_1}\;
\begin{aligned} 2x + y + z &= 1\\ -y - 2z &= -4\\ 3y + 2z &= 8 \end{aligned}
\;\xrightarrow{E_3 + 3E_2}\;
\begin{aligned} 2x + y + z &= 1\\ -y - 2z &= -4\\ -4z &= -4 \end{aligned}$$

SLIDE 16

Back substitution

  • Start from triangularized system
  • Solve for z in E3
  • Solve for y in E2
  • Solve for x in E1

$$\begin{aligned} 2x + y + z &= 1\\ -y - 2z &= -4\\ -4z &= -4 \end{aligned}$$

$$z = 1, \qquad y = 4 - 2z = 4 - 2(1) = 2, \qquad x = \tfrac{1}{2}(1 - y - z) = \tfrac{1}{2}(1 - 2 - 1) = -1$$
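The two phases above can be sketched as one short routine (plain Python, no pivoting — pivoting comes later in the lecture); the example is the system from this slide:

```python
def solve(A, b):
    """Gaussian elimination followed by back-substitution (no pivoting)."""
    n = len(A)
    A = [row[:] for row in A]; b = b[:]   # work on copies
    # Elimination: use row i to zero out column i below the pivot
    for i in range(n):
        for k in range(i + 1, n):
            m = A[k][i] / A[i][i]
            for j in range(i, n):
                A[k][j] -= m * A[i][j]
            b[k] -= m * b[i]
    # Back-substitution: solve from the last equation upward
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

print(solve([[2, 1, 1], [6, 2, 1], [-2, 2, 1]], [1, -1, 7]))  # [-1.0, 2.0, 1.0]
```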

SLIDE 17

Simplifying notation

  • Variable names and operators are redundant
  • Coefficient matrix A and right-hand side b
  • Augmented matrix [A|b]
  • A matrix is a rectangular array of scalars
  • A scalar is a real or complex number

  2 1 1 1 6 2 1 −1 −2 2 1 7   2x + y + z = 1 6x + 2y + z = −1 −2x + 2y + z = 7

SLIDE 18

Elimination on augmented matrix

  2 1 1 1 6 2 1 −1 −2 2 1 7   2x + y + z = 1 − y − 2z = −4 − 4z = −4 2x + y + z = 1 6x + 2y + z = −1 −2x + 2y + z = 7   2 1 1 1 −1 −2 −4 3 2 8   R2 − 3R1 R3 + R1   2 1 1 1 −1 −2 −4 −4 −4   R3 + 3R2

SLIDE 19

Back-substitution on augmented matrix

$$\left[\begin{array}{ccc|c} 2 & 1 & 1 & 1\\ & -1 & -2 & -4\\ & & -4 & -4 \end{array}\right]
\xrightarrow{R_3/(-4)}
\left[\begin{array}{ccc|c} 2 & 1 & 1 & 1\\ & -1 & -2 & -4\\ & & 1 & 1 \end{array}\right]
\xrightarrow{\substack{R_1 - R_3\\ R_2 + 2R_3}}
\left[\begin{array}{ccc|c} 2 & 1 & & 0\\ & -1 & & -2\\ & & 1 & 1 \end{array}\right]
\xrightarrow{R_2/(-1)}
\left[\begin{array}{ccc|c} 2 & 1 & & 0\\ & 1 & & 2\\ & & 1 & 1 \end{array}\right]
\xrightarrow{R_1 - R_2}
\left[\begin{array}{ccc|c} 2 & & & -2\\ & 1 & & 2\\ & & 1 & 1 \end{array}\right]
\xrightarrow{R_1/2}
\left[\begin{array}{ccc|c} 1 & & & -1\\ & 1 & & 2\\ & & 1 & 1 \end{array}\right]$$

$$x = -1, \qquad y = 2, \qquad z = 1$$

SLIDE 20

Origins of method

  • Chinese book Chiu-chang Suan-shu
  • Nine chapters on arithmetic
  • From around 200 B.C.
  • Using a “counting board”
  • Row manipulations
  • Made its way to Japan then Europe
  • Gauss was a user of the method, not its inventor
SLIDE 21

How much computation?

  • Assume an n × n system of equations
  • Count number of additions/subtractions
  • Count number of multiplications/divisions
  • The elimination step is

$$\left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1\\
a_{21} & a_{22} & \cdots & a_{2n} & b_2\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c}
t_{11} & t_{12} & \cdots & t_{1n} & c_1\\
 & t_{22} & \cdots & t_{2n} & c_2\\
 & & \ddots & \vdots & \vdots\\
 & & & t_{nn} & c_n
\end{array}\right]$$

SLIDE 22

Operation count of elimination step

Pivot | Multiplications & Divisions | Additions & Subtractions
1     | (n−1)(1+(n−1)+1)            | (n−1)((n−1)+1)
2     | (n−2)(1+(n−2)+1)            | (n−2)((n−2)+1)
…     | …                           | …
n−1   | 1(1+(1)+1)                  | 1((1)+1)

Totals: $\sum_{i=1}^{n-1} i(i+2)$ multiplications/divisions and $\sum_{i=1}^{n-1} i(i+1)$ additions/subtractions.

SLIDE 23

How much computation?

  • The back-substitution step is
  • And the solution is given by

$$\left[\begin{array}{cccc|c}
t_{11} & t_{12} & \cdots & t_{1n} & c_1\\
 & t_{22} & \cdots & t_{2n} & c_2\\
 & & \ddots & \vdots & \vdots\\
 & & & t_{nn} & c_n
\end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c}
1 & & & & s_1\\
 & 1 & & & s_2\\
 & & \ddots & & \vdots\\
 & & & 1 & s_n
\end{array}\right]
\qquad
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} s_1\\ s_2\\ \vdots\\ s_n \end{bmatrix}$$

SLIDE 24

Operation count of back-substitution

Pivot | Elimination Mult & Div | Elimination Add & Sub | Back-substitution Mult & Div | Back-substitution Add & Sub
1     | (n−1)(1+(n−1)+1)       | (n−1)((n−1)+1)        | 1+(0)                        | 0
2     | (n−2)(1+(n−2)+1)       | (n−2)((n−2)+1)        | 1+(1)                        | 1
…     | …                      | …                     | …                            | …
n−1   | 1(1+(1)+1)             | 1((1)+1)              | 1+(n−1)                      | n−1

Totals: elimination contributes $\sum_{i=1}^{n-1} i(i+2)$ multiplications/divisions and $\sum_{i=1}^{n-1} i(i+1)$ additions/subtractions; back-substitution contributes $\sum_{i=1}^{n} i$ and $\sum_{i=1}^{n-1} i$.

SLIDE 25

Total operation count of Gaussian Elimination

  • Number of multiplications and divisions: $\dfrac{n^3}{3} + n^2 - \dfrac{n}{3}$
  • Number of additions and subtractions: $\dfrac{n^3}{3} + \dfrac{n^2}{2} - \dfrac{5n}{6}$
  • So about $\dfrac{n^3}{3}$ of each operation (n is large!)
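These totals can be cross-checked by counting operations pivot by pivot, following the two tables above (a small sketch, plain Python):

```python
# Count multiplications/divisions and additions/subtractions performed by
# elimination + back-substitution on an n x n system, then compare with the
# closed-form totals n^3/3 + n^2 - n/3 and n^3/3 + n^2/2 - 5n/6.
def op_count(n):
    muls = adds = 0
    for i in range(n - 1):               # elimination, pivot row i
        rows_below = n - 1 - i
        per_row = 1 + (n - 1 - i) + 1    # multiplier + row update + rhs update
        muls += rows_below * per_row
        adds += rows_below * (per_row - 1)
    for i in range(n):                   # back-substitution for x_i
        muls += 1 + (n - 1 - i)          # one division + products with knowns
        adds += n - 1 - i                # subtractions of known terms
    return muls, adds

for n in range(1, 30):
    muls, adds = op_count(n)
    assert muls == (n**3 - n)//3 + n**2           # n^3/3 + n^2 - n/3
    assert adds == (2*n**3 + 3*n**2 - 5*n)//6     # n^3/3 + n^2/2 - 5n/6
```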

SLIDE 26

Gauss-Jordan method

  • Try to avoid back-substitution
  • Modify Gaussian Elimination
  • Scale Ei so pivot is 1
  • Annihilate all terms above pivot as well
  • Intermediate state would look like

$$\left[\begin{array}{ccccc|c}
1 & & c_{13} & \cdots & c_{1n} & d_1\\
 & 1 & c_{23} & \cdots & c_{2n} & d_2\\
 & & c_{33} & \cdots & c_{3n} & d_3\\
 & & \vdots & \ddots & \vdots & \vdots\\
 & & c_{n3} & \cdots & c_{nn} & d_n
\end{array}\right]$$

SLIDE 27

Total operation count of the Gauss-Jordan method

  • Number of multiplications and divisions: $\dfrac{n^3}{2} + \dfrac{n^2}{2}$
  • Number of additions and subtractions: $\dfrac{n^3}{2} - \dfrac{n^2}{2}$
  • So about $\dfrac{n^3}{2}$ of each operation
  • Worse than Gaussian Elimination! Why?

SLIDE 28

Why do we care about cost?

  • Are there large linear systems?
  • Yes, very large systems
  • Look at two examples
  • Signal reconstruction
  • With our Cubic B-Spline!
  • Two-point boundary problem
  • Solve ODE numerically through discretization
SLIDE 29

Signal reconstruction

  • Assume you have sampled a function f at a range of integer positions: f(0), f(1), f(2), …
  • Now you want to reconstruct the value of the function at all reals in a finite range
  • Use a compactly supported generating function $\varphi(x)$ and define

$$\tilde{f}(x) = \sum_i f(i)\,\varphi(x - i)$$

SLIDE 30

Generating functions

  • Typical generating functions
  • Typical reconstructions

[Figures: the nearest/box, linear/tent, and cubic B-Spline generating functions, each shown with a typical reconstruction.]

SLIDE 31

B-Spline interpolation

  • Some generating functions do not interpolate
  • What are the interpolation constraints?

[Figure: B-Spline reconstruction from the raw samples does not pass through them.]

Using the samples directly gives $\tilde{f}(x) = \sum_i f(i)\,\varphi(x - i)$, which need not interpolate. Instead, solve for coefficients $c_j$ such that

$$\tilde{f}(x) = \sum_j c_j\,\varphi(x - j), \qquad \tilde{f}(i) = \sum_j c_j\,\varphi(i - j) = f(i).$$
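From the pieces found earlier, the cubic B-Spline has $\varphi(0) = P_3(0) = 2/3$, $\varphi(\pm 1) = P_3(1) = 1/6$, and $\varphi$ vanishes for $|x| \ge 2$, so each interpolation constraint couples only three coefficients and the system is tridiagonal. A sketch (plain Python; treating coefficients outside the sample range as zero is my boundary assumption, not the slides'):

```python
def bspline_coeffs(f):
    """Solve (1/6)c[i-1] + (2/3)c[i] + (1/6)c[i+1] = f[i] by the usual
    tridiagonal forward-elimination / back-substitution sweep."""
    n = len(f)
    diag, off = [2/3] * n, 1/6
    rhs = list(f)
    for i in range(1, n):             # eliminate the sub-diagonal
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    c = [0.0] * n
    c[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):    # back-substitute
        c[i] = (rhs[i] - off * c[i + 1]) / diag[i]
    return c

f = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0]   # sample values f(0)..f(5) (made up)
c = bspline_coeffs(f)
# Check the interpolation constraints sum_j c_j * phi(i-j) = f(i)
phi = lambda t: 2/3 if t == 0 else (1/6 if abs(t) == 1 else 0.0)
for i in range(len(f)):
    assert abs(sum(c[j] * phi(i - j) for j in range(len(f))) - f[i]) < 1e-9
```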

SLIDE 32

Results in large linear system

  • Digital music is sampled air pressure over time
  • Typical CD track runs for 5 minutes @ 44 kHz
  • Leads to 44,000 × 60 × 5 = 13,200,000 samples
  • Each sample leads to one equation
  • How much memory would we need?
  • Does sparsity help?
  • Is there any hope in solving this linear system?
  • Does matrix structure help?
SLIDE 33

Two-point boundary value

  • Consider the ordinary differential equation

$$u(t)\,y''(t) + v(t)\,y'(t) + w(t)\,y(t) = f(t), \qquad y(a) = \alpha, \quad y(b) = \beta$$

  • We want to find y(t), for t in [a, b]
  • u, v, w, and f are known in [a, b]
  • It may not be possible to express y in terms of elementary functions
  • Use numerical methods

SLIDE 34

Discretize domain and derivatives

  • Approximate y at many discrete positions in [a, b]
  • Use Taylor series to approximate derivatives
  • Smaller step size h improves approximation

$$t_0 = a, \quad t_1 = a + h, \quad t_2 = a + 2h, \quad \ldots, \quad t_n = a + nh, \quad t_{n+1} = b$$

$$y'(t_i) = \frac{y(t_i + h) - y(t_i - h)}{2h} + O(h^2), \qquad y''(t_i) = \frac{y(t_i + h) - 2y(t_i) + y(t_i - h)}{h^2} + O(h^2)$$

SLIDE 35

Results in large linear system

  • Discretizing and substituting approximations
  • We reach

$$u(t)\,y''(t) + v(t)\,y'(t) + w(t)\,y(t) = f(t), \qquad y'(t_i) \approx \frac{y_{i+1} - y_{i-1}}{2h}, \qquad y''(t_i) \approx \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2}$$

$$u_i\,\frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} + v_i\,\frac{y_{i+1} - y_{i-1}}{2h} + w_i\,y_i = f_i$$

Multiplying through by $2h^2$:

$$(2u_i + h v_i)\,y_{i+1} + (2h^2 w_i - 4u_i)\,y_i + (2u_i - h v_i)\,y_{i-1} = 2h^2 f_i$$
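The discrete equations form a tridiagonal system in $y_1, \ldots, y_n$ (the known boundary values $y_0 = \alpha$ and $y_{n+1} = \beta$ move to the right-hand side). A sketch (plain Python; the test problem $y'' = 2$, $y(0)=0$, $y(1)=1$, with solution $y = t^2$, is my choice for illustration, not from the slides):

```python
def solve_bvp(u, v, w, f, a, b, alpha, beta, n):
    """Discretize u y'' + v y' + w y = f on [a, b] with n interior points."""
    h = (b - a) / (n + 1)
    t = [a + (i + 1) * h for i in range(n)]
    # (2u+hv) y[i+1] + (2h^2 w - 4u) y[i] + (2u-hv) y[i-1] = 2h^2 f
    lo  = [2*u(ti) - h*v(ti) for ti in t]
    mid = [2*h*h*w(ti) - 4*u(ti) for ti in t]
    hi  = [2*u(ti) + h*v(ti) for ti in t]
    rhs = [2*h*h*f(ti) for ti in t]
    rhs[0]  -= lo[0]  * alpha        # known boundary values go to the rhs
    rhs[-1] -= hi[-1] * beta
    for i in range(1, n):            # tridiagonal forward elimination
        m = lo[i] / mid[i - 1]
        mid[i] -= m * hi[i - 1]
        rhs[i] -= m * rhs[i - 1]
    y = [0.0] * n
    y[-1] = rhs[-1] / mid[-1]
    for i in range(n - 2, -1, -1):   # back-substitution
        y[i] = (rhs[i] - hi[i] * y[i + 1]) / mid[i]
    return t, y

# Test problem: y'' = 2, y(0) = 0, y(1) = 1  =>  y(t) = t^2
t, y = solve_bvp(lambda t: 1.0, lambda t: 0.0, lambda t: 0.0, lambda t: 2.0,
                 0.0, 1.0, 0.0, 1.0, n=50)
assert max(abs(yi - ti*ti) for ti, yi in zip(t, y)) < 1e-10
```

Because the central differences are exact for quadratics, the discrete solution matches $t^2$ up to roundoff here; for general problems the error shrinks like $h^2$.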

SLIDE 36

Machine problem #1

  • Use Matlab to solve boundary problem
  • See the effect of the step size on precision
  • See the effect of dense/sparse matrices
  • On system solution
  • On memory
  • Compare with built-in solver
SLIDE 37

Lecture 2

  • Number representation by computer
  • and limitations
  • Consequences on Gaussian Elimination
  • and solutions
SLIDE 38

FLOATING-POINT REPRESENTATION

SLIDE 39

Floating-point numbers

  • All real numbers represented in a computer?
  • Impossible, of course. Finite memory!
  • Can only represent a finite set of values
  • Widely accepted IEEE floating-point standard
  • Formats, rounding, arithmetic operations
  • Represented by familiar scientific notation
  • Except, in binary…

$$6.02214179 \times 10^{23} \qquad -1.602176487 \times 10^{-19}$$

SLIDE 40

Binary floating-point

  • One sign bit
  • w exponent bits
  • t mantissa or fraction or significand bits

[Layout: sign bit s, then w exponent bits, then t fraction bits]

$$(-1)^s \times (1.b_1 b_2 b_3 \cdots b_t)_2 \times 2^{e - z}, \qquad m = (1.b_1 b_2 b_3 \cdots b_t)_2, \qquad z = 2^{w-1} - 1$$

SLIDE 41

Representation details

  • Normalized representation for mantissa m
  • Ensures unique representation for mantissa
  • First bit is implicitly set to 1
  • Get one extra significant bit!
  • Excess encoding for exponent e
  • Allows for positive and negative exponents
  • Therefore large and small magnitudes
  • Subtract $z = 2^{w-1} - 1$ from the encoded exponent

$$(1)_2 \le m < (10)_2, \qquad (-1)^s \times (1.b_1 b_2 b_3 \cdots b_t)_2 \times 2^{e - z}, \qquad z = 2^{w-1} - 1$$

SLIDE 42

Special values

  • Largest representable exponent is reserved
  • m = 0 represents ±Inf
  • m ≠ 0 represents NaN (Not-a-Number)
  • NaN propagates with any operation
  • ex: NaN + x = NaN
  • Other special operations

n ÷ (±Inf) = ±0          ±0 ÷ ±0 = NaN
n ÷ (±0) = ±Inf          ±Inf ÷ ±Inf = NaN
±Inf × ±Inf = ±Inf       ±Inf × 0 = NaN
Inf + Inf = Inf          Inf − Inf = NaN
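Most of these rules can be observed directly in Python, whose floats follow IEEE binary64 (note that plain Python raises an exception on division by literal zero, so only the Inf cases are shown):

```python
import math

inf, nan = math.inf, math.nan

assert 1.0 / inf == 0.0        # n / Inf = 0
assert inf + inf == inf        # Inf + Inf = Inf
assert inf * inf == inf        # Inf * Inf = Inf
assert math.isnan(inf - inf)   # Inf - Inf = NaN
assert math.isnan(inf / inf)   # Inf / Inf = NaN
assert math.isnan(inf * 0.0)   # Inf * 0 = NaN
assert math.isnan(nan + 1.0)   # NaN propagates with any operation
assert nan != nan              # NaN compares unequal even to itself
```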

SLIDE 43

Denormalization

  • What happens when exponent is the smallest?
  • Normalization causes abrupt underflow
  • From the smallest normalized magnitude directly to 0
  • Introduce denormalized numbers
  • When exponent is smallest, no implicit leading 1
  • PS: Same number of values in each [2i, 2i+1]

[Figure: number line near zero — without denormals there is an abrupt gap between 0 and the smallest normalized number; denormals fill that gap with evenly spaced values.]

SLIDE 44

Summary of representation

sign | exponent       | fraction | meaning
±    | not 0…0 or 1…1 | any      | normalized
±    | 0…0            | not 0…0  | denormalized
±    | 0…0            | 0…0      | zero
±    | 1…1            | 0…0      | Inf
±    | 1…1            | not 0…0  | NaN

SLIDE 45

Typical floating-point formats

                   | Single precision (binary32)
Total bits         | 32
Exponent bits      | 8
Mantissa bits      | 23
Exponent range     | −126…127
Smallest magnitude | ≈ 10⁻⁴⁵
Decimal range      | ≈ −10³⁸ to 10³⁸
Decimal precision  | ≈ 7 digits

SLIDE 46

Typical floating-point formats

                   | Single precision (binary32) | Double precision (binary64)
Total bits         | 32              | 64
Exponent bits      | 8               | 11
Mantissa bits      | 23              | 52
Exponent range     | −126…127        | −1022…1023
Smallest magnitude | ≈ 10⁻⁴⁵         | ≈ 10⁻³²⁴
Decimal range      | ≈ −10³⁸ to 10³⁸ | ≈ −10³⁰⁸ to 10³⁰⁸
Decimal precision  | ≈ 7 digits      | ≈ 16 digits

SLIDE 47

Conversion between bases

  • Integer part
  • Repeatedly divide by base
  • Keep remainders
  • Stop when dividend is zero
  • Read remainders in reverse order
  • Fractional part
  • Repeatedly multiply by base
  • Keep integer parts
  • Stop when maximum precision reached (or zero)
  • Read integer parts in direct order
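Both procedures can be sketched directly (plain Python; the `bits` cap on the fractional expansion is my choice):

```python
def int_to_base(n, base=2):
    """Repeatedly divide by the base, keeping remainders (read in reverse)."""
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append(str(r))
    return "".join(reversed(digits)) or "0"

def frac_to_base(x, base=2, bits=24):
    """Repeatedly multiply by the base, keeping integer parts (in order)."""
    digits = []
    while x > 0 and len(digits) < bits:
        x *= base
        d = int(x)
        digits.append(str(d))
        x -= d
    return "".join(digits)

print(int_to_base(1313), frac_to_base(0.3125))  # 10100100001 0101
```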
SLIDE 48

Conversion to floating-point

  • Convert −1313.3125 to single precision
  • Integer part is 1313 = (10100100001)₂
  • Fractional part is .3125 = (.0101)₂
  • Therefore 1313.3125 = (10100100001.0101)₂
  • Normalization gives 1.01001000010101 × 2¹⁰
  • Mantissa is 01001000010101000000000
  • Exponent is 10 + 127 = 137 = (10001001)₂
  • Sign is 1

1 10001001 01001000010101000000000 = C4A42A00₁₆
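The hand conversion can be cross-checked against the machine's own encoding (Python sketch using the standard `struct` module):

```python
import struct

# Pack -1313.3125 as IEEE single precision (big-endian) and inspect the bytes
raw = struct.pack('>f', -1313.3125)
assert raw.hex().upper() == 'C4A42A00'

bits = int.from_bytes(raw, 'big')
assert bits >> 31 == 1                               # sign bit
assert (bits >> 23) & 0xFF == 0b10001001             # exponent 137 = 10 + 127
assert bits & 0x7FFFFF == 0b01001000010101000000000  # 23 mantissa bits
```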

SLIDE 49


Rounding, overflow, and underflow

  • Let’s try representing 0.1 in floating point
  • Mantissa is 0.0001100110011001100…
  • No exact representation possible
  • Can happen after each arithmetic operation!
  • Errors can grow and dominate results
  • This problem often happens in practice

[Figure: on the real line, tiny results flush toward ≈ 0, mid-range results are rounded (e.g. ≈ 2.5), and huge results overflow toward ≈ Inf.]
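The effect is easy to demonstrate (Python, binary64):

```python
# 0.1 has no finite binary expansion, so each copy carries a small error,
# and the errors survive the additions
s = sum([0.1] * 10)
print(s)                   # 0.9999999999999999
assert s != 1.0
assert abs(s - 1.0) < 1e-15
```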

SLIDE 50

Effect on addition accuracy

  • May not be exact even when exponents are equal
  • Ex: 1.1010 + 1.0101 = 1.01111 × 2¹ → 1.1000 × 2¹
  • Problem is worse when exponents differ
  • Must pre-shift before adding
  • Biggest source of rounding error
  • Ex: 1.0000 + 1.0000 × 2⁻⁵ = 1.00001 → 1.0000
  • When adding a large list of numbers
  • May need to sort before adding!
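Order matters when accumulating many small terms into a large one, which is why sorting can help (Python sketch):

```python
# Adding tiny terms one at a time: each one is lost against the large partial sum
s1 = 1.0
for _ in range(1000):
    s1 += 1e-16        # 1.0 + 1e-16 rounds back to 1.0 every time

# Adding the tiny terms among themselves first preserves their contribution
s2 = sum([1e-16] * 1000) + 1.0

assert s1 == 1.0       # the thousand terms vanished entirely
assert s2 > 1.0        # summed small-to-large, they survive
```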
SLIDE 51

Multiplication error example

  • Closest representable number to 0.1₁₀ is
    0.100000001490116119384765625
  • Squaring gives (exactly)
    0.010000000298023226097399174250313080847263336181640625
  • Squaring in single precision gives (exactly)
    0.010000000707805156707763671875
  • Closest representable number to 0.01₁₀ is
    0.009999999776482582092285156250
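These values can be reproduced without special tooling by rounding through single precision with `struct` (Python sketch; `to_f32` is a hypothetical helper name):

```python
import struct
from decimal import Decimal

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack('>f', struct.pack('>f', x))[0]

x = to_f32(0.1)   # closest single-precision value to 0.1
assert Decimal(x) == Decimal('0.100000001490116119384765625')

# x*x is exact in binary64 (a 24-bit significand squared fits in 53 bits);
# rounding that product to binary32 gives the single-precision square
assert Decimal(to_f32(x * x)) == Decimal('0.010000000707805156707763671875')
```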
SLIDE 52

Integers in floating-point

  • Find largest range of integers that can be represented exactly in single precision

  • 23 mantissa bits
  • Add implicit normalization bit
  • Add sign bit
  • From −2²⁴ + 1 = −16,777,215 to 2²⁴ − 1
  • After this range, we could have x+1 = x+2!
  • 32-bit integer has much larger range!
  • from −2³¹ = −2,147,483,648 to 2³¹ − 1
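A quick check by rounding through single precision (Python sketch; `to_f32` is a hypothetical helper name):

```python
import struct

def to_f32(x):
    """Round a value to the nearest binary32 float."""
    return struct.unpack('>f', struct.pack('>f', float(x)))[0]

# Every integer up to 2^24 is exactly representable...
assert to_f32(2**24 - 1) == 2**24 - 1
assert to_f32(2**24) == 2**24
# ...but 2^24 + 1 is not: it rounds back to its neighbor
assert to_f32(2**24 + 1) == 2**24
```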
SLIDE 53

Other weirdness

  • Associative property does not hold!
  • (a + b) + c ≠ a + (b + c)
  • Beware compiler optimizations
  • Equality operator not meaningful!
  • Returns true only if exactly equal
  • Must use special function
  • Absolute error vs. relative error
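Both pitfalls in a few lines (Python):

```python
import math

# Associativity fails under rounding
a, b, c = 0.1, 0.2, 0.3
assert (a + b) + c != a + (b + c)

# Exact equality is too strict; compare with a tolerance instead
assert 0.1 + 0.2 != 0.3
assert math.isclose(0.1 + 0.2, 0.3)
```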
SLIDE 54

MAKING GAUSSIAN ELIMINATION WORK IN A COMPUTER

SLIDE 55

Errors during elimination

  • 3-digit elimination

$$\begin{pmatrix} 47 & 28 & 19\\ 89 & 53 & 36 \end{pmatrix}
\to
\begin{pmatrix} 47 & 28 & 19\\ \mathrm{fl}(89 - 88.8) & \mathrm{fl}(53 - 52.9) & \mathrm{fl}(36 - 35.9) \end{pmatrix}
=
\begin{pmatrix} 47 & 28 & 19\\ .2 & .1 & .1 \end{pmatrix}
\to
\begin{pmatrix} 47 & 28 & 19\\ & .1 & .1 \end{pmatrix}$$

where $\mathrm{fl}(89/47) = 1.89$, $\mathrm{fl}(1.89 \times 47) = 88.8$, $\mathrm{fl}(1.89 \times 28) = 52.9$, $\mathrm{fl}(1.89 \times 19) = 35.9$.

SLIDE 56

Errors during substitution

  • 3-digit back-substitution

$$\begin{pmatrix} 47 & 28 & 19\\ & .1 & .1 \end{pmatrix}
\;\Rightarrow\;
y = \mathrm{fl}(.1/.1) = 1, \qquad x = \mathrm{fl}\!\left(\frac{19 - 28}{47}\right) = \mathrm{fl}\!\left(\frac{-9}{47}\right) = -.191$$

  • Exact solution

$$\begin{pmatrix} 47 & 28 & 19\\ & -1/47 & 1/47 \end{pmatrix}
\;\Rightarrow\;
y = -1, \qquad x = 1$$

SLIDE 57

Common sources of error

  • Some error is unavoidable in general
  • How to protect against unnecessary errors?
  • Try to avoid rounding after addition
  • Most extreme when exponents are very different
  • Two strategies against that
  • Partial pivoting
  • Scaling
SLIDE 58

Motivation for partial pivoting

  • Exact arithmetic:

$$\begin{pmatrix} -10^{-4} & 1 & 1\\ 1 & 1 & 2 \end{pmatrix}
\to
\begin{pmatrix} -10^{-4} & 1 & 1\\ & 10001 & 10002 \end{pmatrix}
\;\Rightarrow\;
x = \frac{1}{1.0001}, \qquad y = \frac{1.0002}{1.0001}$$

  • 3-digit arithmetic: $\mathrm{fl}(1 + 10^4) = \mathrm{fl}(.10001 \times 10^5) = .100 \times 10^5 = 10^4$ and $\mathrm{fl}(2 + 10^4) = 10^4$, so

$$\begin{pmatrix} -10^{-4} & 1 & 1\\ & 10^4 & 10^4 \end{pmatrix}
\;\Rightarrow\;
y = 1, \qquad x = 0$$

SLIDE 59

Partial pivoting

  • Search column for coefficient of largest magnitude
  • Exchange rows to bring it to the pivotal position

$$\begin{pmatrix} -10^{-4} & 1 & 1\\ 1 & 1 & 2 \end{pmatrix}
\to
\begin{pmatrix} 1 & 1 & 2\\ -10^{-4} & 1 & 1 \end{pmatrix}
\to
\begin{pmatrix} 1 & 1 & 2\\ & 1 & 1 \end{pmatrix}$$

using $\mathrm{fl}(1 + 10^{-4}) = 1$ and $\mathrm{fl}(1 + 2 \times 10^{-4}) = 1$, giving

$$y = 1, \qquad x = 1$$
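The rescue can be simulated by rounding every intermediate result to 3 significant decimal digits (Python sketch; `fl` is a hypothetical helper mimicking the slides' 3-digit arithmetic):

```python
import math

def fl(x, digits=3):
    """Round x to the given number of significant decimal digits."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - e)

def solve2(A, b, pivot):
    """2x2 elimination in 3-digit arithmetic, optionally with row exchange."""
    if pivot and abs(A[1][0]) > abs(A[0][0]):
        A = [A[1], A[0]]; b = [b[1], b[0]]   # bring largest pivot to the top
    m = fl(A[1][0] / A[0][0])
    a22 = fl(A[1][1] - fl(m * A[0][1]))
    b2 = fl(b[1] - fl(m * b[0]))
    y = fl(b2 / a22)
    x = fl(fl(b[0] - fl(A[0][1] * y)) / A[0][0])
    return x, y

A, b = [[-1e-4, 1.0], [1.0, 1.0]], [1.0, 2.0]
print(solve2(A, b, pivot=False))  # (0.0, 1.0)  -- badly wrong
print(solve2(A, b, pivot=True))   # (1.0, 1.0)  -- correct to 3 digits
```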

SLIDE 60

Motivation for row scaling

  • 3-digit arithmetic
  • Choose proper units for problem
  • Scale rows so max magnitude is 1
  • Column scaling is typically unnecessary

$$\begin{pmatrix} -10 & 10^5 & 10^5\\ 1 & 1 & 2 \end{pmatrix}
\to
\begin{pmatrix} -10 & 10^5 & 10^5\\ & 10^4 & 10^4 \end{pmatrix}
\;\Rightarrow\;
y = 1, \qquad x = 0$$

using $\mathrm{fl}(1 + 10^4) = 10^4$ and $\mathrm{fl}(2 + 10^4) = 10^4$. Note that partial pivoting does not trigger a row exchange here, since $|{-10}|$ is already the largest entry in the first column.

SLIDE 61

Complete pivoting

  • At least as good as partial pivoting
  • However, not used very frequently. Why?
  • What is the computational cost?
  • What the number of memory accesses?

      ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ s ∗ ∗ ∗ s ∗ ∗ ∗ s ∗ ∗ ∗             ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ s s s ∗ s s s ∗ s s s ∗      

SLIDE 62

ILL-CONDITIONED SYSTEMS

SLIDE 63

Not all systems are born equal

  • Gaussian elimination is great
  • with partial pivoting
  • on a properly scaled system
  • But it is not magic!
  • Some systems are extraordinarily difficult
  • No numerical technique can deal with them
SLIDE 64

Ill-conditioned systems

  • Original system
  • Perturbed

  • Original system:

$$.835x + .667y = .168, \qquad .333x + .266y = .067 \;\Rightarrow\; x = 1, \quad y = -1$$

  • Perturbed (second right-hand side changed by .001):

$$.835\hat{x} + .667\hat{y} = .168, \qquad .333\hat{x} + .266\hat{y} = .067 - .001 \;\Rightarrow\; \hat{x} = -666, \quad \hat{y} = 834$$

  • A small perturbation in the system can cause a relatively large change in the solution!
    – What if there are measurement errors?
    – What about floating-point errors?
  • “Checking the answer” doesn’t help!
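Exact rational arithmetic confirms that the blow-up is in the system itself, not in any rounding (Python sketch using `fractions`):

```python
from fractions import Fraction as F

def cramer(a11, a12, b1, a21, a22, b2):
    """Solve a 2x2 system exactly by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det

a11, a12, a21, a22 = F('.835'), F('.667'), F('.333'), F('.266')
# Original system: nice solution (1, -1)
assert cramer(a11, a12, F('.168'), a21, a22, F('.067')) == (1, -1)
# Perturb the second right-hand side by only .001: solution jumps wildly
assert cramer(a11, a12, F('.168'), a21, a22, F('.067') - F('.001')) == (-666, 834)
```

The determinant here is $-10^{-6}$, so tiny changes in the data are amplified a million-fold.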
SLIDE 65

What is the impact on engineering?

  • Both matrix and right-hand side are usually results of measurements
  • Even if the exact solution can be obtained, does it really mean anything?
  • The problem is with the system, not with the solution to the system
  • You are solving the wrong problem!
  • Find a different experiment
  • Hope for better conditioning
SLIDE 66

The geometric interpretation

  • Can we detect this numerically?
  • Yes, indeed
  • Need more linear algebra

[Figure: two lines crossing at a wide angle (well-conditioned) vs. two nearly parallel lines (ill-conditioned).]