Reelle tal, f.eks. 1/7 eller double float 32 bit - - PowerPoint PPT Presentation

reelle tal f eks 1 7 eller double float
SMART_READER_LITE
LIVE PREVIEW

Reelle tal, f.eks. 1/7 eller double float 32 bit - - PowerPoint PPT Presentation

Reelle tal, f.eks. 1/7 eller double float 32 bit 64 bit By default, the x87 processors all use 80-bit double-extended precision internally wikipedia.org/wiki/X87 Algorithms using real numbers


slide-1
SLIDE 1

Reelle tal, f.eks. 1/7 ≈ float eller double

32 bit 64 bit ”By default, the x87 processors all use 80-bit double-extended precision internally“ wikipedia.org/wiki/X87

slide-2
SLIDE 2

Algorithms using “real” numbers

Noter ch.4

slide-3
SLIDE 3

double x = 0.0; if ( x == 0.0) << do something smart >> else << do something stupid >> assert ( x == 0.0 ); // C++ f( ) f( )

P.g.a. underlige compiler optimering og repræsentationer af reelle tal kan der faktisk ske ”something stupid”

slide-4
SLIDE 4

if ( x == 0.0) << do something smart >> else << do something stupid >> assert ( x == 0.0 ); // C++ f( ) f( ) double x = 1.0;

P.g.a. underlige compiler optimering og repræsentationer af reelle tal kan der faktisk ske ”something stupid”

slide-5
SLIDE 5

Algorithms using ”real” numbers

  • Type double contains only finitely many values

double d = 10; for (int i=0; i<10; i++) { System.out.println(d); d = d*d; }

slide-6
SLIDE 6

Algorithms using ”real” numbers

  • Type double contains only finitely many values

double d = .10; for (int i=0; i<10; i++) { System.out.println(d); d = d*d; }

slide-7
SLIDE 7

Algorithms using ”real” numbers

  • Type double contains only finitely many values

System.out.println( (1.0 / 49) * 49 ); }

slide-8
SLIDE 8

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-9
SLIDE 9

Limited Precision

  • A computer can handle only a finite subset of the reals directly in

hardware

  • These numbers make up a floating point number system F(β,s,m,M)

characterized by

– a base ∈ \1 – a number of digits ∈ – a smallest exponent ∈ – largest exponent ∈

  • Each floating point number has the form

.

  • with and 0 1.
  • The mantissa . … represents

・ ・ ・

  • The exponent part is
  • The system includes 0 0.0 . . . 0
  • All other numbers are normalised: 1 1
slide-10
SLIDE 10

Limited Precision

  • Example: the number system F(2, 3,−2, 1) contains 33 numbers:

In decimal: 0.25 = (1+0/2+0/4) / 4 3.5 = (1+1/2+1/4) * 2

3.5 0.25

slide-11
SLIDE 11

mantissa 110: binary number 1.10 Decimal number 1 + 1/2 + 0/4 = 1.5 mantissa 110 and exponent 1: binary number 11.0 Decimal number 1*2 + 1 + 0/2 = 3

7% 14% 62% 10% 8%

QUIZ

Limited precision a) X b) X c) X d) X e) X A number x in the number system F(2, 3,−2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal base 10 system? 1. 1.5 2. 3 3. 12 4. 11 5. I don’t know

slide-12
SLIDE 12

QUIZ

A number x in the number system F(2, 3,−2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal base 10 system? 1. 1.5 2. 3 3. 12 4. 11 5. I don’t know Limited precision

slide-13
SLIDE 13

QUIZ

A number x in the number system F(2, 3,−2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal base 10 system? 1. 1.5 2. 3 3. 12 4. 11 5. I don’t know Limited precision mantissa 110: binary number 1.10 Decimal number 1 + 1/2 + 0/4 = 1.5 mantissa 110 and exponent 1: binary number 11.0 Decimal number 1*2 + 1 + 0/2 = 3

slide-14
SLIDE 14

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-15
SLIDE 15

Rounding/truncation

  • Standard demo system is F(10, 4,−99, 99)
  • 1.573 og 0.1824 are representable, but the following are

not 1.573 0.1824 1.7554 1.573 0.1824 1.3906 1.573 0.1824 0.2869152 1.573 / 0.1824 8.6239035 . . .

  • Exact arithmetic on a finite subset of reals is not possible
  • Strategy for rounding/truncating to a representable

number is needed

slide-16
SLIDE 16

Rounding/truncation

  • rounding/truncation defined by function fl : R → M, where M

is machine numbers

  • machine arithmetic operations ⊕,⊖,⊗,⊘ defined by

⊕ etc.

  • In demo system F(10, 4,−99, 99) define by truncation,

e.g.

1.573 ⊕ 0.1824 1.573 0.1824 1.7554 1.755 1.573 ⊖ 0.1824 1.573 0.1824 1.3906 1.390 1.573 ⊗ 0.1824 1.573 0.1824 0.2869152 .2869 1.573 ⊘ 0.1824 1.573 / 0.1824 8.6239035 . . . 8.623

slide-17
SLIDE 17

Rounding/truncation

  • Algebraic laws invalid:

1.418 ⊕ 2937 ⊖ 2936 2938 ⊖ 2936 2.000 1.418 ⊕ 2937 ⊖ 2936 1.418 ⊕ 1.000 2.418 1.418 ⊗ 2001 ⊖ 2000 1.418 ⊗ 1.000 1.418 1.418 ⊗ 2001 ⊖ 1.418 ⊗ 2000 2837 ⊖ 2836 1.000

  • The usual associative and distributive laws are not valid

for machine arithmetic. Sometimes: ⊕ ⊖ ⊕ ⊖ ⊗ ⊖ ⊗ ⊖ ⊗

slide-18
SLIDE 18

QUIZ

What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4,−99, 99) using truncation? 1. 0.9995 2. 0.9996 3. 0.9997 4. 0.9998 5. I don’t know Rounding / truncation

slide-19
SLIDE 19

QUIZ

What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4,−99, 99) using truncation? 1. 0.9995 2. 0.9996 3. 0.9997 4. 0.9998 5. I don’t know Rounding / truncation (0.9996 ⊕ 0.9998) ⊘ 2 = fl(0.9996 + 0.9998) ⊘ 2 = fl(1.9994) ⊘ 2 = 1.999 ⊘ 2 = fl(1.999 / 2) = fl(0.9995) = 0.9995 Better expression for computing average: 0.9996 ⊕ (0.9998 ⊖ 0.9996) ⊘ 2 = 0.9997

slide-20
SLIDE 20

3% 8% 8% 5% 77%

a) X b) X c) X d) X e) X

QUIZ

What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4,−99, 99) using truncation? 1. 0.9995 2. 0.9996 3. 0.9997 4. 0.9998 5. I don’t know Rounding / truncation (0.9996 ⊕ 0.9998) ⊘ 2 = fl(0.9996 + 0.9998) ⊘ 2 = fl(1.9994) ⊘ 2 = 1.999 ⊘ 2 = fl(1.999 / 2) = fl(0.9995) = 0.9995 Better expression for computing average: 0.9996 ⊕ (0.9998 ⊖ 0.9996) ⊘ 2 = 0.9997

slide-21
SLIDE 21

QUIZ

It is a fact that lim

→1

  • .

The algorithm computes an approximation to corresponding to 2 1024. What is the result of execution in the number system F(10,4,−99,99) with truncation, where fl() = 2.718?

  • 1. 0
  • 2. 1.000
  • 3. 2.591
  • 4. 2.718
  • 5. I don’t know

Rounding / truncation int k = 10; m = 1; for (int i=0; i<k; i++) m = 2*m; s = 1 + 1/m; for (int i=0; i<k; i++) s = s*s; print s; m,s ∊ F(10,4,−99,99)

slide-22
SLIDE 22

QUIZ

It is a fact that lim

→1

  • .

The algorithm computes an approximation to corresponding to 2 1024. What is the result of execution in the number system F(10,4,−99,99) with truncation, where fl() = 2.718?

  • 1. 0
  • 2. 1.000
  • 3. 2.591
  • 4. 2.718
  • 5. I don’t know

Rounding / truncation 2 2 2 4 … 2 512 2 1024 1 1 1⊕1⊘1024 1⊕0.0009765 . ∗ . … ∗ . int k = 10; m = 1; for (int i=0; i<k; i++) m = 2*m; s = 1 + 1/m; for (int i=0; i<k; i++) s = s*s; print s; m,s ∊ F(10,4,−99,99)

slide-23
SLIDE 23

2% 2% 86% 3% 7%

a) X b) X c) X d) X e) X

QUIZ

It is a fact that lim

→1

  • .

The algorithm computes an approximation to corresponding to 2 1024. What is the result of execution in the number system F(10,4,−99,99) with truncation, where fl() = 2.718?

  • 1. 0
  • 2. 1.000
  • 3. 2.591
  • 4. 2.718
  • 5. I don’t know

Rounding / truncation 2 2 2 4 … 2 512 2 1024 1 1 1⊕1⊘1024 1⊕0.0009765 . ∗ . … ∗ . int k = 10; m = 1; for (int i=0; i<k; i++) m = 2*m; s = 1 + 1/m; for (int i=0; i<k; i++) s = s*s; print s; m,s ∊ F(10,4,−99,99)

slide-24
SLIDE 24

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-25
SLIDE 25

IEEE standard and Java

  • IEEE standard describes two number systems,

approximately

– Single precision F(2, 24,−126, 127) – Double precision F(2, 53,−1022, 1023)

  • The IEEE systems has more numbers:

– closes the representational gap around zero with xtra numbers of the form 0. . . . ∗ 2 – has representations for ∞ and NaN (= Not a Number)

  • The IEEE system has detailed rules for rounding to

representable numbers, ex.

– overflow: 1/0 or 2∗ 2 has result ∞ – underflow: 1/∞ or 2/2 has result 0 – 0/0 or ∞ ∞ has result NaN

slide-26
SLIDE 26

IEEE standard and Java

  • Java implementations use IEEE standard for the primitive

types float and double on most computers

  • Use modifier strictfp to enforce IEEE standard

independent of the underlying hardware.

  • Representable values in type double are

– . . . . ∗ 2, where ∈ 0, 1 and 1022 1023 – ∞ and NaN (Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY , Double.NaN)

  • Warning: legal java double literals may not be directly

representable: literal 0.2 is represented as

– 0.2 0.00110011 . . .

1.1001100110011001100110011001100110011001100110011010 ・2

slide-27
SLIDE 27

QUIZ

What is written?

  • 1. 2E1000
  • 2. 1.0715086071862673E301
  • 3. Double.POSITIVE_INFINITY
  • 4. Double.NaN
  • 5. I don’t know

Type double public strictfp void test() { double a = 1; for (int i = 0; i< 1000; i++) a *= 2; System.out.println(a); } double is approximately F(2,53,−1022,1023)

slide-28
SLIDE 28

QUIZ

What is written?

  • 1. 2E1000
  • 2. 1.0715086071862673E301
  • 3. Double.POSITIVE_INFINITY
  • 4. Double.NaN
  • 5. I don’t know

Type double 2E1000 denotes 2 ∗ 10 2 1.0715086071862673E301 denotes 1.0715086071862673 ∗ 10 2 Double.POSITIVE_INFINITY denotes value larger than approximately 2 public strictfp void test() { double a = 1; for (int i = 0; i< 1000; i++) a *= 2; System.out.println(a); } double is approximately F(2,53,−1022,1023)

slide-29
SLIDE 29

3% 28% 26% 2% 42%

a) X b) X c) X d) X e) X

QUIZ

What is written?

  • 1. 2E1000
  • 2. 1.0715086071862673E301
  • 3. Double.POSITIVE_INFINITY
  • 4. Double.NaN
  • 5. I don’t know

Type double 2E1000 denotes 2 ∗ 10 2 1.0715086071862673E301 denotes 1.0715086071862673 ∗ 10 2 Double.POSITIVE_INFINITY denotes value larger than approximately 2 public strictfp void test() { double a = 1; for (int i = 0; i< 1000; i++) a *= 2; System.out.println(a); } double is approximately F(2,53,−1022,1023)

slide-30
SLIDE 30

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-31
SLIDE 31

n‐th Harmonic number Hn = 1/1 + 1/2 + 1/3 +∙∙∙+ 1/n = 1/i

n i = 1

Σ

1/n n 1 2 3 4 5 1/1 1/2 1/3 1/4 1/n

1/x dx = [ ln x ] = ln n – ln 1 = ln n

n 1 n 1

Hn – 1   Hn – 1/n ln n + 1/n  Hn  ln n + 1

slide-32
SLIDE 32

Summation

  • The nth harmonic number is defined by

1 1 1 2 1 3 ・ ・ ・ 1

  • Associative law is not valid
  • Which order of summation is best?
  • experiment in toy system R(10, 4,−99, 99) with truncation
slide-33
SLIDE 33

Summation

  • try forward summation:

1 1⊕ 1 2 ⊕ 1 3 ⊕・ ・ ・⊕ 1

  • The computed sum is constant

for n ≥ 1000 since 7.069⊕ 1 1001 7.069⊕0.0009990 7.069

slide-34
SLIDE 34

Summation

  • try backward summation:

1 1⊕ 1 2 ⊕ 1 3 ⊕・ ・ ・⊕ 1

  • Error is slightly reduced
slide-35
SLIDE 35

Summation

  • add terms pairwise in a tree structure:

1 1⊕ 1 2 ⊕ 1 3 ⊕ 1 4 ⊕・ ・ ・⊕ 1

  • Add numbers of (almost) equal size
slide-36
SLIDE 36

Summation

  • add terms pairwise in a tree structure:

1 1⊕ 1 2 ⊕ 1 3 ⊕ 1 4 ⊕・ ・ ・⊕ 1

  • By adding equal sized terms the error is reduced substantially!
slide-37
SLIDE 37

Summation

  • similar experiment for java double, F(2, 53,−1022, 1023):
slide-38
SLIDE 38

Summation

  • Problem

– compute ・ ・ ・

  • Warning:

– ⊕ not associative

  • Advice:

– Add terms in increasing order – Try to add terms of equal size only

slide-39
SLIDE 39

QUIZ

We want to compute

  • 1
  • in F(10, 4,−99, 99). Which order of summation do we expect to provide the best

approximation?

  • 1. All orders of summation lead to the same result
  • 2. We add numbers in decreasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 3. We add numbers in increasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 4. We use an algorithm that repeatedly replaces the two smallest numbers with

their sum

  • 5. I don’t know

Summation

slide-40
SLIDE 40

QUIZ

We want to compute

  • 1
  • in F(10, 4,−99, 99). Which order of summation do we expect to provide the best

approximation?

  • 1. All orders of summation lead to the same result
  • 2. We add numbers in decreasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 3. We add numbers in increasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 4. We use an algorithm that repeatedly replaces the two smallest numbers with

their sum

  • 5. I don’t know

Summation

slide-41
SLIDE 41

16% 5% 4% 73% 2%

a) X b) X c) X d) X e) X

QUIZ

We want to compute

  • 1
  • in F(10, 4,−99, 99). Which order of summation do we expect to provide the best

approximation?

  • 1. All orders of summation lead to the same result
  • 2. We add numbers in decreasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 3. We add numbers in increasing order ∑
  • 1⊕
  • ⊕・ ・ ・⊕
  • 4. We use an algorithm that repeatedly replaces the two smallest numbers with

their sum

  • 5. I don’t know

Summation

slide-42
SLIDE 42

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-43
SLIDE 43

Newton iteration

  • Find root of polynomial f(x)
  • Compute z lim

for ′

slide-44
SLIDE 44

Newton iteration

  • Example:

is root of

  • Newton iteration:
  • Using
  • 1
  • Compute approximation to z lim

slide-45
SLIDE 45

Newton iteration

  • Example:

is root of

  • Newton iteration:
  • ,
  • 1
  • When should iteration stop?
  • When is approximation good enough?

public static double sqrt(double a) { double xnew = (a + 1)/2; double xold = xnew + 1; while (???) { xold = xnew; xnew = xold − (xold*xold − a)/2/xold; } return xold; }

slide-46
SLIDE 46

QUIZ

What is proper stopping criteria for square root computation by Newton iteration?

  • 1. xold == xnew
  • 2. |xold - xnew| (for some small )
  • 3. |xold - xnew| ∗ xold (for some small )
  • 4. k == 100
  • 5. I don’t know

Stopping criteria public static double sqrt(double a) { double xnew = (a + 1)/2; double xold = xnew + 1; while (???) { xold = xnew; xnew = xold − (xold*xold − a)/2/xold; } return xold; } (since for some (Since we are satisfied with a small absolute error | | ) (Since we are satisfied with a small relative error | | ) (since we cannot really trust any of the other criteria)

slide-47
SLIDE 47

QUIZ

What is proper stopping criteria for square root computation by Newton iteration?

  • 1. xold == xnew
  • 2. |xold - xnew| (for some small )
  • 3. |xold - xnew| ∗ xold (for some small )
  • 4. k == 100
  • 5. I don’t know

Stopping criteria public static double sqrt(double a) { double xnew = (a + 1)/2; double xold = xnew + 1; while (???) { xold = xnew; xnew = xold − (xold*xold − a)/2/xold; } return xold; } (since for some (Since we are satisfied with a small absolute error | | ) (Since we are satisfied with a small relative error | | ) (since we cannot really trust any of the other criteria)

slide-48
SLIDE 48

3% 33% 26% 24% 14%

a) X b) X c) X d) X e) X

QUIZ

What is proper stopping criteria for square root computation by Newton iteration?

  • 1. xold == xnew
  • 2. |xold - xnew| (for some small )
  • 3. |xold - xnew| ∗ xold (for some small )
  • 4. k == 100
  • 5. I don’t know

Stopping criteria public static double sqrt(double a) { double xnew = (a + 1)/2; double xold = xnew + 1; while (???) { xold = xnew; xnew = xold − (xold*xold − a)/2/xold; } return xold; } (since for some (Since we are satisfied with a small absolute error | | ) (Since we are satisfied with a small relative error | | ) (since we cannot really trust any of the other criteria)

slide-49
SLIDE 49

Stopping criteria

  • Try stopping criteria
  • Never ask whether 2 double values are identical!

voksende !

slide-50
SLIDE 50

Stopping criteria

  • In , , , the optimal precision is
  • Try stopping criteria

– Same as ¬ 0 while (xold – xnew > 0)

slide-51
SLIDE 51

Algorithms using “real” numbers

  • Representable numbers
  • Rounding/truncation
  • IEEE standard and Java
  • Summation order
  • Newton iteration
  • More iteration
  • Formula rewriting
slide-52
SLIDE 52

More iteration

  • Find root of equation

3 8 10 0

  • Rewrite equation

1 8 10 3

  • Find x by iterating towards fixed point

1

8 10 3

Using x 0 Gives convergence: x x ⋯ ⋯ x x

slide-53
SLIDE 53

More iteration

  • Try stopping criteria

| |

  • Infinite loop!
  • In 10,4, 99,99 we

never get closer than | | 3 ∗

  • Select stopping criteria

with care!

slide-54
SLIDE 54

Stopping criteria

  • Problem

– Compute lim

→ for

  • Warning:

– Using limited precision arithmetic, the series may fail to converge – Naive stopping criteria may lead to looping forever!

  • Advice:

– stop iteration when adjacent terms are sufficiently close – the meaning of “sufficiently close” depends on the concrete series – for monotone convergent series it may be ok to loop until strict monotonicity fails