Adapted from Computer Organization and Design, Patterson & Hennessy, UCB
ECE232: Hardware Organization and Design
Lecture 9: Floating Point
Floating point: a representation for non-integral numbers, including very small and very large numbers.
ECE232: Floating Point 2
normalized
1111 1111 1111 1111 1111 1111 1111 1111 = 4,294,967,295
4,600,000,000 or $4.6 \times 10^{9}$
0.0000000000000000000000000166 or $1.6 \times 10^{-27}$
bit integer.
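A minimal sketch of the limit above, assuming an unsigned 32-bit integer as the fixed-width type; the helper name `fits_in_uint32` is hypothetical and only serves the illustration.

```python
UINT32_MAX = 2**32 - 1  # all 32 bits set to 1


def fits_in_uint32(value):
    """Return True if value is a whole number in the range 0 .. 2**32 - 1."""
    return value == int(value) and 0 <= value <= UINT32_MAX


print(UINT32_MAX)                     # 4294967295
print(fits_in_uint32(4_600_000_000))  # False: larger than the maximum
print(fits_in_uint32(1.6e-27))        # False: not a whole number
```

Neither example value fits, which is the motivation for a separate floating-point format.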
The representations differ in that the decimal point "floats" to the left or right, with the appropriate adjustment in the exponent.
Parts of a floating-point number: sign of mantissa, location of decimal point, mantissa, exponent, sign of exponent, base.
The mantissa is also called the significand.
32 bits in total:
S: Sign of mantissa (1 bit)
E: Exponent (8 bits)
M: Mantissa (23 bits)
A normalized significand always has a leading 1 bit, so there is no need to represent it explicitly (hidden bit).
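Fields: S | Exponent | Fraction
Exponent: 8 bits (single), 11 bits (double)
Fraction: 23 bits (single), 52 bits (double)
$x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$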
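As an illustration of the field layout, the sketch below splits a value stored in single precision into its S, Exponent and Fraction fields. It relies on Python's standard `struct` module; the function name `float_to_fields` is an assumption made for this example.

```python
import struct


def float_to_fields(x):
    """Return (S, Exponent, Fraction) for x stored as a 32-bit single."""
    # Pack as a big-endian single, then reread the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, still biased
    fraction = bits & 0x7FFFFF       # 23 bits, hidden 1 not included
    return sign, exponent, fraction


print(float_to_fields(1.0))   # (0, 127, 0)
print(float_to_fields(-5.0))  # (1, 129, 2097152)
```

The Exponent field is returned in biased form, so 127 corresponds to an actual exponent of 0.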
actual exponent = 1 – 127 = –126
actual exponent = 254 – 127 = +127
11000000101000…00
$= (-1) \times 1.25 \times 2^{2} = -5.0$
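The decoding above can be sketched in a few lines, assuming a single-precision bias of 127 and a normalized number (Exponent field between 1 and 254). The name `decode_single` is hypothetical.

```python
def decode_single(bit_string):
    """Decode a 32-bit pattern (spaces allowed) as a normalized single."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0], 2)
    exponent = int(bits[1:9], 2)          # biased exponent field
    fraction = int(bits[9:], 2) / 2**23   # value of the 23 fraction bits
    # (-1)^S x (1 + Fraction) x 2^(Exponent - 127); normalized numbers only
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)


# Sign 1, Exponent 10000001 (129), Fraction 0100...0
print(decode_single("1 10000001 " + "01" + "0" * 21))  # -5.0
```

Zero, denormalized numbers, infinity and NaN use reserved Exponent values and are deliberately not handled here.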
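Steps for adding two floating-point numbers:
1. Compute the exponent difference d = |E1 - E2|.
2. Shift the significand of the smaller operand d positions to the right.
3. Add the aligned significands and set the result exponent to the exponent of the larger operand.
4. Normalize the result and adjust the exponent if necessary.
5. Round the result and adjust the exponent if necessary.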
Source: I. Koren, Computer Arithmetic Algorithms, 2nd Edition, 2002
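The alignment and normalization steps above can be sketched as follows. This is a simplified illustration, not the algorithm as given in the cited source: it assumes two positive normalized operands, represents each as an unbiased exponent plus a 24-bit integer significand (hidden bit included), and truncates instead of rounding. The names `fp_add` and `to_float` are hypothetical.

```python
SIG_BITS = 24  # 23 fraction bits plus the hidden 1


def fp_add(e1, m1, e2, m2):
    """Add two positive operands, each worth m * 2**(e - 23)."""
    # Exponent difference
    d = abs(e1 - e2)
    # Align: shift the significand of the smaller operand d places right
    # (bits shifted out are dropped, i.e. truncation, an assumption here).
    if e1 >= e2:
        e, big, small = e1, m1, m2 >> d
    else:
        e, big, small = e2, m2, m1 >> d
    # Add significands; result exponent starts as the larger exponent
    total = big + small
    # Normalize if the sum overflowed into a 25th bit
    if total >> SIG_BITS:
        total >>= 1
        e += 1
    return e, total


def to_float(e, m):
    """Convert the (exponent, significand) pair back to a Python float."""
    return m * 2.0 ** (e - (SIG_BITS - 1))


# 14.5 = 1.1101_2 x 2^3 and 5.0 = 1.01_2 x 2^2
e, m = fp_add(3, 0b11101 << 19, 2, 0b101 << 21)
print(e, to_float(e, m))  # 4 19.5
```

Subtraction, signs and proper rounding are left out to keep the alignment and normalization steps visible.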
0 10000010 11010000000000000000000
Sign: 0 = positive mantissa
Exponent: 130 - 127 = 3
Significand: $1.1101_{2}$
Value: $1.1101_{2} \times 2^{3} = 1110.1_{2} = 14.5_{10}$
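$456.78_{10} = 4 \times 10^{2} + 5 \times 10^{1} + 6 \times 10^{0} + 7 \times 10^{-1} + 8 \times 10^{-2}$
$1011.11_{2} = 1 \times 2^{3} + 0 \times 2^{2} + 1 \times 2^{1} + 1 \times 2^{0} + 1 \times 2^{-1} + 1 \times 2^{-2}$
$= 8 + 0 + 2 + 1 + 1/2 + 1/4 = 11 + 0.5 + 0.25 = 11.75_{10}$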
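The positional expansion above works the same way in any base. A small sketch, with the hypothetical helper `positional_value` accepting digit strings in bases 2 through 10:

```python
def positional_value(digits, base):
    """Evaluate a digit string such as '1011.11' in the given base."""
    whole, _, frac = digits.partition(".")
    value = 0.0
    # Digits left of the point carry weights base^0, base^1, ...
    for i, ch in enumerate(reversed(whole)):
        value += int(ch) * base**i
    # Digits right of the point carry weights base^-1, base^-2, ...
    for i, ch in enumerate(frac, start=1):
        value += int(ch) * base**-i
    return value


print(positional_value("1011.11", 2))  # 11.75
```

The binary example is exact because every weight is a power of two; decimal fractions such as 456.78 may come back with a small rounding error, since the result is itself a binary floating-point number.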
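Converting 57143 to binary by successive division by 2. Each value is half of the value below it (rounded down), and the bit is that value's remainder when divided by 2.

| Value | Bit |
|---|---|
| 1 | 1 |
| 3 | 1 |
| 6 | 0 |
| 13 | 1 |
| 27 | 1 |
| 55 | 1 |
| 111 | 1 |
| 223 | 1 |
| 446 | 0 |
| 892 | 0 |
| 1785 | 1 |
| 3571 | 1 |
| 7142 | 0 |
| 14285 | 1 |
| 28571 | 1 |
| 57143 | 1 |

Reading the bits from top to bottom: $57143_{10} = 1101\,1111\,0011\,0111_{2}$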
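The halving sequence above corresponds to the usual repeated-division method for integers. A minimal sketch, with `to_binary` as a hypothetical helper name:

```python
def to_binary(n):
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # the remainder is the next bit, LSB first
        bits.append(str(remainder))
    return "".join(reversed(bits))   # reverse so the MSB comes first


print(to_binary(57143))  # 1101111100110111
```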
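Successive multiplication by 2, starting from 0.154. A product of 1 or more yields a 1 bit, and only the fractional part is carried to the next step.

| Step | Product | Bit |
|---|---|---|
| 1 | 0.308 | 0 |
| 2 | 0.616 | 0 |
| 3 | 1.232 | 1 |
| 4 | 0.464 | 0 |
| 5 | 0.928 | 0 |
| 6 | 1.856 | 1 |
| 7 | 1.712 | 1 |
| 8 | 1.424 | 1 |
| 9 | 0.848 | 0 |
| 10 | 1.696 | 1 |
| 11 | 1.392 | 1 |
| 12 | 0.784 | 0 |
| 13 | 1.568 | 1 |
| 14 | 1.136 | 1 |
| 15 | 0.272 | 0 |
| 16 | 0.544 | 0 |
| 17 | 1.088 | 1 |
| 18 | 0.176 | 0 |
| 19 | 0.352 | 0 |
| 20 | 0.704 | 0 |
| 21 | 1.408 | 1 |
| 22 | 0.816 | 0 |
| 23 | 1.632 | 1 |

Decimal 0.154 = .0010 0111 0110 1100 1000 101 in binary (first 23 bits)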
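The successive-multiplication procedure can be sketched as below. The example uses `fractions.Fraction` so that 0.154 is held exactly, which is a choice made for this illustration; the name `fraction_bits` is hypothetical.

```python
from fractions import Fraction


def fraction_bits(x, n_bits):
    """Return the first n_bits binary digits of a fraction 0 <= x < 1."""
    bits = []
    for _ in range(n_bits):
        x *= 2
        if x >= 1:
            bits.append("1")  # integer part is 1: emit a 1 bit
            x -= 1            # keep only the fractional part
        else:
            bits.append("0")
    return "".join(bits)


print(fraction_bits(Fraction(154, 1000), 23))  # 00100111011011001000101
```

The remaining fraction is not zero after 23 steps, so 0.154 has no exact finite binary representation and the stored value is an approximation.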
127
E S
used for program diagnostics
F+=, F/=0
Source: I. Koren, Computer Arithmetic Algorithms, 2nd Edition, 2002
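| Single Precision Exponent | Single Precision Fraction | Double Precision Exponent | Double Precision Fraction | Object represented |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | nonzero | 0 | nonzero | ± denormalized number |
| 1-254 | Anything | 1-2046 | Anything | ± floating point number |
| 255 | 0 | 2047 | 0 | ± infinity |
| 255 | nonzero | 2047 | nonzero | NaN (not a number) |

Normalized single-precision numbers, $1 \le E \le 254$: $x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(E - 127)}$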
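The encoding rules above can be written as a short classifier on the raw field values. This is a sketch with a hypothetical function name `classify`; the `exp_bits` parameter selects single (8) or double (11) precision.

```python
def classify(exponent, fraction, exp_bits=8):
    """Classify a value from its raw Exponent and Fraction field contents."""
    max_exp = 2**exp_bits - 1  # 255 for single, 2047 for double
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == max_exp:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"        # exponent field in 1 .. max_exp - 1


print(classify(0, 0))                   # zero
print(classify(0, 1))                   # denormalized
print(classify(254, 0x7FFFFF))          # normalized
print(classify(255, 0))                 # infinity
print(classify(2047, 1, exp_bits=11))   # NaN
```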
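In single precision:
The smallest-magnitude normalized numbers (represented by 1 in the Exp field and 0…0 in the Fraction field) are $\pm 1.0 \times 2^{-126} \approx \pm 1.18 \times 10^{-38}$.
The smallest-magnitude denormalized numbers (represented by all 0s in the Exp field and 0…01 in the Fraction field) are $\pm 2^{-23} \times 2^{-126} = \pm 2^{-149} \approx \pm 1.4 \times 10^{-45}$.
The largest-magnitude normalized numbers (represented by 254 in the Exp field and 1…1 in the Fraction field) are $\pm (2 - 2^{-23}) \times 2^{127} \approx \pm 3.4 \times 10^{38}$.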
| Type | Exponent | Mantissa | Value |
|---|---|---|---|
| Zero | 0000 0000 | 000 0000 0000 0000 0000 0000 | 0 |
| One | 0111 1111 | 000 0000 0000 0000 0000 0000 | 1 |
| Denormalized number | 0000 0000 | 100 0000 0000 0000 0000 0000 | $5.9 \times 10^{-39}$ |
| Largest normalized number | 1111 1110 | 111 1111 1111 1111 1111 1111 | $3.4 \times 10^{38}$ |
| Smallest normalized number | 0000 0001 | 000 0000 0000 0000 0000 0000 | $1.18 \times 10^{-38}$ |
| Infinity | 1111 1111 | 000 0000 0000 0000 0000 0000 | Infinity |
| NaN | 1111 1111 | 010 0000 0000 0000 0000 0000 | NaN |
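The bit patterns in the table above can be checked by letting the machine reinterpret them as single-precision values. The sketch below uses Python's standard `struct` module and assumes a sign bit of 0 for every row; `from_fields` is a hypothetical helper name.

```python
import math
import struct


def from_fields(exponent_bits, mantissa_bits, sign="0"):
    """Build a single from its exponent (8-bit) and mantissa (23-bit) strings."""
    pattern = sign + exponent_bits.replace(" ", "") + mantissa_bits.replace(" ", "")
    if len(pattern) != 32:
        raise ValueError("expected 1 + 8 + 23 bits")
    # Reinterpret the 32-bit integer as a big-endian single-precision float.
    (value,) = struct.unpack(">f", struct.pack(">I", int(pattern, 2)))
    return value


rows = {
    "Zero": ("0000 0000", "0" * 23),
    "One": ("0111 1111", "0" * 23),
    "Denormalized number": ("0000 0000", "1" + "0" * 22),
    "Largest normalized number": ("1111 1110", "1" * 23),
    "Smallest normalized number": ("0000 0001", "0" * 23),
    "Infinity": ("1111 1111", "0" * 23),
    "NaN": ("1111 1111", "01" + "0" * 21),
}

for name, (exponent, mantissa) in rows.items():
    print(f"{name}: {from_fields(exponent, mantissa):.3g}")
```

The printed values are rounded to three significant digits for comparison with the Value column.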