Fractional Number Notations
Ver. 1.4
2010 - Claudio Fornaro

Fractional Numbers
Fractional numbers have the form:
xxxxxxxxx.yyyyyyyyy
where the x digits constitute the integer part of the value and the y digits the fractional part
There are two main methods to encode fractional numbers:
fixed-point notation
floating-point notation
E.g. 1+7+8 means 1 bit for the sign, 7 for the integer part and 8 for the fractional part
Operations are the same as those seen for the 2C-operation

Note: when using the 1st 2C-operation method, 1 must be added to the LSB, not to the units place:
01100010
10011101 +
       1 =
10011110
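The note about adding 1 at the LSB can be sketched in Python. This is only an illustration: the helper name `negate_2c` is hypothetical, and it assumes the invert-then-add-1 method is the "1st 2C-operation method" the note refers to. The bit string is treated as a plain word, so the 1 lands on the rightmost bit wherever the radix point is placed.

```python
def negate_2c(bits: str) -> str:
    """Negate a two's-complement pattern: invert every bit, then add 1 at the LSB."""
    width = len(bits)
    all_ones = (1 << width) - 1
    inverted = int(bits, 2) ^ all_ones              # bitwise complement
    negated = (inverted + 1) & all_ones             # +1 on the rightmost bit, keep `width` bits
    return format(negated, f"0{width}b")

# Pattern taken from the note; the radix point position does not affect the steps.
print(negate_2c("01100010"))
```

Applying the helper twice returns the original pattern, which is a quick sanity check that the 1 was added in the right place.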
Exercises
Convert the values as requested
±151.0 → FX 2C on 16 bits (1+8+7)
±151.25 → FX 2C on 16 bits (1+8+7)
111100101010 from FX 2C (1+7+4) → $(\;)_{10}$
100110011000 from FX 2C (1+6+5) → $(\;)_{10}$
Calculate on FX 2C 16 bits (1+7+8) and identify any overflow
(111.6 - 44.57) / 2
(68.22 - 71.25) * 64

Exercises: Solutions (conversions)
+151.0 → 0 10010111 0000000
-151.0 → 1 01101001 0000000
+151.25 → 0 10010111 0100000
-151.25 → 1 01101000 1100000
Note that the integer part is not the same: 01101001 for -151.0, 01101000 for -151.25
111100101010 is negative; its 2C-operation gives 0 0001101 0110, so the value is $-13.375_{10}$
100110011000 is negative; its 2C-operation gives 0 110011 01000, so the value is $-51.25_{10}$
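The conversion exercises can be checked with a short Python sketch. The function names `fx_encode` and `fx_decode` are hypothetical, and the encoder assumes that fraction bits beyond the available ones are truncated toward zero, which the slides do not state explicitly.

```python
def fx_encode(value: float, int_bits: int, frac_bits: int) -> str:
    """Encode `value` in FX 2C with 1 sign bit, `int_bits` integer bits, `frac_bits` fraction bits."""
    width = 1 + int_bits + frac_bits
    scaled = int(abs(value) * (1 << frac_bits))     # magnitude in LSB units, truncated
    if value < 0:
        scaled = -scaled
    if not -(1 << (width - 1)) <= scaled < (1 << (width - 1)):
        raise OverflowError("value does not fit in the requested format")
    return format(scaled % (1 << width), f"0{width}b")   # modulo yields the 2C pattern

def fx_decode(bits: str, frac_bits: int) -> float:
    """Decode an FX 2C bit string whose last `frac_bits` bits are the fractional part."""
    raw = int(bits, 2)
    if bits[0] == "1":                              # sign bit set: value is negative
        raw -= 1 << len(bits)
    return raw / (1 << frac_bits)

print(fx_encode(151.25, 8, 7), fx_encode(-151.25, 8, 7))
print(fx_decode("111100101010", 4), fx_decode("100110011000", 5))
```

Working on the scaled integer rather than on the bit string keeps the sketch short; the modulo step is what produces the two's-complement pattern for negative values.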
Exercises: Solutions (calculations)
(111.6 - 44.57) / 2
111.6 → $1101111.10011001_2$ → $0110111110011001_{2C}$
44.57 → $101100.10010001_2$ → $0010110010010001_{2C}$
-44.57 → $1101001101101111_{2C}$
  0110111110011001 +
  1101001101101111 =
  0100001100001000   (the carry-out is discarded)
Dividing by 2 (right shift by 1): 0010000110000100 = +33.515625

(68.22 - 71.25) * 64
68.22 → $1000100.00111000_2$ → $0100010000111000_{2C}$
71.25 → $1000111.01_2$ → $0100011101000000_{2C}$
-71.25 → $1011100011000000_{2C}$
  0100010000111000 +
  1011100011000000 =
  1111110011111000
Multiplying by 64 (left shift by 6): 0011111000000000, OVERFLOW

Radix points are implied in the bit patterns above (8 fractional bits)
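The two calculations can be reproduced with a small Python sketch on the 1+7+8 format. Both are read here as subtractions, which is what the worked bit patterns imply. The helper names (`to_fx`, `wrap`, `pattern`) are hypothetical, inputs are assumed to be truncated to 8 fraction bits, and overflow is detected by comparing the exact result with what survives in 16 bits.

```python
INT_BITS, FRAC_BITS = 7, 8                 # the 1+7+8 format
WIDTH = 1 + INT_BITS + FRAC_BITS           # 16 bits in total

def to_fx(value: float) -> int:
    """Scale to an integer number of 2**-FRAC_BITS steps, truncating toward zero."""
    return int(value * (1 << FRAC_BITS))

def wrap(raw: int):
    """Keep the low WIDTH bits as a signed value; report overflow if that changed it."""
    low = raw % (1 << WIDTH)
    if low >= 1 << (WIDTH - 1):
        low -= 1 << WIDTH
    return low, low != raw

def pattern(raw: int) -> str:
    """16-bit two's-complement bit string of a signed scaled value."""
    return format(raw % (1 << WIDTH), f"0{WIDTH}b")

# (111.6 - 44.57) / 2: subtract, then shift right by 1
diff1, ovf1 = wrap(to_fx(111.6) - to_fx(44.57))
half = diff1 >> 1
print(pattern(half), half / (1 << FRAC_BITS), ovf1)

# (68.22 - 71.25) * 64: subtract, then shift left by 6
diff2, _ = wrap(to_fx(68.22) - to_fx(71.25))
prod, ovf2 = wrap(diff2 << 6)
print(pattern(prod), "OVERFLOW" if ovf2 else "ok")
```

Python integers never overflow by themselves, so the exact result is always available for comparison; on real 16-bit hardware the same condition would have to be derived from the sign bits or the carries.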
Underflow
If the conversion of the smaller value shifts away all of the mantissa bits (including the hidden bit), the value is approximated to 0; thus the result of the operation is equal to the greater value, while the smaller one is just ignored
There is an underflow condition when, adding 2 values, the result is equal to the greater of them

Example in SP: $1.101_2 \times 2^{43} + 1.01_2 \times 2^{18}$
$1.01_2 \times 2^{18}$ must be converted to the form $xxx \times 2^{43}$; this causes a right shift of 25 bits on the mantissa, thus shifting away all the 24 mantissa bits and resulting in 0
Adding up many small values, it is possible that a partial sum becomes so big as to cause underflow for each of the subsequent values (only the first part of the values is added up)
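The SP example can be tried in Python, which has no native single-precision type. The sketch below assumes that rounding a double to SP through the standard `struct` module (format `"f"`) is an acceptable stand-in for SP arithmetic; the helper name `f32` is hypothetical. The accumulation part starts directly from a large partial sum, $2^{24}$, instead of looping millions of times to reach it.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

big = f32(0b1101 / 8 * 2.0**43)       # 1.101 (binary) * 2^43
small = f32(0b101 / 4 * 2.0**18)      # 1.01 (binary) * 2^18
total = f32(big + small)              # the double sum is exact, then rounded to SP
print(total == big)                   # the smaller operand leaves no trace

# Accumulation: once the partial sum is large enough, further small terms are lost.
acc = f32(2.0**24)
for _ in range(1000):
    acc = f32(acc + 1.0)
print(acc == 2.0**24)
```

A common remedy for the accumulation case is to add the values in increasing order of magnitude, or to accumulate in a wider format.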
IEEE-P754 Puzzles
Determine the (absolute) representation error for the value $N = 6 \times 10^{18}$ in IEEE-P754 SP.
$N = 6 \times 10^{18} \approx 6 \times 2^{60}$ requires 63 bits
$N = 1.xxx \times 2^{62}$
In SP there are only 23 bits for the mantissa
The weight of the LSB is $2^{62-23} = 2^{39}$
The representation error is therefore bounded by $2^{39}$
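The puzzle can be checked numerically with a short Python sketch. It assumes that rounding through the standard `struct` module (format `"f"`) reproduces the SP encoding; the helper name `f32` is hypothetical. Since $6 \times 10^{18}$ is exactly representable as a double, the only rounding step is the conversion to SP.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

N = 6 * 10**18                         # exact Python integer
stored = int(f32(float(N)))            # value actually held in SP, as an exact integer
lsb_weight = 2 ** (62 - 23)            # weight of the last mantissa bit, 2^39

print(N.bit_length())                  # number of bits needed for N
print(stored % lsb_weight == 0)        # SP can only hold multiples of 2^39 here
print(abs(stored - N), lsb_weight)     # actual error against the LSB weight
```

With round-to-nearest, the error of one specific value is at most half the LSB weight; the full LSB weight is the spacing between adjacent representable values at this magnitude.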