This is a continuation of this article.
When 0.1 is expressed as a double-precision floating-point number, its sign, exponent, and mantissa are as follows.
sign= 0
exponent= 01111111011
mantissa= 1001100110011001100110011001100110011001100110011010
The exponent field is 1019 in decimal. The stored exponent has 1023 added to it, so subtracting 1023 gives the actual exponent, -4.
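The fields above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `float_fields` is made up for illustration, and it assumes the usual IEEE 754 double layout (1 sign bit, 11 exponent bits, 52 mantissa bits).

```python
import struct


def float_fields(x):
    """Split a double into (sign, exponent, mantissa) bit strings."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = f"{raw:064b}"
    # 1 sign bit, 11 exponent bits, 52 mantissa bits.
    return bits[0], bits[1:12], bits[12:]


sign, exponent, mantissa = float_fields(0.1)
print("sign     =", sign)
print("exponent =", exponent, "->", int(exponent, 2) - 1023)
print("mantissa =", mantissa)
```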
The table below shows what happens when 0.1 is added repeatedly. Each row with an expected value shows the running total actually computed, and the value between two such rows is the difference between the totals directly above and below it.
Expected value | Calculation result / difference |
---|---|
0.1 | 0.1000000000000000055511151231257827021181583404541015625 |
(difference) | 0.1000000000000000055511151231257827021181583404541015625 |
0.2 | 0.200000000000000011102230246251565404236316680908203125 |
(difference) | 0.100000000000000033306690738754696212708950042724609375 |
0.3 | 0.3000000000000000444089209850062616169452667236328125 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.4 | 0.40000000000000002220446049250313080847263336181640625 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.5 | 0.5 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.6 | 0.59999999999999997779553950749686919152736663818359375 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.7 | 0.6999999999999999555910790149937383830547332763671875 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.8 | 0.79999999999999993338661852249060757458209991455078125 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
0.9 | 0.899999999999999911182158029987476766109466552734375 |
(difference) | 0.09999999999999997779553950749686919152736663818359375 |
1.0 | 0.99999999999999988897769753748434595763683319091796875 |
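For reference, a table like this can be regenerated with Python's `decimal` module, since `Decimal(float)` gives the exact decimal expansion of a double. This is a rough sketch; the precision of 60 digits is simply an assumption chosen so that the subtraction is not rounded.

```python
from decimal import Decimal, getcontext

# Enough digits that subtracting two exact expansions is not rounded.
getcontext().prec = 60

sums = []
total = 0.0
for _ in range(10):
    total += 0.1
    sums.append(Decimal(total))  # exact value of the running total

for i, value in enumerate(sums):
    print(f"{(i + 1) / 10:.1f} | {value}")
    if i + 1 < len(sums):
        # Difference between this total and the next one.
        print(f"    | {sums[i + 1] - value}")
```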
Addition of positive floating-point numbers is done by the following procedure.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
③ Add the mantissas
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
⑤ Round the digits that overflow the mantissa (round half to even)
Guard digits are the area that stores the digits that overflow the mantissa. The sticky bit is a boolean value that indicates whether any of the bits below the guard digits is 1.
In the first part, it was mentioned that one guard digit is sufficient for adding positive numbers, but two digits are required once negative numbers are included. The following uses two guard digits and a sticky bit. The two guard digits (2 bits) are named the Guard bit and the Round bit, from the top.
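The five steps and the guard, round, and sticky bits can be put together in a small software adder. What follows is only a sketch under stated assumptions, not how any particular CPU implements it: the function names (`decode`, `encode`, `shift_right_sticky`, `add_positive`) are made up for this example, and it only handles positive normal doubles (no zero, subnormals, infinities, NaN, or overflow).

```python
import struct

MANT_BITS = 52   # mantissa width of a double
EXTRA_BITS = 3   # guard, round, sticky


def decode(x):
    """Return (exponent, 53-bit significand) of a positive normal double."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    exp = ((raw >> MANT_BITS) & 0x7FF) - 1023
    sig = (raw & ((1 << MANT_BITS) - 1)) | (1 << MANT_BITS)  # restore the integer part 1
    return exp, sig


def encode(exp, sig):
    """Build a double from an exponent and a 53-bit significand."""
    raw = ((exp + 1023) << MANT_BITS) | (sig & ((1 << MANT_BITS) - 1))
    (x,) = struct.unpack(">d", struct.pack(">Q", raw))
    return x


def shift_right_sticky(value, n):
    """Shift right by n bits, OR-ing every bit that falls off into the lowest bit."""
    if n == 0:
        return value
    lost = value & ((1 << n) - 1)
    return (value >> n) | (1 if lost else 0)


def add_positive(a, b):
    (ea, sa), (eb, sb) = decode(a), decode(b)
    if ea < eb:
        (ea, sa), (eb, sb) = (eb, sb), (ea, sa)
    # Make room for the guard, round, and sticky bits below the mantissa.
    sa <<= EXTRA_BITS
    sb <<= EXTRA_BITS
    # Steps 1 and 2: raise the smaller exponent and shift that mantissa right.
    sb = shift_right_sticky(sb, ea - eb)
    # Step 3: add the mantissas.
    total = sa + sb
    exp = ea
    # Step 4: on a carry, add 1 to the exponent and shift right by one.
    if total >> (MANT_BITS + 1 + EXTRA_BITS):
        total = shift_right_sticky(total, 1)
        exp += 1
    # Step 5: round half to even using the guard, round, and sticky bits.
    grs = total & 0b111
    sig = total >> EXTRA_BITS
    if grs > 0b100 or (grs == 0b100 and sig & 1):
        sig += 1
        if sig >> (MANT_BITS + 1):  # rounding itself produced a carry
            sig >>= 1
            exp += 1
    return encode(exp, sig)


total = 0.1
for _ in range(9):
    total = add_positive(total, 0.1)
    print(total)
```

Keeping the sticky information in the lowest bit means a single integer carries all three extra bits, so the final rounding only has to compare them against `0b100` (exactly half).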
Let's check the calculation process step by step as 0.1 is added repeatedly.
This time, I will explain using the following figure.
No. | Column name | Description |
---|---|---|
① | Overview | Roughly describes what kind of calculation is being performed. |
② | Exponent (decimal) | The exponent of the floating-point representation, converted to decimal with 1023 subtracted. |
③ | Carry | The addition may produce a carry, so room is reserved for it. |
④ | Integer part | In floating-point format, the integer part is always 1. |
⑤ | 0 to 52 | The mantissa. Showing all 52 digits would be hard to read, so only the upper 8 digits and the lower 8 digits are shown. |
⑥ | Guard, round, sticky | The guard digits and sticky bit described above. |
Check the calculation process of 0.1 + 0.1.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
Since the exponents are the same, nothing is done.
③ Add the mantissas
The two mantissas, each including the integer part 1, are added.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
Since there is a carry, add 1 to the exponent and divide the mantissa by 2. Because this is a binary number, dividing by 2 shifts the digits one place to the right (bit shift).
⑤ Round the digits that overflow the mantissa (round half to even)
The only bit that overflows is 0, so it is truncated.
From the above, the following was obtained.
exponent= -3 (decimal)
mantissa= 1001100110011001100110011001100110011001100110011010 (binary)
Converted to a decimal number, this is
0.200000000000000011102230246251565404236316680908203125
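The conversion from an exponent and mantissa back to a decimal number can also be checked in Python. This is a small sketch with a made-up helper name, `fields_to_decimal`; it relies on the value being 1.mantissa (binary) multiplied by 2 to the power of the exponent, and on a working precision of 100 digits being enough for the values in this article.

```python
from decimal import Decimal, localcontext


def fields_to_decimal(exponent, mantissa_bits):
    """Exact decimal value of 1.<mantissa_bits> (binary) * 2**exponent."""
    # Prepend the implicit integer part 1 and read the bits as an integer.
    significand = int("1" + mantissa_bits, 2)
    # value = significand / 2**shift
    shift = len(mantissa_bits) - exponent
    if shift < 0:
        return Decimal(significand * 2 ** -shift)
    with localcontext() as ctx:
        ctx.prec = 100  # assumed to be plenty for a double
        return Decimal(significand) / Decimal(2 ** shift)


mantissa = "1001100110011001100110011001100110011001100110011010"
print(fields_to_decimal(-4, mantissa))  # the double nearest to 0.1
print(fields_to_decimal(-3, mantissa))  # the result of 0.1 + 0.1
```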
Add 0.1 to the value obtained above, that is, 0.2 (cumulative sum) + 0.1.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
The exponent of 0.2 is 1 larger, so add 1 to the exponent of 0.1 and shift its mantissa 1 place to the right.
③ Add the mantissas
The two mantissas are added.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
Since there is a carry, add 1 to the exponent and shift the mantissa 1 place to the right.
⑤ Round the digits that overflow the mantissa (round half to even)
The overflowing digits are 10, which is exactly half, so the result is rounded to the even side. This time the least significant bit of the mantissa is 1, so it is rounded up.
From the above, the following was obtained.
exponent= -2 (decimal)
mantissa= 0011001100110011001100110011001100110011001100110100 (binary)
Converted to a decimal number, this is
0.3000000000000000444089209850062616169452667236328125
In 0.2 (cumulative sum) + 0.1, the floating-point value of 0.1 is already slightly larger than 0.1, and the sum is then rounded up as well, so printing the result shows something you might not expect.
print(0.1 + 0.1 + 0.1)
# => 0.30000000000000004
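That this addition really is an exact tie can be confirmed with exact rational arithmetic. The sketch below assumes Python 3.9 or later, because it uses `math.nextafter` to get the neighbouring double; `Fraction(float)` converts a double to its exact value.

```python
import math
from fractions import Fraction

# Exact, unrounded sum of the two operands.
exact = Fraction(0.1 + 0.1) + Fraction(0.1)
# What floating-point addition actually returns.
upper = 0.1 + 0.1 + 0.1
# The neighbouring double just below it.
lower = math.nextafter(upper, 0.0)

# The exact sum is the same distance from both neighbours: a tie.
print(exact - Fraction(lower) == Fraction(upper) - exact)
# The last hex digit shows which neighbour has an even mantissa.
print(lower.hex())
print(upper.hex())
```

The tie goes to `upper` because its mantissa ends in an even bit, which is why the printed value is 0.30000000000000004.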
Add 0.1 to the value obtained above, that is, 0.3 (cumulative sum) + 0.1.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
Since the exponent of 0.3 is 2 larger, add 2 to the exponent of 0.1 and shift its mantissa 2 places to the right.
③ Add the mantissas
The two mantissas are added.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
There is no carry, so nothing is done.
⑤ Round the digits that overflow the mantissa (round half to even)
The overflowing digits are 10, which is exactly half, so the result is rounded to the even side. This time the least significant bit of the mantissa is 0, so the digits are truncated.
From the above, the following was obtained.
exponent= -2 (decimal)
mantissa= 1001100110011001100110011001100110011001100110011010 (binary)
Converted to a decimal number, this is
0.40000000000000002220446049250313080847263336181640625
Add 0.1 to the value obtained above, that is, 0.4 (cumulative sum) + 0.1.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
Since the exponent of 0.4 is 2 larger, add 2 to the exponent of 0.1 and shift its mantissa 2 places to the right.
③ Add the mantissas
The two mantissas are added. All bits of the mantissa are now 0.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
Since there is a carry, add 1 to the exponent and shift the mantissa 1 place to the right.
⑤ Round the digits that overflow the mantissa (round half to even)
The overflowing digits are 010, which is less than half, so they are truncated.
From the above, the following was obtained.
exponent= -1 (decimal)
mantissa= 0000000000000000000000000000000000000000000000000000 (binary)
Converted to a decimal number, this is
0.5
It is a coincidence that the result came out as exactly 0.5.
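To see that this is rounding rather than an exact sum, the unrounded value can be compared with 0.5. A minimal check, again using `Fraction` for exact arithmetic:

```python
from fractions import Fraction

four_tenths = 0.1 + 0.1 + 0.1 + 0.1           # cumulative sum so far
exact = Fraction(four_tenths) + Fraction(0.1)  # unrounded result of the next addition
result = four_tenths + 0.1                     # rounded result

print(result == 0.5)
# The exact sum is slightly above 0.5; the excess is what gets truncated.
print(exact - Fraction(1, 2) == Fraction(1, 2 ** 55))
```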
Add 0.1 to the value obtained above, that is, 0.5 (cumulative sum) + 0.1.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
Since the exponent of 0.5 is 3 larger, add 3 to the exponent of 0.1 and shift its mantissa 3 places to the right.
③ Add the mantissas
The two mantissas are added.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
There is no carry, so nothing is done.
⑤ Round the digits that overflow the mantissa (round half to even)
The overflowing digits are 010, which is less than half, so they are truncated.
From the above, the following was obtained.
exponent= -1 (decimal)
mantissa= 0011001100110011001100110011001100110011001100110011 (binary)
Converted to a decimal number, this is
0.59999999999999997779553950749686919152736663818359375
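The additions for 0.6 through 0.8 (cumulative sums) are almost the same as 0.5 (cumulative sum) + 0.1, so they are omitted. In this range the overflowing digits are truncated every time, so the value actually added is slightly smaller than 0.1, as the differences in the table show.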
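The effect of each rounding can be measured by comparing the rounded total with the exact sum of its two operands. The sketch below records that rounding error for every addition; the list name `errors` is just for illustration.

```python
from fractions import Fraction

step = Fraction(0.1)  # exact value of the double nearest to 0.1
total = 0.1
errors = []
for n in range(2, 11):
    new_total = total + 0.1
    # Rounded result minus the exact sum of the two operands.
    error = Fraction(new_total) - (Fraction(total) + step)
    errors.append(error)
    print(f"{n / 10:.1f}: rounding error = {float(error):+.3e}")
    total = new_total
```

A negative error means digits were truncated in that step, which is the case for every addition from 0.4 onward.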
The last addition, 0.9 (cumulative sum) + 0.1, is almost the same as 0.5 (cumulative sum) + 0.1, but let's see what the result looks like.
① Match the smaller exponent to the larger exponent
② Shrink the mantissa by the amount the exponent was increased
Since the exponent of 0.9 is 3 larger, add 3 to the exponent of 0.1 and shift its mantissa 3 places to the right.
③ Add the mantissas
The two mantissas are added.
④ If a carry occurs, add 1 to the exponent and shrink the mantissa by that amount
There is no carry, so nothing is done.
⑤ Round the digits that overflow the mantissa (round half to even)
The overflowing digits are 010, which is less than half, so they are truncated.
From the above, the following was obtained.
exponent= -1 (decimal)
mantissa= 1111111111111111111111111111111111111111111111111111 (binary)
Converted to a decimal number, this is
0.99999999999999988897769753748434595763683319091796875
Up to this point, we have confirmed what happens when 0.1 is added 10 times.
So, to answer the question: why is 0.1 slightly larger than 0.1 when converted to a floating-point number, yet the result of adding it 10 times is less than 1.0?
It is because every time two floating-point numbers are added, the digits that do not fit in the mantissa are rounded.
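This conclusion can be checked by comparing the repeatedly rounded total with a sum that is rounded only once. The following is a small sketch using `Fraction` for the exact sum:

```python
from fractions import Fraction

# Ten additions, each rounded to a double.
total = 0.0
for _ in range(10):
    total += 0.1

# Ten times the exact value of the double nearest to 0.1, with no rounding.
exact = 10 * Fraction(0.1)

print(total)         # the repeatedly rounded total
print(total < 1.0)   # it ends up below 1.0
print(exact > 1)     # the exact sum is above 1
print(float(exact))  # rounding the exact sum once
```

The exact sum is greater than 1, so the shortfall comes entirely from the rounding done in the individual additions.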