100-99.99-0.01 in a spreadsheet



#7

All --

Digging through some old threads, I found one from late 2002 in which a contributor pointed out that the formula 100.00-99.99-0.01 would equal not zero, but 5.1156995306556 x 10-15 in a Microsoft Excel spreadsheet cell. Excel 2003 and Quattro Pro 12 (2004) return this result, which is due to roundoff error in floating-point calculations, as the first responder stated. In this case, IEEE 64-bit double-precision values are apparently used.

The thread:

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv009.cgi?read=24977

The example illustrates an advantage of binary-coded decimal (BCD) -- exactitude eliminates the need for rounding and tolerance-checking. It's probably also why the HP-30S, which uses 80-bit floating-point representations, performs rounding to achieve "pleasing" results, which are not always strictly correct.

Just as an exercise of curiosity, here are the computational steps in the floating-point calculation of 100.00-99.99-0.01:

-- KS


100.00 - 99.99 - 0.01 = 5.1156995306556 x 10-15 using spreadsheet software

Conversions: IEEE 64-bit double-precision format at

http://babbage.cs.qc.edu/IEEE-754/Decimal.html

number offset exponent value of mantissa

99.99 10000000101 (+6) 1.1000111111110101110000101000111101011100001010001111
0.01 01111111000 (-7) 1.0100011110101110000101000111101011100001010001111011

100.00 10000000101 (+6) 1.1001000000000000000000000000000000000000000000000000

number base-2 representation with aligned radix point

99.99 1100011.1111110101110000101000111101011100001010001111

0.01 0.00000010100011110101110000101000111101011100001010001111011

100.00 1100100.0000000000000000000000000000000000000000000000


Now, the subtractions:

100.00 1100100.0000000000000000000000000000000000000000000000
-99.99 1100011.1111110101110000101000111101011100001010001111

=
1100011.1111111111111111111111111111111111111111111111
- 1100011.1111110101110000101000111101011100001010001111
------------------------------------------------------
0000000.0000001010001111010111000010100011110101110000
+ 0000000.0000000000000000000000000000000000000000000001

= 0.0000001010001111010111000010100011110101110001

= 1.0100011110101110000101000111101011100010000000------ x 2-07
-0.01 1.0100011110101110000101000111101011100001010001111011 x 2-07

= 0.0000000000000000000000000000000000000000101110000101 x 2-07

= 2949 X 2-59

= 5.1156995306556 x 10-15


Edited: 10 Feb 2007, 5:01 p.m. after one or more responses were posted


#8

here's my excel example,

=IF(4.6-.2-.2-.2-4=0,"YES","NO")

unfortunately, this is the case for IEEE-754 binary. you always get the same answer 4.6-.2-.2-.4-4 = -8.8817841970013e-016

it's one reason why i've changed hplua to now use an underlying decimal system.


#9

Hi, Hugh --

That's an even more curious example! While 100 and 0.01 are 13 orders of base-2 magnitude apart, 4.6 and 0.2 are only 5 orders of base-2 magnitude apart. Of course, there are four subtractions involved in total, instead of two.

MS Excel 2003 computes zero as the result of the formula "=4.6-.2-.2-.2-4.0", but not if the expression is within parentheses, such as

=IF(4.6-.2-.2-.2-4=0,"YES","NO")

=(4.6-.2-.2-.2-4)


In such cases, the -8.8817841970013e-016 is returned.

How can ya trust that?

Quattro Pro seems to be a bit more consistent with results in that respect. However, I never warmed to using spreadsheet programs for straightforward calculations of limited volume -- too much frustration.

-- KS


#10

Try =(0.3-0.1-0.1-0.1). I get -2.77556E-17.

The reason for Excel's odd behavior is described in William Kahan's paper How Futile are Mindless Assessments of Roundoff in Floating-Point Computation ?:

Quote:
Apparently Excel rounds Cosmetically in a futile attempt to make Binary floating-point appear
to be Decimal. This is why Excel confers supernatural powers upon some (not all) parentheses.

He recommends they re-design Excel to use decimal arithmetic:

Quote:

Decimal has the great advantage that, if enough
digits are displayed, What You See is What You Get. Some day, perhaps, IBM’s LOTUS 123
spreadsheet may come out with decimal floating-point carrying 34 sig. dec.; if then Microsoft’s
Excel imitates (instead of “innovates”), its mysteries will become vastly fewer.


Note that the WYSIWYG principle is lost not just when using binary arithmetic, but also when using hidden digits as the TIs do. Apparent precision is gained at the cost of predictability.


#11

Gunnar --

Thank you for the link and response. This particular article by Kahan is fairly new (just more than a year old), and I had not seen it before. I didn't look at the date until I saw the quote from the article, "What's in your spreadsheet?" This is a twist on the catchphrase of a contemporary US television advertising campaign for a credit card: "What's in your wallet?"

Professor Kahan really ought to be more concise; this article runs 56 pages. Section 2 of the article describes the strange behavior of MS Excel 2000, which probably hasn't changed in Excel 2003, and perhaps not even in the latest version.

Section 5 tabulates "J-M. Muller’s Recurrence":

xn = 5 – 2/(1 + (5/3)n)

for which practically any BCD calculator gives accurate and best-possible results to the limits of its display, to n = 75 and beyond, long after the results from software utilizing floating-point math with 64-bit and 53-bit mantissas have diverged from the realm of numerical correctness.

-- KS


Edited: 10 Feb 2007, 5:04 p.m.


#12

Try filling a short column with -5, -4,-3,-2,-1, 0, 1, 2, 3, 4, 5.

Find the standard deviation = 3.02765... So far so good.

Now if you have an older version, up to about '98 and I think 2000 as well, try this with 14,999,999,998 through 15,000,000,005. The correct answer should be the same but the spreadsheet will come up with nonsense. This not due to the internal arithmetic so much as it is due to poor choice of algorithm. The algorithm is mathematically correct but when doing numerical computation all the significant data is rounded off.

BTW: This came up as a real world problem. Whipped out my HP-48SX and it had no trouble with this calculation.


#13

I get 3.317

#14

Quote:
Try filling a short column with -5, -4,-3,-2,-1, 0, 1, 2, 3, 4, 5.

Find the standard deviation = 3.02765... So far so good.


This looks like the sample standard deviation for the first ten values or the last ten. When you copied a range of cells into the formula, you may have missed one.

The correct population standard deviation is 3.16228

The correct sample standard deviation is 3.31662

MS Excel 2003 gives the correct answers with 14,999,999,995 through 15,000,000,005. Quattro Pro v12 calculates zero as either standard deviation for the large input data, as do calculators that use the summation method, instead of retaining each input datum and first calculating the mean therefrom (e.g., HP-17B/II, HP-27S, HP-28/48/49/50).

This was discussed in a thread from several years ago; Tom Sherman's post elaborated in detail.

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv014.cgi?read=53787#53787

I'm somewhat surprised that not all modern spreadsheet software can give the correct answer to this problem. The PC-based programs are already storing all the input data, and there's no shortage of processing power.

-- KS


Edited: 13 Feb 2007, 8:59 p.m.


#15

I probably mistyped something. But the main point is that Excel versions up through 98 or 2000 give totally bogus answers due to rounding off the significant data. It's more evidence that Micro Soft's primary objective is a new version with new window dressing rather than actually making software that works.

Another cute trick is finding the square root of -1.

I get: 6.12303176911189E-017+i

Come on, be serious! A rounding error for sqrt(-1)?

This was in Excel 2003 although I think all previous versions do the same.

#16

MS Excel 2002 (version 10.2614.2625) failed this test with:

Sample std dev = 228.973360895978

Population std dev = 186.181818181818

Stunning.

#17

Quote:
5.1156995306556 x 10E-15
This is a little O.T., but it looks like you're using the "E" incorrectly. I think you mean 5.1156995306556E-15. 1,000 = 1E3 = 10^3 = 10E2. "E" already means "times ten to the ___ power," so to say "x 10E__" puts you off by a power of ten. Also, 2E0=2, 2E1=20, 2E2=200, etc. Some engineering students and technicians at a company I did some consulting for years ago called me up with some simple calculations where things weren't coming out right. They had the presence of mind to know their results were way off, but they couldn't figure out why. This "E" thing is what messed them up.

#18

Hi, Garth --

Yes, you are correct -- I should stick with orthodox notation to prevent any possible confusion. It's just that

"5.1156995306556 x 10E-15"

looked better than the cluttered

"5.1156995306556 x 10^(-15)

or the run-together

"5.1156995306556E-15"

I was too lazy to superscript each value:

"5.1156995306556 x 10-15

which can also be easily corrupted if responders quote without manually restoring the formatting.

-- KS


#19

Quote:
or the run-together

"5.1156995306556E-15"

-- KS


What's wrong with that? That's just how it looks or our beloved HP calculators.

As long as we're picking on you, I would point that some would call these:

Quote:
  number  base-2 representation with aligned decimal points 

99.99 1100011.1111110101110000101000111101011100001010001111
.01 0.00000010100011110101110000101000111101011100001010001111011


"binary" points, not decimal points.


#20

Hi, Roger --

Quote:
"5.1156995306556E-15"

What's wrong with that? That's just how it looks or our beloved HP calculators.


True, but it's a shorthand notation -- blank-free by necessity -- developed to represent "extreme" numbers within limited displays.

Another problem with superscripting using preformatting is that the superscripted numbers look small.

Clarifications will be applied to my original post.

-- KS


#21

Quote:
Hi, Roger --

True, but it's a shorthand notation -- blank-free by necessity -- developed to represent "extreme" numbers within limited displays.

Another problem with superscripting using preformatting is that the superscripted numbers look small.

Clarifications will be applied to my original post.

-- KS


Why is it "blank-free" by necessity?

It wasn't developed to represent numbers within limited displays as far as I know. The notation has been used at least since Fortran when outputs were printed on 135 column wide printers. There was no space limitation. Calculators inherited the notation from those early computer languages.

The calculators with truly limited display space, such as the HP15 and HP41 didn't even have an "E"; they had a space.

The HP71's "PRINT USING" statement allows one to put several spaces before the "E", so space limitations aren't involved. But the default is the old Fortran style, 12.3456E78. Space limitations aren't a problem in the HP28, HP48, etc., but again the default is the Fortran notation.

So, I don't think space limitations had anything to do with its development; rather it was the lack of superscripts on the printers of the time.

--------------------------------

In your first post in this thread where your revision now says:

"base-2 representation with aligned radix"

I think you should say:

"base-2 representation with aligned radix points"

See:

http://en.wikipedia.org/wiki/Radix

http://en.wikipedia.org/wiki/Radix_point


#22

Quote:
Why is ("E"-notation) "blank-free" by necessity?

It wasn't developed to represent numbers within limited displays as far as I know. The notation has been used at least since Fortran when outputs were printed on 135 column wide printers. There was no space limitation. Calculators inherited the notation from those early computer languages.


That depends on the application. I was referring primarily to the entry of exponentiated numbers in computer-language code, not the display or printing of them. If a space were inserted betewen the mantissa and the "E" (or "D" for double precision), the Fortran compiler would unsuccessfully try to parse the number into an exponent and a variable.

I'm an old Fortran guy (too?), and I remember offhand that 132 characters was the standard limit for line-printer output.

Quote:
The calculators with truly limited display space, such as the HP15 and HP41 didn't even have an "E"; they had a space.

More specifically, the exponents on these models were right-justified to provide one or more spaces as the display setting permitted. Of course, with spaces, the "E" isn't even necessary, and would waste a valuable display position. I prefer the spaces to the full-size "E" used in the HP-20S, HP-21S, HP-32S, and others.

Apparently, so did other people at H-P; the HP-32SII uses a reduced-size "E" (as well as an elevated "-" for negative exponents) in the display to help distinguish exponentiated numbers. This was more important in the HP-32SII, with its Equation capabilities that utilized variables denoted by single capital letters. That's attention to detail.

BTW, when I stated "limited displays", I was also referring to lack of sub- and superscripting as well as length of display windows.


Quote:
I think you should say:

"base-2 representation with aligned radix points"


Right you are. I knew of "radix" only from the HP-42S RDX functions which set the style of radix point. By golly, there's a Wikipedia entry for everything!

-- KS


#23

Quote:
That depends on the application. I was referring primarily to the entry of exponentiated numbers in computer-language code, not the display or printing of them.

Too bad you didn't say that. Then my comments could have been relevant to your intent.

Quote:
If a space were inserted betewen the mantissa and the "E" (or "D" for double precision), the Fortran compiler would unsuccessfully try to parse the number into an exponent and a variable.

This isn't "necessity"; this is convention. The parser could easily accept free-form input, including spaces, and use comma, semicolon or CRLF as the terminator. I don't have any Fortran manuals anymore, and I don't know what free-form capability may have been available in any particular Fortran compiler.

But I like to refer to the HP71's BASIC because it is one of the most capable machines I've ever seen as far as input/output formatting goes. It has a built-in free-form input capability for HPIL input. For ordinary input, numbers can be input as strings and there are string functions that allow the user to remove superfluous spaces, or any other character, and then convert to a number in a variable with the VAL function. The user can do their own free-form input with this capability.

Therefore, I don't agree that the "E" format is "blank-free by necessity".

Quote:
I'm an old Fortran guy (too?), and I remember offhand that 132 characters was the standard limit for line-printer output.

Must have been a different printer.

Quote:
More specifically, the exponents on these models were right-justified to provide one or more spaces as the display setting permitted. Of course, with spaces, the "E" isn't even necessary, and would waste a valuable display position.

Why would an "E" in an exponential format number waste a display position, but a space wouldn't? Wouldn't the "E" be in the same position as the space, occupying one character position, just as the space does?

Quote:
I prefer the spaces to the full-size "E" used in the HP-20S, HP-21S, HP-32S, and others.

Apparently, so did other people at H-P; the HP-32SII uses a reduced-size "E" (as well as an elevated "-" for negative exponents) in the display to help distinguish exponentiated numbers.


I prefer the "E" of whatever size. The space is ok on a small size calculator display like the HP-41, but if it is printed on paper, the space rather than "E" will make it look like two non-exponential numbers rather than one "E" format number, e.g.:

12.345E26 with spaces becomes 12.345 26

I like to have the same format on the calculator display as when it's printed on paper. I'm used to the "E" format on both paper and a calculator display; it looks the same both places.

My TI-86 uses a small size "E" and elevated "-"; it's ok. Maybe those "other people" at H-P are an invasion force from TI (or some Japanese manufacturer of calculators). :-)

Quote:
BTW, when I stated "limited displays", I was also referring to lack of sub- and superscripting as well as length of display windows.

What you said earlier was that the "E" notation was "...developed to represent "extreme" numbers within limited displays."

To speak of "displays" like that sure sounds like you were referring to something like a calculator display such as found on a hand-held calculator. When the "E" format was developed, there weren't any such "displays"; there was printed output. There was no "length" limitation on printed "E" format numbers such as there is on a calculator display.

I said in my reply:

Quote:
So, I don't think space limitations had anything to do with its development; rather it was the lack of superscripts on the printers of the time.

To be perfectly clear, I think that to the extent you were suggesting that length limitations had anything to do with the development of the "E" format, you were mistaken.

I think it was solely the lack of superscripts on the printers of the time. On that point, you and I are in agreement.

As far as the use of the compact "E" format nowadays, it seems that you don't like it, but I do; I'm used to it.

We all have our preferences.

#24

A while back when the HP30S came out, there was some discussion about what its internal workings were. I posted a long item on the newsgroup. You can use Microsoft's BASIC to play around with binary representations. Here is what I posted:

Quote:
I think I can show that the 30S is indeed using binary arithmetic internally.

Fire up one of the old Microsoft GWBASIC versions and use the following little program to
see the behavior of internal binary number representations. Some of the behavior which
would otherwise seem peculiar derives directly from conversions from binary to decimal,
and vice versa.

10 a$ = MKD$(.1)
15 GOSUB 100
20 FOR I = LEN(a$) TO 1 STEP -1
30 PRINT RIGHT$("0" + HEX$(ASC(MID$(a$, I))), 2);
40 NEXT I
50 END
100 GOTO 130
110 MID$(a$, 1) = CHR$(0)
120 MID$(a$, 2) = CHR$(0)
130 PRINT
140 PRINT CVD(a$);
150 PRINT " ";
200 RETURN

The Microsoft Basics use binary arithmetic internally, and there are two floating point
precisions available, single precision and double precision. The single precision uses 4
bytes, 1 for the exponent and its sign, and 3 for the mantissa and its sign. There were
some changes when Quick Basic came out, but I believe the GWBASIC assumed that since the
mantissa is normalized, the most significant bit (MSB) of the mantissa need not be stored
(since in a normalized mantissa, it is always a one; the special case of floating point
zero is represented as an all zero exponent and all zero mantissa). The bit actually
stored in the place of the MSB of the mantissa is the sign of the mantissa. Double
precision uses 8 bytes, 1 for the exponent and its sign, and 7 for the mantissa and its
sign. Thus, single precision means 24 bit precision, and double precision means 56 bit
precision (mantissa precision).

Run the program and see the following result:

.1000000014901161  7D4CCCCD00000000

The first value is the result of putting a single precision value .1 in a double
precision variable. The second item, a hex string, is the internal representation of the
number. Notice that the last 4 bytes are all zeroes, which is what happens when you store
a single precision value in a double precision variable.

The quantity in parentheses in line 10 is .1 as a single precision quantity. Now place
a # after the .1, so line 10 becomes:

10 a$ = MKD$(.1#)

Now we are putting a double precision .1 in the double precision variable. By the way,
there is no explicit double precision variable being used here; the MKD$ function creates
a temporary double precision variable and then converts it to a string which is moved to
the string variable a$ for display.

Run the program and see the following:

 .1  7D4CCCCCCCCCCCCD

Now we see the internal double precision (56 bit mantissa) representation of .1--notice
the value of .1 converted to binary has a repeating mantissa, CCCCCCCCC..., and that the
last nib has been rounded up.

Now put the value Joe Horn mentions in another posting as the value the 30S gets for
SQRT(2), namely: 1. 41421 35623 71514 85402 13689, in line 10, thus:

10 a$=MKD$(1.4142135623715148540213689#)

Run the program and get the following:

1.414213562371515  813504333F8FFFF

Notice all the trailing F's. If this binary quantity were rounded up by adding only one
bit in the least significant place, we would have 813504333F90000. Let's see if we can
coax the program into showing us what the floating point value of that binary (represented
as hex here for space saving) string would be. For this we need lines 110 and 120.
Replace line 100 with: 100 REM

Run the program and see:

1.414213562371515  813504F333F90000

GWBASIC converts the 40 bit string 813504F333F90000 (we have 56 bits in the mantissa, but
the last 16 are all zeroes) into the numeric value 1.414213562371515

If the y to the x function on the 30S is used to calculate 2 to the .5, which is also
equal to SQRT(2), we get 1.41421356237309503445232. At this point, I can't use the
GWBASIC program any longer for the next step, since BASIC only has 56 bits in double
precision. Fortunately, Mathematica comes to the rescue. Converting this floating point
value to hexadecimal, we get: B504F333F9DE640000 (without exponent here).

Plainly, this is a mantissa which has been truncated to 56 bits.

If two 12 digit numbers are multiplied on the 30S, such as:

123456789123*987654321999, we get: 121932631357450082816877 which is exactly the correct
result. Similarly for division, addition, and subtraction.

Having reached this point, I remembered the properties of the math coprocessors that used
to be available as separate devices before they were incorporated into the Pentium CPU's.
There were 3 levels of precision available; 24 bit single precision, 56 bit double
precision and 80 bit double extended precision. It seemed possible that the 30S has a
modern CPU with built-in math coprocessor. Those processors provided math functions such
as sqr, sin, cos, log, exp, etc., as well as the 4 basic arithmetic functions +, -, *, /.
The basic four could be done in all 3 precisions, but the higher functions were available
only to a maximum of 56 bits.

So, I thought, let's check some of the other functions. Uh-oh, Ln(2) returns a result
correct to 24 digits, more than is possible with 56 bit arithmetic.

But, Exp(10) gives 22026.465794806716075982, correct to 17 digits, which converts to a
hexadecimal mantissa of 2B053B9F2A0ABF000000 which is a truncated 56 bit value.

(By the way, a binary mantissa of N bits is equivalent to a decimal number of:

N Ln(2)
-------- or N/3.32193 digits, so 80 bits give 80/3.32193= 24.08 digits
Ln(10)

40 bits give 12.04 decimal digits, and 56 bits give 16.85 digits)

Sin(60) (degrees, that is) gives 19 correct digits. Darn.

So, to summarize:

We can often tell how many bits of binary arithmetic are being used by converting a
decimal result back to binary (hex), and seeing if all the bits past a certain point are
truncated. This shows that some of the error seen in decimal results is directly
attributable to truncation error in the binary result, which when converted to decimal is
not recognizable by simple inspection as truncation any more.

The basic four arithmetic functions seem to use 80 bit arithmetic.

SQRT seems to truncate to 40 bits (12 digits)

The built-in keyboard constant Pi is 24 digits.

Other functions don't seem to consistently give a 56 bit truncated result.

Joe Horn is quite right that the calculator will accept 13 digits on input.



Possibly Related Threads...
Thread Author Replies Views Last Post
  AFTER HP-Prime update, AA1000 DOES NOT CLEAR THE SPREADSHEET Joseph Ec 0 454 12-06-2013, 11:28 PM
Last Post: Joseph Ec
  HP Prime: =f(x):=expression in spreadsheet app CR Haeger 1 543 12-05-2013, 07:53 AM
Last Post: cyrille de Brébisson
  HP Prime Spreadsheet Copy bug Michael de Estrada 1 567 12-03-2013, 11:34 PM
Last Post: Walter B
  HP Prime: How do I insert/delete a column/row in a spreadsheet ? Michael de Estrada 3 733 11-20-2013, 10:01 PM
Last Post: Michael de Estrada
  HP Prime - spreadsheet import bluesun08 1 520 11-19-2013, 03:21 PM
Last Post: CR Haeger
  MS advert shows spreadsheet with obvious error BruceH 3 721 11-14-2013, 09:50 AM
Last Post: Bill (Smithville, NJ)
  HP Prime - CAS functions in Spreadsheet App CR Haeger 6 884 11-11-2013, 12:37 AM
Last Post: Michael de Estrada
  10bii Financial Calculator is on launch sale - 0.99$ only John 4 685 11-10-2013, 08:02 PM
Last Post: BShoring
  HP Prime - Spreadsheet obscurity bluesun08 3 595 11-10-2013, 07:12 PM
Last Post: Joe Horn
  HP Prime - Spreadsheet lacks bluesun08 2 496 11-10-2013, 03:44 PM
Last Post: bluesun08

Forum Jump: