WP 34S Complex trig/hyperbolic bugs?



#37

I finally got around to putting together a WP 34S. I would like to express my thanks and gratitude to Walter, Pauli and Marcus for the many, many hours that they have spent making this such an exceptional calculator. Fantastic work guys! Thanks also to Gene and Eric for providing the necessary cable and keyboard overlay and to the HP development team for opening the HP-30b platform.

I’ve been using the WP 34S for just a couple of weeks, and it is now my everyday calculator. What impresses me most is the functionality contained in the software and the thought and detail that have gone into the usability of the platform. It is really quick and easy to operate – no extraneous key presses required. I also like the full, easy-to-use implementation of the HP-16C functionality, which I find particularly useful.

While using the calculator, I did come across a few bugs in some of the complex trig and hyperbolic functions (these are in the 2880 version of the firmware, but appear to have been there for a while). In particular, the following complex functions don’t return the correct values:

TAN
ASIN
ACOS
ASINH
For example, the result of:
ASIN(0.3 + 0.6i)
should be:
0.257758589 + 0.586322939i
However, the WP 34S returns:
0.23676839899 + 0.60106916301i
The complex ASIN, ACOS and ASINH errors appear to be due to a bug in the computation of the complex ASINH function (which is used by the complex ASIN and complex ACOS functions) in the file complex.wp34s in the xrom directory. The statement on line 181 is:

RCL[times] X

It should probably be:

[cmplx]RCL[times] X
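
For reference, one standard set of identities that gives exactly this dependence (I'm assuming the xrom code uses something along these lines, since ASIN and ACOS are built on ASINH):

\[
\operatorname{asinh}(z) = \ln\!\left(z + \sqrt{z^{2} + 1}\right), \qquad
\operatorname{asin}(z) = -i\,\operatorname{asinh}(iz), \qquad
\operatorname{acos}(z) = \tfrac{\pi}{2} - \operatorname{asin}(z).
\]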

The complex TAN error is due to a bug in the cmplxTan function in complex.c. The value returned in cb on line 390 should be negated (perhaps via the dn_minus() function) before being passed to the cmplxDivide function on line 392.
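
For anyone who wants to double-check the expected values without a 34S to hand, here is a small self-contained C99 program using the standard complex math library (nothing to do with the 34S decNumber code) that reproduces the asin value quoted above and checks tan against sin/cos:

#include <complex.h>
#include <stdio.h>

int main(void) {
    const double complex z = 0.3 + 0.6 * I;

    /* Expected: 0.257758589 + 0.586322939i (to 9 decimals) */
    const double complex w = casin(z);
    printf("asin(z)       = %.9f %+.9fi\n", creal(w), cimag(w));

    /* tan(z) must agree with sin(z)/cos(z); a sign error in the
       imaginary part of cos(z) would show up as a gross mismatch here. */
    const double complex t = ctan(z);
    const double complex q = csin(z) / ccos(z);
    printf("tan(z)        = %.9f %+.9fi\n", creal(t), cimag(t));
    printf("sin(z)/cos(z) = %.9f %+.9fi\n", creal(q), cimag(q));
    return 0;
}

Compile with something like "cc -std=c99 check.c -lm".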

Best regards,

Eamonn.

Edited: 25 Apr 2012, 9:41 p.m.


#38

This is the kind of bug reporting I like :-) Thanks.
I'll get on to this in a couple of hours.


- Pauli

#39

The changes are in and will be included in the next build.

- Pauli


#40

Quote:
The changes are in and will be included in the next build.

These are indeed the most serious and, unfortunately, the hardest-to-find errors of all! :-(

If we can't trust the numerical results of a calculator then it's really useless.

I remember that some time ago there was a file wp34test.exe that seemed to run lots of tests (I don't know what kind of tests), but it hasn't been updated for a long time.

Is this wp34test.exe something that could be used for such numerical correctness tests?

Franz

Edited: 26 Apr 2012, 6:45 a.m.


#41

That program could be modified to test more functions if required. I don't know how difficult it would be -- it is Windows only and I don't build on Windows. Rewriting or modifying it to be cross-platform would be required.

Better, I think, would be some automated unit testing. A suite of keystroke programs that run through the mathematical functions and exercise them for various predetermined cases and some random values. Any takers? We'd load this either into the calculator or one of the emulators and run the lot.
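
To make the idea concrete, here is the shape of such a suite sketched in C rather than keystroke form (the cases, names and tolerances below are placeholders for discussion only; the real suite would be keystroke programs assembled into library files): a table of (function, argument, expected value, tolerance) entries walked by a tiny driver.

#include <math.h>
#include <stdio.h>

/* Illustrative table-driven test driver; placeholder cases only. */
struct test_case {
    const char *name;
    double (*fn)(double);   /* function under test */
    double arg;             /* argument */
    double expected;        /* reference value */
    double tol;             /* acceptable absolute error */
};

int main(void) {
    const double pi = acos(-1.0);
    const struct test_case cases[] = {
        { "sin", sin, 0.0,      0.0, 1e-15 },
        { "sin", sin, pi / 2.0, 1.0, 1e-15 },
        { "sin", sin, pi,       0.0, 1e-15 },
        { "ln",  log, 1.0,      0.0, 1e-15 },
        { "ln",  log, exp(1.0), 1.0, 1e-15 },
    };
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        const struct test_case *c = &cases[i];
        double got = c->fn(c->arg);
        if (fabs(got - c->expected) > c->tol) {
            printf("FAIL %s(%.17g): got %.17g, want %.17g\n",
                   c->name, c->arg, got, c->expected);
            failures++;
        }
    }
    printf("%d failure(s)\n", failures);
    return failures != 0;
}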


The arcsine issue is a result of recent space-saving conversions of built-in functions from C to xrom and insufficient testing on my part -- I won't be surprised if more such problems exist. The tangent issue is much longer standing and should have been caught earlier :-(

I know my testing and validation are relatively shallow; there is a lot of code in the 34S and I've only so much time to dedicate to it.


- Pauli


#42

Quote:
Better, I think, would be some automated unit testing. A suite of keystroke programs that run through the mathematical functions and exercise them for various predetermined cases and some random values. Any takers? We'd load this either into the calculator or one of the emulators and run the lot.

Well, I guess this will be quite impossible considering the limited space for user programs. Such a program would need to test a few hundred functions, each of them with many different argument values, and then - how can we be sure the results are correct?

OTOH, additionally entering all those correct values in such a program for comparison with the calculated results is again an impossible task - you know how many bytes (program steps) a real value needs in a WP34s program!

Franz


#43

Hmmmh, Franz, first you demand such tests and then you say it's impossible. What do you really, really want?


#44

Quote:
Hmmmh, Franz, first you demand such tests and then you say it's impossible. What do you really, really want?

Well, you should read a bit more carefully, Walter! ;-)

I think that such tests could only be made with a separate EXE program (written in C, like the emulator itself), but certainly not as a user-code program - the free space for this is just too small!

#45

The emulator is far less restrictive than the real device -- most testing would be done on the emulators by necessity. The test suite would be a series of library files and wouldn't have to rely on the device's limited RAM at all. They would be built by the assembler, of course. This would have the added bonus of testing the programming functions as well as the mathematical ones.

Testing on the real hardware would be restricted to cases where the emulator is broken (i.e. hopefully never).


This definitely is possible. It is also a huge amount of work.


Real comparison values are space consuming -- registers and backup files? Extra internal constants in a test image? Multiple keystroke numeric entry command? I'm sure something can be worked out if required.


- Pauli


#46

Quote:
Real comparison values are space consuming

Are there any reliable(!) sources available on the internet for such testing purposes (i.e. tables with lots of function arguments and results) which could be used?

Maybe you (with your experience in writing those numerical functions) have already found such sites!?

The problem is that I don't have any clue how or where to start (which functions? which and how many arguments to test?), and without such an already existing test suite I guess this would be a tremendous amount of work.

Franz


#47

Accurate sources for function values aren't difficult to locate. Wolfram Alpha and Casio's Keisan are both good. Abramowitz and Stegun is available online and has good tables for the more esoteric functions, although not to the number of digits the 34S carries.

As for which functions, that is the easy question to answer: all of them. Start with the more basic ones and work up to the more complex.

As for arguments to test, this is where it gets interesting & difficult. Every function is different and the person writing the test really needs to know and understand the function being tested to make a good effort of it.

Take SIN as an example. The boundary cases need to be tested -- e.g. SIN(0), SIN(90), SIN(180), SIN(270) and SIN(360). Some intermediate values are sensible, as are negative values. Additionally, we know SIN(x) -> x as x -> 0, so we'll want to test a range of very small values of x as well. Then some huge arguments (e.g. 5e200). At this point there is a minimal test of the function done -- enough to catch gross errors at least.
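
A rough sketch of that minimal SIN test, written in C with binary doubles just to make it concrete (the real tests would be keystroke programs in degrees mode on the 34S; the tolerances here are arbitrary placeholders):

#include <math.h>
#include <stdio.h>

/* Sine of an angle given in degrees. */
static double sind(double deg) {
    return sin(deg * acos(-1.0) / 180.0);
}

static int close_to(double got, double want, double tol) {
    return fabs(got - want) <= tol;
}

int main(void) {
    int ok = 1;

    /* Boundary cases. */
    ok &= close_to(sind(0.0),    0.0, 1e-15);
    ok &= close_to(sind(90.0),   1.0, 1e-15);
    ok &= close_to(sind(180.0),  0.0, 1e-12);
    ok &= close_to(sind(270.0), -1.0, 1e-15);
    ok &= close_to(sind(360.0),  0.0, 1e-12);

    /* sin(x) -> x as x -> 0 (radians here); allow for the x^3/6 term. */
    for (double x = 1e-3; x > 1e-12; x /= 10.0)
        ok &= close_to(sin(x), x, x * x * x / 6.0 + 1e-15 * x);

    /* A huge argument should at least stay within [-1, 1]. */
    double h = sin(5e200);
    ok &= (h >= -1.0 && h <= 1.0);

    puts(ok ? "SIN: minimal tests passed" : "SIN: FAILURE");
    return !ok;
}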

A second example highlights the difference the function makes. Take LN(x): the interesting point is around x=1, where we mustn't lose accuracy. But other values need to be tested too, including e, 0 and infinity. A couple of negative arguments to verify that the domain error is properly produced is prudent as well.

I also like to throw some random values at a function to increase the coverage. This requires knowing an alternative way to verify the function or employing an identity of some kind. An example of the latter would be: choose a random x, calculate SIN and COS, square both and add. The answer should be unity or very close. Often the function's inverse can be used to help validate the function itself. I.e. y = LN(x). Verify EXP(y) is close to x. It also helps to verify that y is not close to x if x is not close to e.

Finally, many functions are periodic or have a reflection. It is worth verifying this for a range of values too. E.g. SIN(x + 2 * PI) is close to SIN(x) for x not very small, or SIN(-x) is close to -SIN(x) for all x -- in fact SIN(-x) should be exactly -SIN(x) rather than close.
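
And a sketch of those identity-style checks with random arguments (again plain C doubles, a fixed seed and placeholder tolerances, just to show the shape; a 34S version would be a keystroke program using its own random number function):

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static int close_to(double got, double want, double tol) {
    return fabs(got - want) <= tol;
}

int main(void) {
    int ok = 1;
    srand(12345);                          /* fixed seed: reproducible runs */

    for (int i = 0; i < 1000; i++) {
        double x = (double)rand() / RAND_MAX * 20.0 - 10.0;  /* x in [-10, 10] */
        double s = sin(x), c = cos(x);

        /* sin^2 + cos^2 should be unity or very close. */
        ok &= close_to(s * s + c * c, 1.0, 1e-14);

        /* Reflection: sin(-x) should be exactly -sin(x), as claimed above. */
        ok &= (sin(-x) == -s);

        /* Periodicity: sin(x + 2*pi) close to sin(x); not exact,
           because 2*pi itself is rounded. */
        ok &= close_to(sin(x + 2.0 * acos(-1.0)), s, 1e-9);

        /* Inverse: exp(ln(y)) close to y for positive y. */
        double y = fabs(x) + 0.1;
        ok &= close_to(exp(log(y)), y, 1e-14 * y);
    }

    puts(ok ? "identity tests passed" : "identity tests FAILED");
    return !ok;
}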


So, yes this is a tremendous amount of work to do properly.


- Pauli


#48

Quote:
So, yes this is a tremendous amount of work to do properly.

After reading what you expect from such tests I would say:

Replace 'tremendous' with 'impossible' and I'm with you. ;-)

I'm sure it will be much easier to just check the correctness of your code (in C or XROM) against the mathematical sources you've used (which can certainly be considered correct) than to do all the tests you suggested.

With this method we could also find such errors/typos (like a missing [cmplx] or an overlooked '-' sign) in the formulas or algorithms of the WP34s code.

Franz


#49

We found some errors in the mathematical sources used, so even this approach isn't perfect. The one that comes most strongly to mind is the series for the gamma function. The quoted accuracies were incorrect & I didn't check originally -- as a result, our gamma function was a lot worse than we realised. Not enough to break single precision, but in double it was noticeable.


- Pauli


#50

Quote:
We found some errors in the mathematical sources used, so even this approach isn't perfect. The one that comes most strongly to mind is the series for the gamma function. The quoted accuracies were incorrect & I didn't check originally -- as a result, our gamma function was a lot worse than we realised. Not enough to break single precision, but in double it was noticeable.

Pauli, this is a classic example of the need to be cautious in trusting even so-called gold standards.

I surmise from the source code that with gamma you generated constants for a convergent Lanczos approximation as reported in a note by Paul Godfrey and discussed online by Viktor Toth. When I used to play in Matlab a few years back I was disappointed to find that Godfrey's calculations only worked as well as he reported for integer values of the function argument. For non-integer values one had to use a longer series than Godfrey computed in order to achieve double precision machine accuracy, and even then there were still rounding issues. I even wrote Godfrey about this way back then and didn't hear back.

Looking at the present state of the gamma code, it seems to me that you have come across the work of Glen Pugh, who did his PhD dissertation on the Lanczos approximation. In his work, Pugh gives guidance on how to choose a certain parameter so as to maximize accuracy for a given number of terms. He sent me an updated version of the relevant tables some time back, but I lost that email in a hard drive crash. I am really glad that in implementing Lanczos you seem to have come across Pugh's work, given that the approximation is all too often employed inefficiently. Those 39-digit constants take up a lot of bytes, and you want to use only as many as you need and no more.
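
For readers following along, the Lanczos approximation under discussion has the general shape (g is the free parameter Pugh's work addresses, N is the number of series terms, and the coefficients c_k depend on both):

\[
\Gamma(z+1) \approx \sqrt{2\pi}\,\Bigl(z + g + \tfrac{1}{2}\Bigr)^{z+\frac{1}{2}}
e^{-\left(z + g + \frac{1}{2}\right)}
\left[ c_0 + \sum_{k=1}^{N} \frac{c_k}{z+k} \right].
\]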

Les


#51

I originally started with Pugh's thesis. His table of constants and the purported accuracy of the Lanczos results using them aren't correct. I don't have the thesis to hand so can't cite page numbers. I also don't remember for sure whether the constants were at fault or the error analysis. Marcus did the bulk of the work on the modified version we're using; he might remember.


- Pauli


#52

Quote:
I originally started with Pugh's thesis. His table of constants and the purported accuracy of the Lanczos results using them aren't correct. I don't have the thesis to hand so can't cite page numbers. I also don't remember for sure whether the constants were at fault or the error analysis. Marcus did the bulk of the work on the modified version we're using; he might remember.

Pugh did revise and extend that table in later years. He sent it to me but I can't seem to find it. One thing I do remember is that one parameter that Lanczos himself designated as arbitrary in the original paper can be optimally computed so that one achieves the highest accuracy possible with the number of terms desired.

You may know that Peter Luschny, who has a delightful site compiling various ways to compute gamma and factorial, doesn't recommend Lanczos, but rather one of a number of divergent approximations combined, where necessary, with a Pochhammer shift when the real part of the argument is smaller.


#53

Quote:
You may know that Peter Luschny, who has a delightful site compiling various ways to compute gamma and factorial, doesn't recommend Lanczos, but rather one of a number of divergent approximations combined, where necessary, with a Pochhammer shift when the real part of the argument is smaller.

I wasn't aware of this.


- Pauli


#54

Quote:


I wasn't aware of this.



Here.

Please note that on this very busy page, further down, Luschny acknowledges that the series he reports are actually interim results that Lanczos reports in the original paper, not the final refinements that are ultimately used and reported in the various sources (Numerical Recipes, Toth, Godfrey, Pugh).

#55

I tried to implement Pugh's algorithm in Derive and I could not verify his results. It may be that his constants aren't copied correctly or that a higher interim precision is needed. I gave up on this and resorted to Victor's arbitrary precision implementation, which I used to determine the coefficients we are using now. I don't claim 34 digits of accuracy but only a little less. If anybody wants to improve on the algorithm or can even give a hint what's wrong with Pugh's tables (or our old implementation, see the sources to V2), I'm a taker. If we can use coefficients with no more than 34 digits, this would be a considerable space saver.


#56

Quote:
It may be that his constants aren't copied correctly or that a higher interim precision is needed.

This is an issue with the Lanczos approximation, as it is with the similar Spouge approximation. I think there are some issues related to the alternating nature of the series and the resulting subtraction error. Pugh ran all of his tests with something called MPJAVA, and I think he maintained very high levels of interim precision. I know that when I have played with these things in Maple and later Mathematica, high interim precision is needed.

Despite their flaws, the Lanczos and Spouge approximations do have the advantage of good accuracy for smaller argument values. Divergent approximations related to Stirling's classic improve with increasing argument value, requiring a Pochhammer shift for smaller arguments. I am sure that all of that extra computation can introduce a loss of accuracy due to accumulated rounding errors.
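
To spell the shift out for anyone unfamiliar with it: when x is too small for the asymptotic series, one evaluates at x+n and divides out the rising factorial,

\[
\Gamma(x) = \frac{\Gamma(x+n)}{x\,(x+1)\cdots(x+n-1)} = \frac{\Gamma(x+n)}{(x)_n},
\]

and each of those multiplications and the final division carries its own rounding error.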


#57

More importantly for us, Lanczos is very small on code space. The constants add a bit but the end result is still fairly small. Gamma is one function that is used a lot throughout the other mathematical functions and needs to be quick and good.


- Pauli

#58

Quote:
Often the function's inverse can be used to help validate the function itself. I.e. y = LN(x). Verify EXP(y) is close to x. It also helps to verify that y is not close to x if x is not close to e.

That's how I came across the errors above. I was taking the complex sine of a value and then taking the inverse complex sine, expecting to see the original value. When I got a different value, I investigated further and tested all the complex trig and hyperbolic functions.

Automated testing would be great to have, but putting together a comprehensive test suite is probably never going to be as much fun as coding a state-of-the-art calculator. The level of effort depends on what type of testing is to be done. Testing math functions to make sure that the error is never worse than a few ULPs is different to testing to make sure that a given keystroke sequence produces a given result.

For what it's worth, I compared the WP 34S results for complex trig and hyperbolic functions against an HP-15C, Free42 and Octave. For automated testing, a math package such as Octave already implements most (if not all) of the functions contained in the WP 34S and could also be used as a reference for validating the return values for random inputs to functions. It returns 16 digits of accuracy, which I believe is the number of digits returned on the WP 34S. It's free to use and there are no limits on the number of times you can use it to evaluate functions (unlike perhaps some of the online resources).

Thanks Pauli for implementing the fixes. I look forward to updating the software on my WP 34S. Given the scope of what you are doing, the level of effort required to put this together and the fact that there are so few active developers working on this in basically your spare time, I think it is pretty impressive that the platform is as stable and relatively bug-free as it is.

Best regards,

Eamonn.


#59

Have you found out about the DBLON command in CPX h MODE, the 'internal' catalogue? 16 digits isn't the end of it.


#60

Quote:
Have you found out about the DBLON command in CPX h MODE, the 'internal' catalogue? 16 digits isn't the end of it.

Are wrong/buggy results of the WP34s then also DBLwrong in this mode? ;-)

#61

Quote:

Are wrong/buggy results of the WP34s then also DBLwrong in this mode? ;-)


Yes. The calc's elementary math functions are computed internally to 39 digits and sometimes more; DBLON vs. DBLOFF only determines how many digits are returned to the stack and stored in registers.


#62

Quote:
Yes. The calc's elementary math functions are computed internally to 39 digits and sometimes more; DBLON vs. DBLOFF only determines how many digits are returned to the stack and stored in registers.

Damned - I thought DBLwrong=correct just as -(-a)=+a ;-)

Edited: 26 Apr 2012, 5:11 p.m.

#63

Yes, I did come across the DBLON/DBLOFF commands. However, the manual states that these are part of the command set for more advanced users, can ease some internal programming tasks, but require special care to use properly and are to be used at the user's own risk.

Is the intent to ultimately support double precision arithmetic for all the functions? If the calculations are being carried to 39 digits internally then it would seem that 34 digit precision should be achievable across the range of values to be input to the function. However, 16 digit precision is a lot easier to achieve and is likely sufficient for most users (excuse me if this was discussed previously).

Eamonn.


#64

Quote:
Is the intent to ultimately support double precision arithmetic for all the functions?

All functions can already accept double precision arguments and produce double precision results. However, we're not planning on making double precision a first class user type like single precision is. We're simply not going to be able to produce correct results in double precision in all cases. E.g. several functions are implemented in user code (xrom); these will never achieve 34 digits of accuracy.

I originally added double precision in support of the matrix code, where I couldn't hold all the values in our expanded format but needed better than single precision. From there it grew to include some of the statistical accumulations and keystroke programs to allow commands to be reimplemented in a space-saving manner -- keystroke programs are half or less the size of their C equivalents.


Despite our warnings etc, double precision mode works very well and is quite usable. Just don't trust all the digits in the results.


- Pauli


#65

Quote:
Despite our warnings etc, double precision mode works very well and is quite usable. Just don't trust all the digits in the results.

I just tried the double precision mode and it does seem to work very well. For the functions I tried (mostly trig, ln, log, etc.) double precision does seem to produce results accurate to 34 digits for typical input values. Quite remarkable IMHO.

Quote:
However, we're not planning on making double precision a first class user type like single precision is. We're simply not going to be able to produce correct results in double precision in all cases.

That makes sense. If you can achieve 16-decimal-digit performance for all the functions in the WP 34S then that is an impressive accomplishment. Using 39 internal digits is kind of a big-hammer approach to accomplishing this, but memory and processing resources are relatively cheap these days so it's a good way to go.

#66

The 39 digits came from the format of the decimal reals we're using. I needed 32 digits to handle complex multiplication correctly and the next natural sizes (i.e. fitting into even word boundaries) are 33 then 39 digits. 33 digits didn't leave enough guard digits for my liking so I chose 39. Fortunately really, because double precision has 34 digits and we've got a little headroom beyond that.

I've since increased the number of digits for some complex operations to allow double precision to have a chance of getting correct answers. I've also learned that I wouldn't have needed double the digits in the first place for the complex multiplication if I'd been less naive about the algorithm used.
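
One example of such a less naive approach is Kahan's compensated product difference, sketched below with binary doubles and C99 fma() purely as an illustration (the 34S works in decimal, and this isn't necessarily the exact algorithm referred to above). The fma() calls recover the rounding error of one product, so the cancellation in ac - bd doesn't eat working precision:

#include <math.h>
#include <stdio.h>

/* Returns a*b - c*d using Kahan's compensation. */
static double diff_of_products(double a, double b, double c, double d) {
    double w = c * d;
    double e = fma(-c, d, w);   /* e = w - c*d exactly (rounding error of w) */
    double f = fma(a, b, -w);   /* f = a*b - w, rounded once */
    return f + e;               /* (a*b - w) + (w - c*d) = a*b - c*d */
}

/* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
static void cmul(double a, double b, double c, double d,
                 double *re, double *im) {
    *re = diff_of_products(a, c, b, d);    /* a*c - b*d */
    *im = diff_of_products(a, d, -b, c);   /* a*d - (-b)*c = a*d + b*c */
}

int main(void) {
    double re, im;
    /* (1e8+1 + 1e8 i) * (1e8-1 + 1e8 i): the naive ac - bd rounds to 0,
       while the compensated real part is exactly -1. */
    cmul(1e8 + 1.0, 1e8, 1e8 - 1.0, 1e8, &re, &im);
    printf("%.17g %+.17gi\n", re, im);
    return 0;
}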


- Pauli

#67

Quote:
For automated testing, a math package such as Octave already implements most (if not all) of the functions contained in the WP 34S and could also be used as a reference for validating the return values for random inputs to functions. It returns 16 digits of accuracy, which I believe is the number of digits returned on the WP 34S.

Octave isn't sufficient. It uses binary floats; we use decimal. Generally binary floats carry more precision, but not always -- decimal floats are less granular and actually have a wider exponent range. Rounding is also different.

It might be possible to compare our single precision against a long double implementation. And certainly against one of the correctly rounded packages.

Anyway, I wasn't proposing testing for accuracy down to the last digit; I was more suggesting gross correctness tests. Random test cases don't do the former, and neither do exhaustive tests unless they really do cover each and every possible value. Validating accuracy requires much deeper analysis of the algorithms and functions involved.

Quote:
Thanks Pauli for implementing the fixes.

Thank you for the bug report that included the code changes required. This makes my life much easier.


- Pauli

#68

- could a test program retrieve comparison values from Wolfram Alpha, so that no tables would be needed?

- could this be done in some kind of 'crowd computing' effort, like SETI? Every forum member's WIN/Mac/Linux machine just going through a tiny portion of all possible values.

Edited: 26 Apr 2012, 8:20 a.m.


#69

Alexander, from what you wrote yesterday

Quote:
But 99% of WP-34s`s users only use +,-,/ and * in 99% of the time. ;-)

I would say it's enough to test those 4 basic arithmetic operations. ;-)

But having in mind the HP-35S buglist I would say we should definitely not use this calculator as a reference model. ;-)

Edited: 26 Apr 2012, 8:41 a.m.

#70

Quote:
- could a test program retrieve comparison values from Wolfram Alpha, so that no tables would be needed?

Yes it could. But for free, only up to 2000 values a month.

Quote:
- could this be done in some kind of 'crowd computing' effort, like SETI? Every forum member's WIN/Mac/Linux machine just going through a tiny portion of all possible values.

Sure, but I'm not sure there is a real need for that.

Anyway, the real work is to describe the tests we want to do in any kind of "computer language", even one we can invent.

Once this is done, doing the tests should be easy.

#71

Quote:
- could a test program retrieve comparison values from Wolfram Alpha, so that no tables would be needed?

- could this be done in some kind of 'crowd computing' effort, like SETI? Every forum member's WIN/Mac/Linux machine just going through a tiny portion of all possible values.


Running the test cases is not the hard part. The hard part is determining what should be tested (coverage), building the framework and then writing those test cases.

That last bit is something that could be farmed out to the community, but my guess is it would fail because:

  • I tried it with the on-line users' manual and, out of the whole community, only Steve and I ever wrote anything.
  • "99% of WP-34s`s users only use +,-,/ and * in 99% of the time. ;-)" - so that leaves 1% who would care!
:D

#72

It seems to me that using the inverse functions would get a test suite up and running relatively quickly.

Given an input range for each function, run through a set of functions and their inverses and see if the output matches the input.

Assuming most function implementations don't depend on their inverses' implementations, it would be better than no tests, and it should catch something like these errors.

It is something I'd want to do in the emulator, though.
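
Something along these lines, sketched in C with libm standing in for the 34S functions (the ranges and tolerances are placeholders):

#include <math.h>
#include <stdio.h>

/* Round-trip test: inv(fn(x)) should land back near x over a sensible
   input range.  Plain C doubles stand in for the 34S functions here. */
struct round_trip {
    const char *name;
    double (*fn)(double);
    double (*inv)(double);
    double lo, hi;          /* input range to sweep */
    double tol;             /* mixed absolute/relative tolerance */
};

int main(void) {
    const struct round_trip tests[] = {
        { "exp/ln",     exp,  log,   -20.0, 20.0, 1e-13 },
        { "sin/asin",   sin,  asin,   -1.5,  1.5, 1e-12 },
        { "tan/atan",   tan,  atan,   -1.5,  1.5, 1e-12 },
        { "sinh/asinh", sinh, asinh, -20.0, 20.0, 1e-13 },
    };
    int failures = 0;

    for (size_t t = 0; t < sizeof tests / sizeof tests[0]; t++) {
        const struct round_trip *r = &tests[t];
        for (int i = 0; i <= 1000; i++) {
            double x = r->lo + (r->hi - r->lo) * i / 1000.0;
            double y = r->inv(r->fn(x));
            if (fabs(y - x) > r->tol * (fabs(x) + 1.0)) {
                printf("FAIL %s at x=%.17g: got %.17g\n", r->name, x, y);
                failures++;
            }
        }
    }
    printf("%d failure(s)\n", failures);
    return failures != 0;
}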

