[WP34s] Parallel function - Printable Version - HP Forums (https://archived.hpcalc.org/museumforum), Old HP Forum Archives, thread /thread-233575.html
[WP34s] Parallel function - Dieter - 11-04-2012 As far as I can see, the current implementation of the 34s parallel function (cf. sourceforge.net) works this way: x*y / (x+y)
This may lead to overflow as soon as the product in the numerator becomes too large, even if the result falls within the working range, which it will for (almost) any positive x, y. The current code also checks whether x*y is zero in order to test if either x or y (or both) are zero -- in this case the returned result is zero as well. For very small values of x and/or y this may lead to underflow, returning a plain zero where the true result actually is very small, but not zero.
Example: x y 34s || true
There are numerous ways to compute the parallel function, and there usually is one best way to do it for a certain combination of the two arguments. Not all, but most overflow/underflow problems can be solved if the function worked this way: 1 / (1/x + 1/y)
That's why I would suggest an update of the parallel function. The code above is just a first example -- I am sure it can be done better (e.g. without slight rounding errors in the last digit). What do you think?
Dieter
Edited: 4 Nov 2012, 6:42 a.m.
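The overflow and underflow cases described in this post can be sketched in Python. This is only an illustration under stated assumptions: DP mode is modelled as a 34-digit decimal context with exponents up to 1E+6144, and the names `DP`, `par_product` and `par_reciprocal` are hypothetical, not taken from the 34s source.

```python
from decimal import Context, Decimal

# Assumption: DP mode modelled as a decimal128-like context. traps=[] makes
# overflow return Infinity and underflow return zero instead of raising.
DP = Context(prec=34, Emax=6144, Emin=-6143, traps=[])

def par_product(x, y, ctx=DP):
    """x*y / (x+y): the intermediate product may overflow or underflow."""
    p = ctx.multiply(x, y)
    if p == 0:  # also true when the product merely underflowed to zero
        return Decimal(0)
    return ctx.divide(p, ctx.add(x, y))

def par_reciprocal(x, y, ctx=DP):
    """1 / (1/x + 1/y), with zero arguments tested directly."""
    if x == 0 or y == 0:
        return Decimal(0)
    return ctx.divide(1, ctx.add(ctx.divide(1, x), ctx.divide(1, y)))

big, tiny = Decimal("1E4000"), Decimal("1E-4000")
print(par_product(big, big), par_reciprocal(big, big))      # product form overflows
print(par_product(tiny, tiny), par_reciprocal(tiny, tiny))  # product form returns zero
```

The reciprocal form has its own limits: 1/x can overflow when x is extremely small, so it fixes most cases, not all.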
Re: [WP34s] Parallel function - Marcus von Cube, Germany - 11-04-2012 The problem with this implementation is that none of the arguments may be zero (which is valid and should return zero).
Re: [WP34s] Parallel function - Dieter - 11-04-2012 Please take a look at the suggested code, especially line 002/003 and 006/007. Of course any of the two arguments may be zero -- in this case zero is returned. That's the same behaviour as in the current implementation and how you say it is supposed to work ("should return zero").
Dieter
Re: [WP34s] Parallel function - fhub - 11-04-2012 Yes, it returns zero, but the stack (and LastX) is not correctly set.
Franz
Re: [WP34s] Parallel function - Dieter - 11-04-2012 The suggested code is not intended to replace the current implementation literally. It's a suggestion for a different way to compute the parallel function, here in the way it could be done in a standard 34s user program. An XROM routine with a similar approach would use the respective standard commands on entry (XIN DYADIC). Once again: the idea here is just the suggestion of a different way to compute the parallel function, thus overcoming the current limitations. Maybe (probably) there is an even better solution - what's your idea here ?-)
Dieter
Re: [WP34s] Parallel function - Walter B - 11-04-2012 Franz, an XROM routine following the line Dieter suggested would return a proper stack and Lastx AFAIUI.
Re: [WP34s] Parallel function - Marcus von Cube, Germany - 11-04-2012 Agreed, I only looked at the formula, not the suggested implementation.
I'm pretty sure that Pauli had a specific reason to implement || the way it is now. XROM executes in double precision with an extended exponent range so exponent overflow isn't really an issue here.
Re: [WP34s] Parallel function - Dieter - 11-04-2012 Marcus, regarding your point that exponent overflow isn't really an issue in XROM: that's only true if the 34s is set to standard precision. Yes, for XROM routines in double precision the working range exceeds 1E+6000, which is more than enough to handle every input in standard precision, i.e. up to 1E+384. However, the user may just as well have the device set to DP, et voilà...
Please take a look at the two examples I posted: values like 1E+4000 can easily be entered in DP mode - and will lead to overflow or underflow, respectively, just as shown there.
[DBLON]
Dieter
Re: [WP34s] Parallel function - Paul Dale - 11-04-2012 When I implemented these two in xrom, I didn't consider the edge cases in double precision -- the code as is will work fine for single precision, which was my main concern at the time. We've always said double precision accuracy is not guaranteed. This is still the case. Double precision was intended for internal implementation and exposed at the request of this community. The previous C version would have worked in double precision too -- it has a much larger exponent range again. Still, I don't mind improving our double precision performance if the cost is low, which it seems to be here.
Re: [WP34s] Parallel function - Marcus von Cube, Germany - 11-04-2012 I doubt that 10^4000 is a meaningful physical dimension... Exponents up to 999 can be entered directly in DP and they will not lead to an overflow here.
I'm still not sure why Pauli decided to use the present algorithm but I assume he had a cancellation issue in mind.
Re: [WP34s] Parallel function - Dieter - 11-04-2012 As usual, it's a tradeoff: the current method is potentially a bit more accurate, at least in the last digit. Try x = y = 15 and the method I posted will be +2 ULP off (7.500....002 instead of 7.5). So the best possible implementation would use (at least) two different ways to compute the result, for instance depending on x*y returning "infinity" or not. ;-)
Dieter
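The x = y = 15 case can be reproduced with a short Python sketch. It assumes standard precision behaves like correctly rounded 16-digit decimal arithmetic; the context name `SP` is illustrative only.

```python
from decimal import Context, Decimal

# Assumption: standard precision modelled as 16 significant digits,
# every single operation rounded half-even.
SP = Context(prec=16, traps=[])

x = y = Decimal(15)

# x*y / (x+y): 225 / 30 is exact.
by_product = SP.divide(SP.multiply(x, y), SP.add(x, y))

# 1 / (1/x + 1/y): 1/15 is rounded, the sum is rounded again,
# and the final reciprocal amplifies the accumulated error.
by_reciprocal = SP.divide(1, SP.add(SP.divide(1, x), SP.divide(1, y)))

print(by_product)      # 7.5
print(by_reciprocal)   # 7.500000000000002
```

The same pattern appears at 34 digits, since the rounding of 1/15 and of the sum happens in the last digit regardless of the precision.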
Re: [WP34s] Parallel function - Paul Dale - 11-04-2012 It would also be necessary to check x*y underflowing to zero.
Re: [WP34s] Parallel function - Paul Dale - 11-04-2012 Performance actually -- x*y / (x+y) is faster to calculate than 1 / (1/x+1/y) and given that full accuracy in double precision wasn't a requirement, I chose the former.
Re: [WP34s] Parallel function - Dieter - 11-04-2012 Ok, what about this one:
001 LBL"PAR"
Just a thought, without any further testing. ;-)
Dieter
Re: [WP34s] Parallel function - Paul Dale - 11-04-2012 It is worse than needing a comparison against zero -- if x*y goes subnormal it is possible to have as few as one significant digit in the product -- better to switch formulas before this point I think. It is starting to seem not worth the effort...
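The loss of digits in a subnormal product can be illustrated in Python. The sketch assumes DP mode is a 34-digit decimal context with Emin = -6143; the input values are made up for the demonstration.

```python
from decimal import Context, Decimal

# Assumption: DP mode modelled as a decimal128-like context.
DP = Context(prec=34, Emax=6144, Emin=-6143, traps=[])

x = Decimal("1.234567E-3100")
y = Decimal("1.234567E-3076")

# The exact product is about 1.524E-6176, but the smallest representable
# exponent is Etiny, so only one digit of the product survives.
product = DP.multiply(x, y)
print(DP.Etiny())   # -6176
print(product)      # 2E-6176

# The product is nonzero, so a test against zero would not catch this,
# yet the quotient is far from the true result (which is close to x).
naive = DP.divide(product, DP.add(x, y))
print(naive)
```

Here the product form returns a value roughly 30% too large without any zero or infinity appearing along the way.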
Re: [WP34s] Parallel function - Werner - 11-05-2012 Or, the best of both worlds:
*LBL"PAR"
(code 42S-style, as I don't own a 34S. Sadly, perhaps ;-)
Re: [WP34s] Parallel function - Paul Dale - 11-05-2012 This would suffer the same problems as my original code -- it uses the same formula. 42S style isn't far removed from 34S -- the 42S was a guiding light for the project in many ways.
Re: [WP34s] Parallel function - Werner - 11-05-2012 No, it doesn't. Try it.
Werner
Perhaps too short a reply: it does use the same formula, but with a different order of operations, and it avoids overflow. Try it.
Edited: 5 Nov 2012, 2:52 a.m.
Re: [WP34s] Parallel function - Paul Dale - 11-05-2012 Yeah, it avoids the overflow, not thinking straight at the moment. It probably needs a test for x = -y to produce a -infinity result but this is easy. Thanks.
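Werner's listing is not preserved in this archive, so the following Python sketch only shows one possible reordering of the same formula, x / (x+y) * y, which never forms the product x*y. The context `DP` and the name `par_reordered` are assumptions made for illustration.

```python
from decimal import Context, Decimal

# Assumption: DP mode modelled as a decimal128-like context; with traps=[]
# a division by zero returns a signed Infinity instead of raising.
DP = Context(prec=34, Emax=6144, Emin=-6143, traps=[])

def par_reordered(x, y, ctx=DP):
    """x / (x + y) * y: same formula as x*y / (x+y), different order."""
    if x == 0 or y == 0:
        return Decimal(0)
    return ctx.multiply(ctx.divide(x, ctx.add(x, y)), y)

big, tiny = Decimal("1E4000"), Decimal("1E-4000")
print(par_reordered(big, big))                   # no overflow
print(par_reordered(tiny, tiny))                 # no underflow to zero
print(par_reordered(Decimal(15), Decimal(15)))   # 7.5, no ULP error here
print(par_reordered(Decimal(5), Decimal(-5)))    # -Infinity
```

In this sketch the x = -y case yields -Infinity without an explicit test, because the sum is +0 and the signs of x and y differ; a calculator implementation may well need the explicit check mentioned above.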
Re: [WP34s] Parallel function - Dieter - 11-05-2012 Great - I knew there was a better way to do this. ;-) However, the code may behave differently depending on the order of the two arguments, i.e. whether X or Y is the larger value. Since par(x,y) = par(y,x) I think this should be avoided. I am not quite sure, but maybe the code works best if X is the larger value (avoids underflow here and there). A simple x<y? x<>y could do the trick. At least it returns consistent results for par(x,y) and par(y,x).
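The dependence on argument order can be shown with a small Python sketch. It assumes a reordered formula of the form a / (x+y) * b (Werner's actual listing is not preserved here) and a 34-digit DP context; `par_unordered` and `par_ordered` are hypothetical names.

```python
from decimal import Context, Decimal

# Assumption: DP mode modelled as a decimal128-like context.
DP = Context(prec=34, Emax=6144, Emin=-6143, traps=[])

def par_unordered(x, y, ctx=DP):
    """x / (x + y) * y, taking the arguments in the order given."""
    if x == 0 or y == 0:
        return Decimal(0)
    return ctx.multiply(ctx.divide(x, ctx.add(x, y)), y)

def par_ordered(x, y, ctx=DP):
    """Divide the larger magnitude by the sum, then multiply by the smaller."""
    if x == 0 or y == 0:
        return Decimal(0)
    small, large = sorted((x, y), key=abs)
    return ctx.multiply(ctx.divide(large, ctx.add(x, y)), small)

a, b = Decimal("1E-6000"), Decimal("1E6000")
print(par_unordered(a, b), par_unordered(b, a))  # results differ
print(par_ordered(a, b), par_ordered(b, a))      # results agree
```

With the smaller value divided first, the quotient underflows to zero; swapping by magnitude keeps the quotient near 1 and makes par(x,y) and par(y,x) agree.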
BTW - during these tests with very large numbers like 10^6000 I noticed a quite ..."special behaviour" of the 10^x function. Usually integer arguments should return exact powers of ten, but somewhere there is a point where - in DP mode - this is no longer guaranteed: 500 [10^x] => 1.00000...0 E+500
Dieter
Re: [WP34s] Parallel function - Walter B - 11-05-2012 I feared such academic problems would arise. We should have left DP closed to the public - but we didn't. Perhaps we shall rethink that decision ...
Re: [WP34s] Parallel function - Marcus von Cube, Germany - 11-05-2012 We do not special-case 10^integer. The error you mention is 2 ULPs. I think we can live with that.
Re: [WP34s] Parallel function - Dieter - 11-08-2012 Let's get back to the facts.
Dieter
Re: [WP34s] Parallel function - Paul Dale - 11-08-2012 You're correct, I'm not using the builtin log2 and log10 constants for 2^x and 10^x :-( I missed this optimisation unfortunately. I do use them for log2 and log10 at least. Now to see if the code can be easily changed to pass in the log of base....
Re: [WP34s] Parallel function - Paul Dale - 11-08-2012 The next build will include this optimisation :)
- Pauli
Re: [WP34s] Parallel function - Walter B - 11-09-2012 Thanks for the details. As you see, this detailed report triggered some improvement, especially points 3 and 4 :-) No doubt the original error report wasn't the problem, but it wasn't leading to a solution yet - the second report was far better in that aspect. Thanks again.
BTW, DP mode is covered in an appendix of the manual - no easy way to hide it even more ;-)