Posts: 1,368
Threads: 212
Joined: Dec 2006
Greetings all,
In my quest for a slick, compact, and as precise as possible way to estimate the upper tail cumulative normal distribution (which is simply the complementary error function with linear scalings of the x and y axes), I have become preoccupied with the routine erfcc() in Section 6.2 of Numerical Recipes:
#include <math.h>
float erfcc(float x)
//Returns the complementary error function erfc(x) with fractional
//error everywhere less than 1.2 × 10^-7.
{
float t,z,ans;
z=fabs(x);
t=1.0/(1.0+0.5*z);
ans=t*exp(-z*z-1.26551223+t*(1.00002368+t*(0.37409196+t*(0.09678418+
t*(-0.18628806+t*(0.27886807+t*(-1.13520398+t*(1.48851587+
t*(-0.82215223+t*0.17087277)))))))));
return x >= 0.0 ? ans : 2.0-ans;
}
The authors say little about the routine's origins apart from noting that it is "based on Chebyshev fitting to an inspired guess as to the functional form." Save that, I can't seem to find it in any of the references cited in the chapter that I have ready access to: it isn't in Hastings, or Abramowitz and Stegun, or Acton, for example, at least not that I can find.
I am intrigued, since the relative error is not bad given that the constants in the polynomial are only single precision. The mapping of the domain onto the bounded interval (0,1] is particularly helpful, I suspect, in attaining the goodness of fit that there is.
I can't help but wonder whether one could do a little better by redoing the approximation with double precision constants (lopping off a few digits when implementing the result in a 10- or 12-digit calc) or extending the polynomial by a term or two.
Does anyone know of the origin of this approximation? Is it an NR original whose derivation is not given in the book, or is it from some other source that the NR team hasn't explicitly referenced? I would like to work with it more, and just trying to do a blind fit in curve-fitting software is getting me nowhere fast: I can't even replicate the original fit, never mind extend it.
Grateful for wisdom in my quest for this grail. Is there a numerical analysis Professor Langdon in the group?
Les
Posts: 1,755
Threads: 112
Joined: Jan 2005
Hi, Les:
Les posted:
"I can't help but wonder that one could do a little better by redoing the approximation with double precision constants (lopping off a few digits when implementing
the result in a 10 or 12 digit calc) ..."
That wouldn't do, since the constants are given to 9 places in the formula and the alleged maximum error is 1.2E-7. Using double precision constants would do nothing to significantly improve the error.
"... or extending the polynomial by a term or two."
That would definitely help, in which case the error might diminish to a point (say, 1.2E-11) where 12-digit constants would indeed be needed to realize it in full.
If I were you, I'd do the following:
- Using some adequate mathematical package or built-in function (INTEG if nothing else), I would find, to 12 correct decimal places, the values to fit polynomially, which would be the values of the function proper after 'taking out' the domain mapping and the "inspired" functional form.
That would leave you with a table of values (or an adequate user-defined function, if you can do it all in your particular HP model) which would correspond to the polynomial part in the original formula, i.e., the P(x) = -1.26551223+t*(1.00002368+t*(... part.
- Then, I would simply do a true minimax polynomial fit (not an approximate one), of whatever desired degree, to 12 decimal places. I would start by doing a minimax fit of the very same degree (9th) as the present P(x), to see if the minimax fit does indeed replicate its coefficients, which would be proof enough of being on the right track. If it does *not* replicate the coefficients, I would test the new polynomial to find its maximum error and compare against the 1.2E-7 alleged error for the original one. It should be equal or smaller. In any case, I would perform an automatic fit, to see what the maximum degree is that can be profitably used, i.e., the degree up to which you're fitting the actual data and not rounding errors.
This would also provide a table of the minimax error for degrees 9, 10, 11, 12, .... Your present polynomial is of degree 9. I think that degree 10 would perhaps achieve an error like 1E-9 and degree 11 would achieve 1E-11, rounding errors notwithstanding, but that's only an educated guess.
Try it and tell us how it went.
Best regards from V.
Posts: 1,368
Threads: 212
Joined: Dec 2006
Thanks!
Another forum contributor directed me to an older version of SPSS TableCurve2D, which is impressively diverse and fast given that it is over a decade old. Regrettably, user-defined functions are restricted to a maximum of 10 parameters, and the NR approximation that intrigues me is at that limit already.
I should share a little more about what I like about the NR routine and why I want to improve on it if I can. For esoteric reasons, I am preoccupied with accuracy and, hopefully, precision, in the upper tail of the distribution, where the function values get quite tiny. The reported 1.2e-7 relative error does indeed seem to hold when I plot the error curve in Maple, and the fact that it is a relative error, not an absolute one, really assists the accuracy for high values of the argument.
For example, on my HP48G, my implementation of the NR exponentiated polynomial approximation gives erfcc(33.82) = 3.01532436445E-499, whereas 0 1 33.82 SQRT * UTPN 2 * gives 3.01532450586E-499. This actually impresses and intrigues me greatly. The Hastings approximations quoted in Abramowitz and Stegun boast respectable absolute errors, but in the upper tail this is meaningless: tiny minus tiny is still tiny in absolute terms, yet it can be huge in proportional terms when you calculate tiny divided by tiny and get something not tiny at all.
For example, 26.2.17 in Abramowitz and Stegun computes the cumulative normal distribution with an absolute max error of 7.5E-8. Using the complement of this to estimate the upper tail probability on my HP48G, I get roughly 2.12E-482 for input of 47, compared with roughly 1.78E-482 for 0 1 47 UTPN. (This latter value matches up with the results in Maple and Mathematica as well as the relevant continued fraction expansion out to a few terms, so I do trust it.) Right order of magnitude, and tiny absolute error, but the relative error is an abysmal 1.9E-1. Abramowitz and Stegun 26.2.17 does even more sadly in that upper tail, from what I understand from Rodger Rosenbaum, who has done the number crunching.
Thanks for your feedback. I am learning that I need to learn a bit more about minimax curve fitting and Chebyshev fits using transformed inputs. I am no computer scientist or mathematician, so this can take a while. At least I will have fun in the process.
Cheers,
Les
Posts: 1,368
Threads: 212
Joined: Dec 2006
Quote:
0 1 33.82 SQRT * UTPN 2 *
This should read 0 1 33.82 2 SQRT * UTPN 2 * .
erfc is derived from the standard (i.e., mean = 0, variance = 1) upper-tail normal probability by scaling the input by a factor of sqrt(2) and then doubling the result. Of course, if you are interested in these functions and are following this thread you already know that, so forgive my pedantry :)
Sorry if the typo created any confusion.
Les
Posts: 1,368
Threads: 212
Joined: Dec 2006
Quote:
Abramowitz and Stegun 26.2.17 does even more sadly in that upper tail
Another egregious typo. I meant 26.2.19.
Les
Posts: 776
Threads: 25
Joined: Jun 2007
Les,
re: "For esoteric reasons, I am preoccupied with accuracy and, hopefully, precision, in the upper tail of the distribution, where the function values get quite tiny."
Can you tell us the esoteric reasons? If this is for mathematical curiosity, I can understand it.
If it has anything to do with actual probabilities, then I don't think you need to worry about it. Anything with a probability like 1E-482 (which I recall from a previous part of this discussion) has not happened, and will not happen, during the lifetime of the universe!
Am I missing something here?
Posts: 1,368
Threads: 212
Joined: Dec 2006
You are missing nothing. "Esoteric reasons" = "mathematical curiosity" and nothing more. Completely impractical intellectual dalliance with no envisioned practical implications whatsoever. :)
Posts: 120
Threads: 9
Joined: Aug 2005
Yes, but it could still happen tomorrow if its probability is not zero.