Summary: Normal CDF and quantile function  Dieter  05122012
After all these discussions on the Normal CDF and quantile function it might be useful to sum up the essential points.
 The Normal CDF is a key function and should be returned with maximum possible accuracy, in single precision as well as in in double precision mode. Also the function should work as fast as possible  this is an essential point if the quantile is calculated as a user code XROM function which has to call the CDF at least two or three times. So the Normal CDF should be coded in C with 39digit precision to return 34 valid digits (in DP mode) to the user.
 The CDF uses two different calculation methods: a series expansion for the central region and a continued fraction method in the tails. The split between both methods is at x = 2,326. The series expansion is best evaluated from right to left, so the number of requried terms has to be known in advance. This is no problem since the number of iterations for x > 2,326 can be obtained very precisely this way:
Full precision: n >= 2200/x^2 + 20 // rel. error < 1E39
Single precision: n >= 470/x^2 + 11 // rel. error < 1E18
Or, if this looks better:
Single precision: n >= max(130/x, 11) // rel. error < 1E18
 The CDF will run substantially faster if it is evaluated only up to the required accuracy. There is no need for 39 digits if the result in a manual calculation or in a standard precision user program has only 16 digits. The continued fraction method may simply calculate the number or required terms differently (cf. single precision formula above). But also the series expansion in the center may finish as soon as the last two iterations agree in 16 or maybe 17...18 digits. This will speed up the execution by a factor of almost two.
Example: x = 1 requires just 16 loops for SP vs. 30 for DP. At x = 2 it is 25 vs. 43.
 The quantile function may be written in user code (XROM). With careful programming an accuracy of roughly +/ 1 unit in the 33th digit seems possible. If 34+ digits are desired, the only way is a Ccoded function based on 39 digits internal precision.
 This accuracy level  up to 33 digits  is possible if there is a way to get the expression CDF(x)  p, that is used within the iterative refinement routine, evaluated with sufficient precision even if p is very close to 0.5 and thus x close to 0. The standard approach will eventually return zero even if the result is not yet sufficiently exact.
This problem can be solved with a helper function that can easily be derived from the current (Ccode) CDF routine. With 39 digits we should be on the safe side. This helper function may directly return the value of (CDF(x)  p)/PDF(x). This will yield a few more digits than the current solution that returns two values a and b to the stack, each limited to 34 digits.
 There are other solutions for the same problem (p close to 0.5). One of them is the Chisquare quantile function that may be applied after a simple variable transformation (pp = 12p) as suggested by Les. This can also be done for the Student quantile, using the inverse F with df1 = 1.
 I have not checked the latest firmware builds yet, but from what I hear there seems to be a problem with the Normal quantile function suffering from significant loss in accuracy (less than 20 digits in DP mode). Pauli, can you check this please? Honestly, I have no idea where exactly to look for the code in order to check for any errors. ;)
Dieter
Re: Summary: Normal CDF and quantile function  Paul Dale  05122012
Quote: I have not checked the latest firmware builds yet, but from what I hear there seems to be a problem with the Normal quantile function suffering from significant loss in accuracy (less than 20 digits in DP mode). Pauli, can you check this please? Honestly, I have no idea where exactly to look for the code in order to check for any errors. ;)
I noticed this too. Relative errors of something like 1e10 or less were cropping up :(
I'm using the incomplete gamma functions for the normal CDF at the moment  this frees up three pages of flash :) for a speed penalty in the QF :(
 Pauli
Re: Summary: Normal CDF and quantile function  Les Wright  05122012
Quote:
I'm using the incomplete gamma functions for the normal CDF at the moment  this frees up three pages of flash :) for a speed penalty in the QF :(
An inefficiency of using the IGAMMAs repeatedly to compute the the CDF for the QF is that they rely on regularized incomplete gammas, so the regularizing denominator in this case, which never changes from GAMMA(0.5) = sqrt(Pi), is computed afresh every single time. as opposed to being passed to the internal gser and gcf as a precomputed constant. This could be a bit of time suck, as well as a possible source of rounding error.
My own usercode incomplete gammas, left and right, are not regularized, so in computing the normal CDF and upper tail I divide through by sqrt(Pi). Using a gamma function to compute sqrt(Pi) is obviously not efficient.
Of course, these concerns don't apply to the chisquare, where there regularizing denominator varies according to the number of df, and needs to be calculated. But still, if the chisquare CDF or upper tail is computed repeatedly in refining the inverse quantile, does that denominator of GAMMA(df/2), which does not vary, need to be computed anew in each loop?
Just thinking aloud here. The usual speedspaceaccuracy tradeoff...
Edited: 13 May 2012, 2:07 a.m.
Re: Summary: Normal CDF and quantile function  Paul Dale  05132012
The (nonregularised) incomplete gamma function will have to compute the regularising denominator half the time to switch sides accurately  so yes it is a time sink always doing so, but not necessarily a source of error.
 Pauli
