Summary: Normal CDF and quantile function
#1

After all these discussions on the Normal CDF and quantile function it might be useful to sum up the essential points.

  • The Normal CDF is a key function and should be returned with maximum possible accuracy, in single precision as well as in in double precision mode. Also the function should work as fast as possible - this is an essential point if the quantile is calculated as a user code XROM function which has to call the CDF at least two or three times. So the Normal CDF should be coded in C with 39-digit precision to return 34 valid digits (in DP mode) to the user.

  • The CDF uses two different calculation methods: a series expansion for the central region and a continued fraction method in the tails. The split between both methods is at x = 2,326. The series expansion is best evaluated from right to left, so the number of requried terms has to be known in advance. This is no problem since the number of iterations for x > 2,326 can be obtained very precisely this way:
      Full precision:   n >= 2200/x^2 + 20   // rel. error < 1E-39
    Single precision: n >= 470/x^2 + 11 // rel. error < 1E-18
    Or, if this looks better:
      Single precision: n >= max(130/x, 11)  // rel. error < 1E-18

  • The CDF will run substantially faster if it is evaluated only up to the required accuracy. There is no need for 39 digits if the result in a manual calculation or in a standard precision user program has only 16 digits. The continued fraction method may simply calculate the number or required terms differently (cf. single precision formula above). But also the series expansion in the center may finish as soon as the last two iterations agree in 16 or maybe 17...18 digits. This will speed up the execution by a factor of almost two.

    Example: x = 1 requires just 16 loops for SP vs. 30 for DP. At x = 2 it is 25 vs. 43.

  • The quantile function may be written in user code (XROM). With careful programming an accuracy of roughly +/- 1 unit in the 33th digit seems possible. If 34+ digits are desired, the only way is a C-coded function based on 39 digits internal precision.

  • This accuracy level - up to 33 digits - is possible if there is a way to get the expression CDF(x) - p, that is used within the iterative refinement routine, evaluated with sufficient precision even if p is very close to 0.5 and thus x close to 0. The standard approach will eventually return zero even if the result is not yet sufficiently exact.

    This problem can be solved with a helper function that can easily be derived from the current (C-code) CDF routine. With 39 digits we should be on the safe side. This helper function may directly return the value of (CDF(x) - p)/PDF(x). This will yield a few more digits than the current solution that returns two values a and b to the stack, each limited to 34 digits.

  • There are other solutions for the same problem (p close to 0.5). One of them is the Chi-square quantile function that may be applied after a simple variable transformation (pp = 1-2p) as suggested by Les. This can also be done for the Student quantile, using the inverse F with df1 = 1.

  • I have not checked the latest firmware builds yet, but from what I hear there seems to be a problem with the Normal quantile function suffering from significant loss in accuracy (less than 20 digits in DP mode). Pauli, can you check this please? Honestly, I have no idea where exactly to look for the code in order to check for any errors. ;-)

Dieter

#2

Quote:
I have not checked the latest firmware builds yet, but from what I hear there seems to be a problem with the Normal quantile function suffering from significant loss in accuracy (less than 20 digits in DP mode). Pauli, can you check this please? Honestly, I have no idea where exactly to look for the code in order to check for any errors. ;-)

I noticed this too. Relative errors of something like 1e-10 or less were cropping up :-(

I'm using the incomplete gamma functions for the normal CDF at the moment -- this frees up three pages of flash :-) for a speed penalty in the QF :-(


- Pauli

#3

Quote:

I'm using the incomplete gamma functions for the normal CDF at the moment -- this frees up three pages of flash :-) for a speed penalty in the QF :-(


An inefficiency of using the IGAMMAs repeatedly to compute the the CDF for the QF is that they rely on regularized incomplete gammas, so the regularizing denominator in this case, which never changes from GAMMA(0.5) = sqrt(Pi), is computed afresh every single time. as opposed to being passed to the internal gser and gcf as a precomputed constant. This could be a bit of time suck, as well as a possible source of rounding error.

My own user-code incomplete gammas, left and right, are not regularized, so in computing the normal CDF and upper tail I divide through by sqrt(Pi). Using a gamma function to compute sqrt(Pi) is obviously not efficient.

Of course, these concerns don't apply to the chi-square, where there regularizing denominator varies according to the number of df, and needs to be calculated. But still, if the chi-square CDF or upper tail is computed repeatedly in refining the inverse quantile, does that denominator of GAMMA(df/2), which does not vary, need to be computed anew in each loop?

Just thinking aloud here. The usual speed-space-accuracy tradeoff...


Edited: 13 May 2012, 2:07 a.m.

#4

The (non-regularised) incomplete gamma function will have to compute the regularising denominator half the time to switch sides accurately -- so yes it is a time sink always doing so, but not necessarily a source of error.

- Pauli



Possibly Related Threads...
Thread Author Replies Views Last Post
  HP50g: Writing a function that returns a function Chris de Castro 2 712 12-10-2013, 06:49 PM
Last Post: Han
  Yet another Normal quantile function for the 34s Dieter 13 1,093 08-06-2012, 09:15 PM
Last Post: Paul Dale
  [WP34S] Curious Bug in Inverse Normal Function Les Wright 61 3,346 05-01-2012, 02:44 AM
Last Post: Paul Dale
  [WP34S] Inverse t CDF throws "Domain Error" for probabilities close to 0.5 Les Wright 10 1,011 04-19-2012, 03:25 AM
Last Post: Paul Dale
  [WP34S] Bug in Inverse CDF for F-distribution Les Wright 19 1,667 04-15-2012, 11:17 AM
Last Post: Dieter
  32E's Normal & Inverse Normal Distribution Matt Agajanian 27 2,157 04-14-2012, 06:07 PM
Last Post: Dieter
  HP 32sII Integration Error of Standard Normal Curve Anthony (USA) 4 595 03-14-2012, 03:18 AM
Last Post: Nick_S
  A program for Normal Distribution on the 42s snaggs 5 632 09-22-2011, 02:19 PM
Last Post: Dieter
  Already forgot how to use a normal calculator... Brian K 13 1,167 07-26-2011, 03:02 PM
Last Post: Chris Randle (UK)
  A first estimate for Student's t quantile (not only) on the WP34s Dieter 30 2,099 05-07-2011, 09:11 PM
Last Post: Paul Dale

Forum Jump: