A fast and compact algorithm for the normal quantile



#2

In earlier threads in this forum various improvements have been discussed concerning the most efficient way to evaluate the inverse normal distribution (quantile) on the 34s. Since this might be of some interest for other calculators and/or implementations I think it's okay to start a new topic here.

As usual, it's a tradeoff between memory usage, execution speed and numeric accuracy. The current implementation starts with two quite good initial guesses - one value slightly high, the other a bit low. Then the solver is called to determine the exact result.

We can do better. ;-)

I have tried one or two approaches, and it finally turned out that there is a simple method that converges very fast and might even use less memory than the current one. The very sophisticated 34s solver is not required here; there is a much simpler, yet effective way to determine the quantile with a different algorithm:

First we need a good estimate for the quantile. Only a single value is required, for instance this one:

         p > 0.2                      p < 0.2
  ----------------------    -----------------------------------------------
  u = sqrt(2*pi)*(0.5-p)    u = -2 * ln(p)
  x = u + u^3 / 6           x = sqrt(-2 * ln(p * sqrt((u-1)*2*pi))) + 0.2/u
Yes, the formula for the tail is a bit more complex than before, but it's worth it.
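For reference, here is how the two-branch estimate can be sketched in Python (my own transcription, not the 34s code; the reference values in the comments are the first guesses quoted later in this thread):

```python
import math

def initial_guess(p):
    """First estimate for the upper-tail normal quantile, 0 < p <= 0.5."""
    if p < 0.2:   # tail branch
        u = -2 * math.log(p)
        return math.sqrt(-2 * math.log(p * math.sqrt((u - 1) * 2 * math.pi))) + 0.2 / u
    # central branch: short series in u
    u = math.sqrt(2 * math.pi) * (0.5 - p)
    return u + u**3 / 6

# initial_guess(0.1) is about 1.26200 (true quantile 1.28155...),
# initial_guess(0.4) is about 0.25329 (true quantile 0.25335...)
```

So the guess is already good to roughly two decimals in the tail and to four or five in the centre, before any iteration is done.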

After this, a dedicated though simple solver is used. The preferred algorithm here is Halley's method, which uses both the first and the second derivative of the function. In our case the first derivative is simply the PDF, and the second derivative is a simple function thereof. This leads to a very compact form of the Halley iteration:

Assuming x > 0 and p < 0.5, first the well-known Newton-Raphson quotient f / f' is evaluated:

        cdf(x) - p
    t = ----------
          pdf(x)
The pdf has already been evaluated within the cdf routine, so its value is known and no additional calculation is required. Now the new and improved approximation is:
                      t
    x_new = x + ---------
                1 - t*x/2

Look, Pauli - no slow logs required here. ;-)
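Putting the estimate and the Halley step together, the whole scheme can be sketched in Python (my own re-implementation, not the 34s code; `math.erfc` supplies an accurate upper-tail CDF, and all names are mine):

```python
import math

SQRT_2PI = math.sqrt(2 * math.pi)

def pdf(x):
    return math.exp(-x * x / 2) / SQRT_2PI

def upper_cdf(x):
    # Q(x) = P(X > x); erfc keeps full relative accuracy far into the tail
    return 0.5 * math.erfc(x / math.sqrt(2))

def quantile(p, iterations=3):
    """Upper-tail normal quantile for 0 < p <= 0.5, three hard-coded Halley steps."""
    if p < 0.2:                                   # tail estimate
        u = -2 * math.log(p)
        x = math.sqrt(-2 * math.log(p * math.sqrt((u - 1) * 2 * math.pi))) + 0.2 / u
    else:                                         # central estimate
        u = SQRT_2PI * (0.5 - p)
        x = u + u**3 / 6
    for _ in range(iterations):
        t = (upper_cdf(x) - p) / pdf(x)           # Newton-Raphson quotient f/f'
        x += t / (1 - t * x / 2)                  # Halley update
    return x
```

In double precision, three iterations reproduce the quantile to the last digit or so for p from 0.5 down to the deep tail (as far as `erfc` can represent the probability).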

Okay, how does it perform? I tried this method in Excel/VBA with 15 significant digits. After just two (!) iterations the final result was achieved. The method seems to converge at least quadratically; theoretically it's even cubic. So a third iteration should return something like 30 valid digits.

Pauli, what do you think? This should be faster than before and maybe it even requires less memory since no equation for the solver has to be set up. And only three (hard-coded) iterations should return a result with far more valid digits than actually required.

Dieter

(Edit: corrected error in first guess formula)

Edited: 22 Apr 2011, 6:06 a.m. after one or more responses were posted


#3

Can you check your formula?

Set p = 0.97 and apply the tail formula:
u = -2*ln(p) = 0.0609184149694170918385225
then (u-1)*2*pi = -5.900423617307076027606582,

which has no real square root.

I'm trying a little Lua program based on your idea...

sqrt = math.sqrt
pi = math.pi
ln = math.log    -- standard Lua provides math.log; there is no math.ln
norm = math.norm -- assumed: normal CDF from an extended math library
exp = math.exp

function pdf(x)
  return exp(-x*x/2)/sqrt(2*pi)
end

function invQ(p)
  local x, u
  local t, x2

  -- initial values for iteration (branches per the corrected formulas:
  -- series in the centre, log-based estimate in the tail)
  if p < 0.2 then
    u = -2*ln(p)
    x = sqrt(-2*ln(p * sqrt((u-1)*2*pi))) + 0.2/u
  else
    u = sqrt(2*pi)*(0.5-p)
    x = u + u^3/6
  end

  -- Halley iteration; norm(x) must return the tail probability here
  -- (if norm(2) gives 0.9775 rather than 0.0225, use norm(-x))
  while true do
    t = (norm(x) - p)/pdf(x)
    x2 = x + t/(1 - t*x/2)
    --print(x2)
    if x2 == x then break end
    x = x2
  end
  return x
end


#4

You can also try the following empirical formula, which I obtained using one of my best curve-fitting programs:

Qinv = x/(0.063493571 - 0.198675274*x^2 - 0.017195871 * x^3)^(1/3)

where x = 0.5 - alpha, Qinv is the inverse normal distribution, and alpha is the significance level in decimals.
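As a quick sanity check of this fit (my own, in Python), it tracks the exact quantile to well under 1% for moderate alpha, but drifts in the tail - at alpha = 0.01 it gives roughly 2.04 against the true 2.326:

```python
def qinv_fit(alpha):
    # empirical fit quoted above; intended for 0 < alpha < 0.5
    x = 0.5 - alpha
    return x / (0.063493571 - 0.198675274 * x**2 - 0.017195871 * x**3) ** (1 / 3)

# qinv_fit(0.2) ~ 0.8423 (true 0.84162), qinv_fit(0.1) ~ 1.2796 (true 1.28155)
```

So it is a handy one-liner in the centre of the distribution, but not a substitute for the iterated method further out.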


Edited: 21 Apr 2011, 9:46 p.m.

#5

Hugh, the algorithm is based on the assumption that p <= 0.5 and the result x is positive. This means that - as usual in approximations of the normal quantile - we only consider one half of the bell curve; the other half is obtained by simple symmetry.

In your case, simply replace p = 0.97 by p = 1 - 0.97 = 0.03 and finally adjust the sign of the result. In this case the first estimate for x is 1.867 (exact value: 1.881).

In other words and pseudo code:

   signflag = false

   if p > 0.5 then
      p = 1 - p
      signflag = true
   end if

   ... (calculation)

   if signflag then x = -x

However, there actually was an error in the formulas for the first estimate - I simply confused the two cases (less or greater than 0.2). I have just corrected this. The approximation x = u + u^3/6 is used in the center, i.e. 0.2 < p <= 0.5 - sorry for the mistake. ;-)

Your Lua program looks interesting. However, make sure that norm(x) returns a value < 0.5 for positive x. If norm(2) returns 0.9775 instead of 0.0225, simply use norm(-x) here. Do you have a sample output of your program?

Dieter


Edited: 22 Apr 2011, 6:35 a.m.

#6

Quote:
Pauli, what do you think? This should be faster than before and maybe it even requires less memory since no equation for the solver has to be set up. And only three (hard-coded) iterations should return a result with far more valid digits than actually required.

I'm tempted to give it a try. The built-in solver should give quadratic convergence, but this might be better, and it avoids the slow logs. It won't be smaller, however: the arithmetic to set things up is more involved.

There isn't a solver function for these inverses, I just use the cdf directly for most of them.


- Pauli

#7

The worst I've seen so far is five iterations to get to a relative error of 1e-32. Going to complete convergence takes a lot more -- I suspect there is some oscillation going on in the last digit.
The best case I've seen converges to that degree of relative error in one iteration. It seems to average four or five. Still, it is better than what was there, and it should be a lot faster.


- Pauli


#8

And it is in version 726 :-)


- Pauli

#9

Pauli, thanks for trying. ;-)

Quote:
The worst I've seen so far is five iterations to get to a relative error of 1e-32.

The algorithm should give at least quadratic, maybe even cubic convergence. If the exit condition is based on the difference of the last two approximations, the result has more accuracy than required. A typical example: the new guess and the previous one differ by 1E-7, but the new result actually already has 15 correct digits, so there is no need for another iteration.

For example, try p = 0.1:

                                     current approx.   actual
   estimate for x                    minus previous    error
   -------------------------------------------------------------
   first guess:  1.2620 04083 42513         .          1.95 E-2
   1st approx.:  1.2815 49321 97319      1.96 E-2      2.24 E-6
   2nd approx.:  1.2815 51565 54460      2.24 E-6      0.00

So, after two iterations the result has not just five valid decimals, but all 15 digits are correct.
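This table can be reproduced in ordinary double precision (my own Python re-implementation, with math.erfc as the tail CDF - not the 34s code):

```python
import math

def pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def upper_cdf(x):
    # Q(x) = P(X > x)
    return 0.5 * math.erfc(x / math.sqrt(2))

p = 0.1

# first guess (tail branch, since p < 0.2)
u = -2 * math.log(p)
x = math.sqrt(-2 * math.log(p * math.sqrt((u - 1) * 2 * math.pi))) + 0.2 / u

iterates = [x]
for _ in range(3):
    t = (upper_cdf(x) - p) / pdf(x)
    x += t / (1 - t * x / 2)
    iterates.append(x)

# successive differences shrink roughly cubically:
# ~2e-2 after the first step, ~2e-6 after the second,
# then below double precision
```

In other words, the second iterate is already converged to the last unit of a 15-digit result, matching the table.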
Quote:
Going to complete convergence takes a lot more -- I suspect there is some oscillation going on in the last digit.
The best case I've seen converges to that degree of relative error in one iteration. It seems to average four or five.

The algorithm assumes x is greater than zero and p less than 0.5. Just an idea: try 1 + t*x/2 in the denominator. Does it converge faster or slower now?

Dieter

Edited: 22 Apr 2011, 8:09 a.m.


#10

Quote:
The algorithm assumes x is greater than zero and p less than 0.5. Just an idea: try 1 + t*x/2 in the denominator. Does it converge faster or slower now?

Much slower (assuming I got the code change correct, which seems likely).


The algorithm seems to work for all values of p, converging at the same rate for p > 0.5 as for 1-p. Care has to be taken with the initial estimates, since they do depend on p < 0.5.

I know that the final iteration is only there to recognise that the previous iteration converged, but I'd rather wear this cost than have a pathological case turn up where convergence isn't achieved in four iterations (I'm limiting the maximum iteration count to 10 regardless, at least until someone finds an accuracy problem). As usual, I'm also going well past the 16 digits we return to the user.


Anyway, I feel the new algorithm is a significant improvement over what was there previously.


- Pauli


#11

Quote:
Much slower (assuming I got the code change correct which seems likely).

Fine, so your original code was correct.

The algorithm indeed assumes p < 0.5, and the sign of the result is adjusted in a final step. However, after your results I had to check the original claim that three iterations should be sufficient for up to 30 digits. ;-) With 42-digit working precision the results look like this (initial guess rounded to 10 significant digits):

                             40 digits limit ---+
p = 0.49999 |
2.5066 28275 E-5 |
2.5066 28274 89349 40015 68864 28261 76743 07333 31 E-5
2.5066 28274 89349 40015 68864 28261 76743 07333 30 E-5
2.5066 28274 89349 40015 68864 28261 76743 07333 30 E-5
2.5066 28274 89349 40015 68864 28261 76743 07333 30 E-5

p = 0.4
.25328 77625
.25334 71031 35763 85539 77556 28100 57078 81760 39
.25334 71031 35799 79879 81961 81424 24393 87872 03
.25334 71031 35799 79879 81961 81424 24393 87872 11
.25334 71031 35799 79879 81961 81424 24393 87872 11

p = 0.2
.82286 17271
.84161 97566 33296 85072 34697 72160 98145 07910 22
.84162 12335 72914 20445 15834 89086 50920 52227 84
.84162 12335 72914 20517 87061 21363 24810 06262 98
.84162 12335 72914 20517 87061 21363 24810 06262 98

p = 0.1
1.2620 04083
1.2815 49321 97304 25441 86934 96100 93831 29653 00
1.2815 51565 54460 04635 37247 05905 12098 06891 60
1.2815 51565 54460 04669 65103 32944 87428 18619 91
1.2815 51565 54460 04669 65103 32944 87428 18619 91

p = 0.05
1.6289 55362
1.6448 52064 50501 31885 89723 12911 91114 79820 87
1.6448 53626 95147 27133 68151 91925 65119 10851 86
1.6448 53626 95147 27148 63848 90799 16321 36083 20
1.6448 53626 95147 27148 63848 90799 16321 36083 20

p = 0.01
2.3167 24294
2.3263 47326 06755 23743 60871 45678 93555 71274 95
2.3263 47874 04084 11007 83975 20190 86122 37531 92
2.3263 47874 04084 11008 85606 16334 69117 23351 82
2.3263 47874 04084 11008 85606 16334 69117 23351 82

p = 0.001
3.0848 14785
3.0902 32153 47247 09318 54741 57117 90091 26631 30
3.0902 32306 16781 35415 36973 25208 17996 59903 08
3.0902 32306 16781 35415 40399 83010 73792 05491 01
3.0902 32306 16781 35415 40399 83010 73792 05491 01

p = 0.00001
4.2625 96685
4.2648 90773 62443 34415 61811 69034 21338 64748 90
4.2648 90793 92282 46284 98510 62791 40211 75093 32
4.2648 90793 92282 46284 98524 69890 63446 29356 05
4.2648 90793 92282 46284 98524 69890 63446 29356 05

p = 1E-10
6.3609 15452
6.3613 40902 13155 28775 83723 50960 51571 40311 68
6.3613 40902 40405 62046 95375 82819 36102 62300 27
6.3613 40902 40405 62046 95375 82826 52216 79203 94
6.3613 40902 40405 62046 95375 82826 52216 79203 94

p = 1E-100
21.273 63259
21.273 45356 11826 87004 95639 46564 76422 25829 69
21.273 45356 09653 24295 11721 21890 51240 14325 95
21.273 45356 09653 24295 11721 21886 62226 41864 88
21.273 45356 09653 24295 11721 21886 62226 41864 88

p = 1E-500
47.885 42765
47.885 37067 38888 34080 50406 93884 64596 70554 77
47.885 37067 38534 60195 01196 55452 81331 03683 06
47.885 37067 38534 60195 01196 55452 72865 56065 64
47.885 37067 38534 60195 01196 55452 72865 56065 64

Okay, I stand corrected - it obviously looks more like 40 digits. ;-)

Dieter


Edited: 22 Apr 2011, 9:51 a.m.


#12

Actually it's something near 50 digits. ;-)

If you don't want to look at endless tables with 50-digit numbers, simply try it yourself: I already mentioned Casio's Keisan website in an earlier post. There is also a nice programmable calculator there that we can use to test-drive the algorithm:

Visit the Keisan calculator and set it to 50 digits. Pauli may use 38 digits instead. ;-) Clear the "Expression" form and insert the following code:

/***********************************************/
/*                                             */
/*   Halley's method for the normal quantile   */
/*                                             */
/***********************************************/

p = min(probability, 1-probability);   /* make sure p =< 0.5 */
s = sign(0.5 - probability);           /* set correct sign */

if (p < 0.2)                           /* make a first guess */
  {u = -2*ln(p); z = sqrt(-2*ln(p*sqrt((u-1)*2*pi))) + 1/u^2;}
else
  {u = (0.5-p)*sqrt(2*pi); z = u + u^3/6;}

print(z*s);                            /* print first guess */

k = 1;

do {
  t = (normalcd(z)-p)/normalpd(z);
  z = z + t/(1 - t*z/2);
  print(z*s);                          /* print current approximation */
  k = k+1;
}
while(k=<3);                           /* three iterations should do */

print(normalicd(probability));         /* compare with exact z-value */

Press "Execute" and an input control labelled "probability" will appear.

Now enter a probability and press "Execute". Five lines of data will be printed. The first one is the initial guess, followed by three Halley-iterations, and the last line is the correct value as returned by the system. If you wish, the output labels ("ans1" ... "ans5") can be edited (double-click).

Dieter


#13

Hi Dieter,

This is *very* nice. Thank you for sharing this, and the beautiful development!

I'm adopting it in MorphEngine (ND1, CalcPad), which lacked all quantiles.

Here's how your code looks as a JavaScript user function in ND1:

function(probability) {
    var p = min(probability, 1-probability); /* make sure p <= 0.5 */
    var s = sign(0.5 - probability);         /* set correct sign */

    if (p < 0.2) { /* make a first guess */
        var u = -2*ln(p);
        var z = sqrt(-2*ln(p*sqrt((u-1)*2*pi))) + 1/(u*u);
    }
    else {
        var u = (0.5-p)*sqrt(2*pi);
        var z = u + (u*u*u)/6;
    }

    calculator.stack.push(z*s); /* print first guess */

    var k = 1;

    do {
        var t = (NormalDistribution(0, 1, z)-p)/NormalPDF(0, 1, z);
        z = z + t/(1 - t*z/2);
        calculator.stack.push(z*s); /* print current approximation */
        k = k+1;
    } while (k <= 3); /* three iterations should do */

    return z*s; /* return the signed result */
}

(Notice, this is almost identical to your keisan code. Btw, I was surprised that keisan accepts =< for "less than or equal to". The calculator.stack.push() statements above would be removed after debugging.)

Btw, one can write code like this on a desktop and automatically import into the calculator via WiFi, by "visiting" the variable under which the function is stored. (See Get and display remote code with mod'ed VISIT key )

I added returns of +Infinity and -Infinity for out-of-domain inputs and plan to look into making the code work with add'l mean and variance args for compatibility to UTPN and NDIST. (A first quick pass at that wasn't successful.)

I was surprised to see that on hp 50g one cannot use NDIST (or any of the distributions) in an expression for use in graphing (I'm getting "INVALID SYNTAX").

On ND1 this works, and I can report that NormalQuantile() runs quick enough to be graphed and zoomed/panned in real-time. (An advantage of returning Infinities over "out-of-domain" error would also be that the user can still graph the function and visually discover the defined domain.)

Thanks again.

#14

Quote:
The algorithm seems to work for all values of p converging at the same rate for p>0.5 as for 1-p.

Still, I would suggest using "the usual approach" here, i.e. use p or 1 - p and finally adjust the sign of the result. Otherwise accuracy is wasted for p close to 1. Okay, with 39 digits you can look at this quite relaxed. ;-)
Quote:
(I'm limiting the maximum iteration count to 10 regardless, at least until someone finds an accuracy problem). As usual, I'm also going well past the 16 digits we return to the user.

I agree that there should be a few "guard digits", at least for the programmer's and user's peace of mind. In the olden days, HP was satisfied with two or three internal digits more than those displayed to the user. ;-) I have yet to find a case where the second iteration after the initial guess did not return a result with an error less than one unit of the 18th significant digit, and the third iteration returned at least 42 valid digits. If (!) the CDF is evaluated to full machine precision, that is. I assume the CDF algorithm will return results with slight errors beyond the 35th digit, but I do not think this is a problem here.

There still may be an issue we have not discussed yet: how does the 34s behave as the lower working limit is approached, i.e. at probabilities like 1E-383...1E-399? Here numbers start getting subnormal. I think it's a good idea if in this range the first estimate is a little bit low (in absolute terms), so that the CDF does not underflow. This can be achieved with an alternative way of calculating (cdf(x)-p)/pdf(x) where the pdf cancels out, or with a slight adjustment of the initial guess far out in the tail: I would recommend replacing the 0.2/u term in the current formula by a simple 1/u^2. This way the approximation is slightly low here (instead of slightly high). At least down to p = 1E-4000. ;-)
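The sign of the two corrections is easy to check numerically (my own Python sketch; doubles cannot reach 1E-383, so it uses p = 1E-100, where the same effect is visible, and the converged 50-digit value from the table earlier in the thread as the reference):

```python
import math

def tail_guess(p, correction):
    # first estimate for the far tail; 'correction' selects 0.2/u or 1/u^2
    u = -2 * math.log(p)
    base = math.sqrt(-2 * math.log(p * math.sqrt((u - 1) * 2 * math.pi)))
    return base + (0.2 / u if correction == "0.2/u" else 1 / u**2)

p = 1e-100
x_true = 21.273453560965324      # converged value, from the table above

high = tail_guess(p, "0.2/u")    # ~21.27364, slightly above the true quantile
low = tail_guess(p, "1/u^2")     # ~21.27321, slightly below it
```

So with the 1/u^2 term the first estimate indeed approaches the true quantile from below, which is what keeps the CDF from underflowing near the working limit.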

The first estimate for larger x is still a bit complex and requires two roots and two logs. Do you think it should be replaced by a (tiny ;-)) rational approximation with four three-digit constants? I think this might be faster, and the error of the first estimate would be uniformly distributed within something like +/- 0.002. But I cannot say whether this would require more or less memory than now.

Dieter


#15

Quote:
In the olden days, HP was satisfied with two or three internal digits more than those displayed to the user. ;-)

They had William Kahan doing their algorithm design.

I'm not even close to his level of expertise.

Very few if any are :-)


- Pauli

