A first estimate for Student's t quantile (not only) on the WP34s



#5

Hi all, hi Pauli. ;-)

Over the last few weeks there has been some discussion on the normal distribution and its inverse (quantile). I think the 34s performs pretty well now: the CDF should be exact, and there is also a good first estimate for the quantile, after which the Halley method converges quickly to the exact value. So far, so good.

The second important distribution we should take a closer look at now is Student's t-distribution. I have been experimenting a while with several approaches, and I finally got a result that I think is fine as a first estimate. Use at your own risk, expect any error you can imagine, but try this:

  • For df = 1 or 2 degrees of freedom the quantile can be determined directly.

    This is also the case for df = 4; it only requires a bit of care at very low t (close to zero), i.e. p close to 0.5.

  • For other degrees of freedom, i.e. df >= 3, the following method provides a nice first estimate. As usual, p is assumed to be less than 0.5 and t > 0.
       if -ln(p) < 1.7 * df then
          x = guess_normal(p)
          u = (x^3 + x) / 4 / df
          v = (x^5 / 12 + x^3 / 4) / df^2
          t = x + u + v
       else
          u = 2 * p * df * sqrt(Pi / (2*df - 1))
          t = sqrt(df) / u^(1/df)
       end if
Here guess_normal is the well-known first estimate of the normal quantile (with the slight modification 1/u^2 instead of 0.2/u, as mentioned earlier). According to my results this estimate for t typically differs from the exact value by something like +/- 1...2%, in some cases a bit more.
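A quick worked example (my own figures, rounded): for p = 0.01 and df = 10 we have -ln(p) ≈ 4.61 < 17, so the first branch applies. guess_normal gives x ≈ 2.3068, then u ≈ 0.3646 and v ≈ 0.0851, hence t ≈ 2.756, while the exact quantile is 2.7638, an error of roughly 0.3%.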

This single estimate now can be used for a Newton-iteration - the first derivative simply is the Student PDF, so we get

              tCDF(t) - p
t_new = t - ---------------
                tPDF(t)
If two guesses are required (e.g. for the 34s solver) we can simply use 1.1*t and 0.9*t (or something similar).
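For illustration, a minimal JavaScript sketch of this refinement (tCDF and tPDF are hypothetical names for the Student CDF and PDF, assumed available; this is not the 34s code):

    // Newton refinement of a first estimate t0, lower-tail convention:
    // solve tCDF(t, df) = p using t_new = t - (tCDF(t) - p) / tPDF(t).
    function newtonT(t0, p, df) {
        var t = t0;
        for (var k = 0; k < 20; k++) {              // hard cap, just in case
            var d = (tCDF(t, df) - p) / tPDF(t, df);
            t = t - d;                               // Newton step
            if (Math.abs(d) < 1e-16 * Math.abs(t))   // ~16 matching digits
                break;
        }
        return t;
    }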

And now: try it and let me know about your results. ;-)

Dieter

Edit: tail formula now looks slightly more elegant ;-)

Edited: 27 Apr 2011, 4:01 p.m. after one or more responses were posted


#6

We don't currently implement the t PDF, and it looks pretty complex, so I'd have to go with two bracketing estimates and use the built-in solver. This is pretty much what I'm doing now, but the bracket is the t (v=2) value and the upper normal estimate, which is wider than I'd like.

When can we expect Chi^2 and F estimates ;-)


- Pauli


#7

Hi Pauli,

Quote:
We don't currently implement the t PDF, and it looks pretty complex

It looks a bit complex because of the two Gamma functions. But there is a nice trick that was also used in the t-quantile estimate: The required Gamma quotient can be approximated with sufficient precision. ;-)
 Gamma((n+1)/2)      sqrt(2n - 1)
 --------------  ~=  ------------
   Gamma(n/2)             2
I tried the Newton method both with the exact PDF and with a simplified substitute containing this approximation. The number of iterations required for two successive results to agree in 14-15 significant digits was mostly 4 or 5 with the exact PDF, and 5 to 7 using the approximation (slightly tweaked by adding 1/n^2 under the square root).
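For anyone who wants to verify the quality of this approximation without a Gamma implementation: the exact ratio obeys the recurrence r(n+1) = (n/2)/r(n) with r(1) = 1/sqrt(pi). A small JavaScript sketch (my own, not firmware code):

    // Exact ratio r(n) = Gamma((n+1)/2) / Gamma(n/2) via the recurrence
    // r(n+1) = (n/2) / r(n), starting from r(1) = Gamma(1)/Gamma(1/2).
    var r = 1 / Math.sqrt(Math.PI);
    for (var n = 1; n <= 20; n++) {
        var approx  = Math.sqrt(2*n - 1) / 2;            // plain approximation
        var tweaked = Math.sqrt(2*n - 1 + 1/(n*n)) / 2;  // with the 1/n^2 tweak
        console.log(n, r.toFixed(6), approx.toFixed(6), tweaked.toFixed(6));
        r = (n / 2) / r;                                 // advance to r(n+1)
    }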
Quote:
so I'd have to go with two bracketing estimates and use the built-in solver. This is pretty much what I'm doing now, but the bracket is the t (v=2) value and the upper normal estimate, which is wider than I'd like.

That's exactly why I proposed this new first estimate. ;-) If you want to use the solver, try a = 0.95 t and b = 1.05 t. The error of the estimate usually is less than 5%, so the true t-quantile can be expected to lie in this interval. Of course, an exact t-CDF is essential here for a precise quantile result.
Quote:
When can we expect Chi^2 and F estimates ;-)

First I need a benchmark algorithm that returns a sufficiently precise result. ;-)

I admit I was surprised in the end at how compact the final formulas for the t-estimate turned out. They are basically simplified (though improved) versions of two series expansions. The results I got were actually so good that I even detected an error in my reference algorithm far out in the distribution tails. ;-)

Dieter


#8

I've tried out the better initial estimate using the built-in solver. It is better by a couple of iterations across the bulk of the range and a few more in the extremes. Unfortunately, I don't think the improvement is enough to justify the space. The additional code amounts to a bit under a kilobyte, which is too much -- we've only got 3 KB clear in total at the moment.

Likewise, I don't think specially implementing the PDF so we can use Newton's method will net enough of a speed gain to justify the space cost.


What I have done is terminate the search when the relative error between estimates is 1e-24 or better; this saves a lot of iterations but should still give decent accuracy when rounded to 16 digits. I've not timed this on a real unit yet, but it should help most of the quantile functions.


- Pauli


#9

What? Just a couple of iterations saved? With the previous method the two initial guesses could be apart by a factor of 10, 100 or even more, and now it's merely 10%. Strange.

I tried the method with the approximate PDF. After the first iteration the error is on the order of 1E-3, and each further step gives about three more valid digits, which matches the 5 to 7 iterations for 15 digits mentioned earlier. With the true PDF convergence is almost quadratic, as expected.

Pauli, I know you like results that are far, far more precise than required in the final 16-digit result, but a relative error of just 1E-24 in the t-approximation simply is too much. If two consecutive values differ by not more than 1E-24 this means that they match in 24 significant digits (+1 ULP) and already the previous(!) approximation was that precise. In other words, the current approximation has 27 or 30 or even more valid digits. But we need just 16... ;-)

Please do not forget: you can use as many digits as you want, but even a 100-digit result rounded to 16 digits has a potential error of 1 digit in the last place. It just becomes less likely. ;-) I would suggest exiting the iteration once the error drops below 1E-16. This way the last two approximations agree in all 16 digits (+1 ULP), and the current result even has some more correct digits. Too risky? Okay, then use 1E-20. ;-)
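In code, the suggested exit test could look like this (a sketch; tNew and tOld are hypothetical names for two consecutive approximations):

    // Stop once two consecutive approximations agree to ~16 digits.
    if (Math.abs((tNew - tOld) / tNew) < 1e-16) break;   // or 1e-20 for comfort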

Dieter


#10

Quote:
What? Just a couple of iterations saved? With the previous method the two initial guesses could be apart by a factor of 10, 100 or even more, and now it's merely 10%. Strange.

I didn't exhaustively test thousands of cases, but those I did test showed modest improvement over most of the range and not a huge penalty at the extremes. Feel free to test lots more & let us know :-)

I do very much appreciate the efforts you've put into debugging and assisting with the improvements in the statistical functions.

The far more important factor here is space. The improved method uses about one third of the remaining flash space. At this stage, I'm unwilling to commit that much space to one distribution's inverse, especially when we're already getting accurate results, albeit slowly. I'm sure there are plenty of undiscovered bugs that cause inaccuracies, and functions that are simply incorrect some of the time. We will need to fix those problems, so I've always planned to keep a small amount of flash in reserve for this.

If we had the space, your improved estimates would be in immediately. The code is in Subversion, and if we figure out a way to recover some space (& we've a few ideas floating around), in it will definitely go.


Quote:
With the true PDF convergence is almost quadratic, as expected.

The built-in solver will achieve quadratic convergence almost always, BTW. Brent's method is nice.


Quote:
Pauli, I know you like results that are far, far more precise than required in the final 16-digit result, but a relative error of just 1E-24 in the t-approximation simply is too much. If two consecutive values differ by not more than 1E-24 this means that they match in 24 significant digits (+1 ULP) and already the previous(!) approximation was that precise. In other words, the current approximation has 27 or 30 or even more valid digits. But we need just 16... ;-)

I like results that are likely to round correctly, and more digits make this more likely. For functions that are used by other functions (e.g. the normal QF) I like more digits again. I'm not a numerical analyst by profession, so I'm deliberately applying a sledgehammer where it probably isn't required. Which would people prefer, slower but more accurate functions or faster and possibly incorrect?

Changing from 1e-24 to 1e-20 for the relative error will likely mean one iteration difference and possibly none for either method. I can live with that penalty for the extra security and comfort it brings :-)


- Pauli


#11

Okay, I have now tried an implementation of Brent's method in VBA, using two initial guesses the same way it's done now (a = value for 2 d.o.f. and b = normal quantile). These results were compared to the approximations obtained with initial guesses based on the proposed method.

The results are pretty much as expected: the current method does not require many more iterations as long as the two initial guesses are sufficiently close, i.e. in the center and not too far out in the tails. Values like p = 0.001 are still handled with only a few more iterations than the improved method requires.

Things change as soon as we get further out in the distribution tails. At n = 10, try p = 0.0001 or 1E-10 or even 1E-20. The number of iterations is twice or three times as high as with an optimized initial guess. It simply makes a difference whether the solver can work with two similar initial estimates or if it only knows the result is somewhere between 6 and 70000. ;-) If the calculation takes a second on a 2-GHz system, imagine how long it will take on a real 20b or 30b. Maybe someone can try the following calculations on a re-flashed unit.

I tried the t^-1(p) function on the emulator, version 1.17 673. Here are some results:

     p        n     exact result      WP34s result
------------------------------------------------------------
  1E-5       10       -7.526954        -7.526954        so far
  1E-10      10      -25.466008       -25.466008        so good
  1E-15      10      -81.040891       -22360679
  1E-20      10     -256.434699      -7071067812        oops...
  1E-40      10      -2.564E+05       -7.071E+19
  1E-100     10      -2.564E+10       "domain error"
I think it's obvious where this 7.071... mantissa comes from. ;-)

The t quantile function works fine in the center. Maybe not as fast as possible, but the results are okay. The further we get out into the tails of the t-distribution, the less reliable the results become. Below a certain limit they are far off.

I see the problem with the available RAM. But I think something has to be done to get at least correct results. Precision is not optional. ;-)

Dieter


#12

Quote:
I tried the t^-1(p) function on the emulator, version 1.17 673.

That is a very old version of the software. Current is 793. The results I get aren't anything like what you're seeing. Not in a good way: 1E-15 is giving a domain error instead of a wrong result. I guess this is slightly better.

At these small values, t(x) seems like it might be wrong too.


Quote:
I see the problem with the available RAM. But I think something has to be done to get at least correct results. Precision is not optional.

Agreed, I'll look into this. The lack of space is in flash, not RAM. Well, it is in both, but this isn't requiring much RAM.

If I can't fix it, the domain gets reduced or the function removed :-(


- Pauli


#13

If I can vote, please reduce the domain rather than remove!

:-)


#14

Please add my vote to Gene's d:-)


#15

I'm working on the space issue. Maybe a combination of GCC 4.6.0 with some options used by CodeSourcery G++ (based on GCC 4.5.1) may help. At the moment, the latter produces tighter code than GCC 4.6.0, which puzzles me and lets me hope that we can do even better.


#16

FSF GCC for the ARM always lags behind the development at CodeSourcery. CodeSourcery does send their improvements to the FSF, but it takes a while for them to get integrated into an FSF release. For ARM, using the latest CodeSourcery release is generally better in every way than using the FSF release.


#17

Being a contributor to GCC, I have a fair idea why this lag occurs... My changes were a couple of weeks' work; getting them accepted was far more painful.


- Pauli

#18

Marcus,

Are you compiling with -Os?

Have you guys considered/tried LLVM? In my experience (and following Apple's party line) you can expect both greater speed and smaller size for ARM (and IA) code.


#19

Quote:
Are you compiling with -Os?

We are, but don't blindly trust -Os to reduce code size. On many platforms -O1 is small & often faster :-)

For GCC 4 on ARM, -Os seems to be the winner. For GCC 3 on ARM it was -O1 almost always.


We've not tried llvm.


- Pauli

#20

I feel that domain reduction is the right way to go here.

Unlike the normal distribution, which has uses as a function in its own right, the t-distribution is only used for statistical tests (as far as I'm aware -- let me know if I'm wrong here), and nobody in statistics really cares about probabilities less than 1e-10. Actually, a cutoff a lot larger than this would be good enough :-)

Still, I'd like to fix the underlying problem and have it work globally. I was out most of today so didn't make any significant progress.

- Pauli


#21

I would not recommend a domain reduction for Student's distribution. There sure are other candidates that might get stripped somehow. Imagine a financial calculator accepting no values beyond 1E10 - "because you won't have that much money anyway". ;-)))

Also, according to my results there is no fixed lowest probability for a cutoff. The limit depends on the degrees of freedom. For instance, the switch between the two formulas for the improved initial guess happens at p = e^(-1.7*df); for df = 10 that is near p ≈ 4E-8. Beyond that limit, the guess is so close to the true value that at least this (rounded) estimate could be returned. But, as mentioned before, I think neither the domain nor the precision of the results should be reduced as far as the t-distribution is concerned.

Dieter

#22

Quote:
[Version 1.17 673 ] is a very old version of the software. Current is 793.

I looked at http://sourceforge.net/projects/wp34s/files/ and there was no newer version of the emulator than this, dated April 17. Where can I find a newer one? The emulator, that is. ;-)
Quote:
At these small values, t(x) seems like it might be wrong too.

That's my impression as well. Despite the extremely high internal precision there are lots of pitfalls as far as numeric accuracy is concerned. In my own programs for the 41C and 35s I often use different ways to evaluate a term, so that e.g. small and large values take different paths on their way through the algorithm. Even a simple term like 4*p*(1-p) is not trivial.
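To illustrate that last point in doubles (a tiny sketch, not 34s code):

    // If p was itself formed as 1 - q, the complement's digits are gone:
    // 1 - 1e-17 rounds to exactly 1.0 in IEEE 754 double precision.
    var q = 1e-17, p = 1 - q;
    console.log(4 * p * (1 - p));   // prints 0 instead of ~4e-17
    console.log(4 * (1 - q) * q);   // prints 4e-17: keep q and use it directly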
Quote:
If I can't fix it, the domain gets reduced or the function removed :-(

There are three distributions I would consider mandatory: Normal, Binomial and Student. Removing one of these, or even reducing their domain, is not an option, I think. There sure are other candidates that might be removed without major loss in functionality. If necessary we could have another poll here. ;-)

Dieter


#23

Hallo Dieter,

Look here for the newest emulator build. But watch out: this is closer to development, so more error-prone than the "official" package at the address you mentioned.

Walter


#24

We're going to assemble a new official release early this week, I hope.

#25

Dieter,

Any Keisan code for this? :-)


#26

Sure. Try this:

p = min(probability, 1-probability);   /* make sure p <= 0.5 */
s = sign(0.5 - probability);           /* set correct sign */

if ((-ln(p)) < 1.7 * df)
{  /* first guess = normal quantile estimate */
   if (p < 0.2)
   {  u = -2*ln(p);
      x = sqrt(-2*ln(p*sqrt((u-1)*2*pi))) + 1/u^2;
   }
   else
   {  u = (0.5-p)*sqrt(2*pi);
      x = u + u^3/6;
   }
   u = (x^3 + x) / 4 / df;
   v = (x^5 / 12 + x^3 / 4) / df^2;
   t = x + u + v;
}
else
{  u = 2*p*df*sqrt(pi/(2*df - 1));
   t = sqrt(df) / u^(1/df);
}

println(t*s, 0);   /* print first guess (and a dummy zero) */
k = 0;

do {
   d = (tcd(t,df) - p) / tpd(t,df);
   t = t + d;
   err = d / t;
   r = int(-log(abs(err)));
   err = round(10^r * err, 3) / 10^r;   /* round error */
   println(t*s, err);   /* print current approximation and error */
   k = k + 1;
}
while (1 + abs(err)^2 <> 1);   /* a somewhat different exit condition ;-) */
                               /* more conservative users may omit the square */

println(ticd(probability,df), k);   /* print exact t-value and number of iterations */

This will generate a small table with the current approximation and the relative error, compared to the previous iteration. You can also nicely see the (well, almost) quadratic convergence.

However, the t-distribution will push Keisan close to its limits - and beyond. Try 1E-100 and 50 degrees of freedom: the iteration returns a result, while Keisan's built-in function returns a "system error". ;-) Also, too many digits may cause problems in Keisan because of insufficient internal precision. So better try 18, 30 or 38 digits.

A real-world implementation would require a bit more, for instance a check for p = 0.5, where t = 0 (which leads to a division by zero here).

Dieter


#27

Thank you very much, Dieter!

Here's my JavaScript "production" code, based on yours:

    "StudentTQuantile": function(df, x) {
if (df != Math.floor(df)) throw Error("bad arg");
if (x <= 0) return +Infinity; else if (x >= 1) return -Infinity; else if (x == 0.5) return 0;
var t, p = Math.min(x, 1-x), s = this.sign(0.5 - x);
if (-this.ln(p) < 1.7 * df) {
if (p < 0.2) {
var u = -2*this.ln(p);
x = Math.sqrt(-2*this.ln(p*Math.sqrt((u-1)*2*Math.PI))) + 1/(u*u);
}
else {
var u = (0.5-p)*Math.sqrt(2*Math.PI);
x = u + (u*u*u)/6;
}
var x3 = x*x*x;
u = (x3 + x) / 4 / df;
v = (x3*x*x / 12 + x3 / 4) / (df*df);
t = x + u + v;
}
else {
u = 2*p*df*Math.sqrt(Math.PI/(2*df - 1));
t = Math.sqrt(df) / Math.pow(u,(1/df));
}

var d, k = 10;
do {
d = (this.StudentTDistribution(df, t) - p) / this.StudentTPDF(df, t);
t += d;
}
while (d*d > 1e-30 && --k);

return t*s;
}

You'll notice the change in the termination criterion, aiming at a result within one ULP (for IEEE754 doubles), plus a hard iteration maximum to deal with overly slow convergence and oscillations.

Here are the results for your extreme-values test:


     p        n    exact result         ND1 (max 10 iters)   ND1 (max 100 iters)
------------------------------------------------------------------------------------
  1E-5       10    7.526954019339       7.526954019333       7.526954019334
  1E-10      10    25.466008            25.466010            25.466008
  1E-15      10    81.040891            80.3924              81.05831
  1E-20      10    256.434699           -Infinity            -Infinity
  1E-40      10    2.564E+05            -Infinity            -Infinity
  1E-100     10    2.564E+10            19881486106.20       -Infinity

I guess a PDF at the mercy of the gamma function may, in part, explain the slow convergence.

I'm going with the max of 10 iterations. For non-extreme values, results within +/- 1 ULP seem to be found in typically 3-5 iterations. The first estimate is great!

Do you have an iOS device? I'd love to give you a redeem code for a copy of ND1 (and CalcPad when it comes out).

Cheers.


#28

There must be a problem in the Student PDF or (more likely) the CDF. Here are the results evaluated by Keisan, set to 18 digits. All results were obtained after just 2 to 4 (!) iterations; ten or more were never needed. That's why I think the results of the CDF you use are not exact. But an exact CDF is crucial for an exact quantile.

The initial estimate has a relative error on the order of 10^-2, so quadratic convergence should give a 15-digit result after three iterations, or let's say 4 or 5 under real-world conditions. And that's exactly what I got. ;-)

     p        n     quantile        algorithm @ Keisan
-----------------------------------------------------------------
  1E-5       10     7.526954       7.52695401933965106
  1E-10      10     25.466008      25.4660080216977262
  1E-15      10     81.040891      81.040890880039345
  1E-20      10     256.434699     256.434699318526186
  1E-40      10     2.564E+05      2.5645257010761472E+05
  1E-100     10     2.564E+10      2.5645257189481978E+10
The results seem to be okay. Maybe someone can check them with Mathematica or Maple. I'll see what can be done in Excel/VBA (15-digit floating-point arithmetic); decent distribution functions and quantiles are sorely missing there. Excel seems to have a very... "special" reputation among statisticians. ;-)

BTW - no, I do not have any kind of i-device. I'm a bit conservative in these things. ;-)

Dieter


#29

The algorithm is great and not to be blamed for slow convergence.

You were quite right: I found the CDF to be the culprit. For t=25.4, df=10, I'm only getting 5 good digits after the point.

The PDF, conversely, is exact.

I'll see how I can improve this, but as the results seem to be quite good and convergence occurs for most values within 2 (corrected from 3, after testing) to 5 iterations, I'm quite happy with it already.

The CDF isn't entirely flawed. For t=25.4, df=2 I'm getting 16 good digits.


It's obvious you can do real math. So, yes, I can see how you don't need a permanent math i-companion. ;-)

#30

Quote:
Excel seems to have a very... "special" reputation among statisticians. ;-)

It has a similar reputation amongst numerical analysts.

Hopefully, I'll get back to the statistical code this week.


- Pauli

#31

Good news: the better initial estimate is back in (or will be when we build next). Some flash space was recovered.

I'm not using Newton's method, however; the PDF would be large, and I expect we're going to need the space for estimates for the other distributions.


- Pauli


#32

As far as I can see, the max. error of the t-quantile estimate is about 4%, so the initial interval could even be set to 0.95*t ... 1.05*t. ;-)

I was about to write this and that concerning numeric accuracy in the distributions, e.g. using ln(1+x) and e^x-1, and avoiding terms like p-0.5 where, for values below a certain limit, the result is simply 0.5 (cf. the Logistic quantile). But I see much of this work has already been done, e.g. in the Exponential and Weibull CDFs - though not in their QFs, which should be updated the same way. There also is an improvement in the t-quantile for 1 d.o.f., where tan(pi*(p-0.5)) was replaced by cot(p*pi), which I wanted to suggest as well. So it seems this time you were faster, Pauli. At least on some points. ;-)
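To illustrate the ln(1+x) and e^x-1 point in doubles (Math.log1p and Math.expm1 are later JavaScript additions; the 34s offers equivalent functions):

    var x = 1e-18;
    console.log(Math.log(1 + x), Math.log1p(x));   // 0 vs. 1e-18: 1+x rounds to 1
    console.log(Math.exp(x) - 1, Math.expm1(x));   // 0 vs. 1e-18, same reason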

The improved evaluation of the t-quantile (1 d.o.f.) should also be applied to the Cauchy distribution. The latter could even be omitted completely, since it is nothing but the t-distribution with 1 d.o.f. Without Cauchy there would also be no need to rework its CDF: since the final step there is adding 0.5, there is a point beyond which the result is a plain zero.
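In JavaScript terms, the improvement for the 1 d.o.f. case looks like this (a sketch for 0 < p < 0.5, not the firmware code):

    // Naive: tan(pi*(p - 0.5)). For tiny p, p - 0.5 rounds to -0.5 and
    // every digit of p is lost before tan() is even called.
    // Since tan(x - pi/2) = -cot(x), the same quantile is
    //     t = -cot(pi*p) = -1/tan(pi*p)
    // which stays fully accurate however small p gets.
    function cauchyQuantileLeft(p) {
        return -1 / Math.tan(Math.PI * p);
    }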

I did not look at the t CDF. Does it return 16 valid digits for any t-value or is there something that can be improved?

By the way - how fast do these functions execute on a real 20b or 30b?

Dieter

#33

Pauli, I think there is a bug in the t-quantile routine: the solver seems to exit as soon as the result has reached display accuracy. Try p = 0.1 and 4 d.o.f. and you will get 1.5535 in FIX 2, while ALL returns the correct result 1.5532.

The Student CDF also does not work far out in the tails. Again with four degrees of freedom: t = -1E10 correctly returns about 3E-40, but t = -1E20, which should return about 3E-80, instead gives a plain zero. I assume the reason is some kind of digit cancellation: the limit is t = -sqrt(1E39). This also leads to substantial execution times for the quantile in that area, probably because more and more significant digits get lost as p drops below something like 1E-70. Values below 1E-78 simply throw a domain error.

Dieter

Edited: 7 May 2011, 4:57 p.m.


#34

Quote:
I think there is a bug in the t-quantile routine: the solver seems to exit as soon as the result has reached display accuracy. Try p = 0.1 and 4 d.o.f. and you will get 1.5535 in FIX 2, while ALL returns the correct result 1.5532.

Nice catch. Fixed soon.


- Pauli

#35

Quote:
The Student CDF also does not work far out in the tails. Again with four degrees of freedom: t = -1E10 correctly returns about 3E-40, but t = -1E20, which should return about 3E-80, instead gives a plain zero. I assume the reason is some kind of digit cancellation: the limit is t = -sqrt(1E39). This also leads to substantial execution times for the quantile in that area, probably because more and more significant digits get lost as p drops below something like 1E-70. Values below 1E-78 simply throw a domain error.

Not so much digit cancellation as digit loss. The transform I used from the t CDF to the incomplete beta function involved a sqrt(v + t^2) term in which v gets lost as |t| grows large.

I've found an alternative transformation that looks like it works a lot better.
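For reference, one standard transform (not necessarily the one now in the code) avoids that loss because its argument is formed by a single division:

    P(T <= t) = 1/2 * I_x(v/2, 1/2),   x = v / (v + t^2),   for t < 0

where I_x is the regularized incomplete beta function. For |t| around 1E20 the computed x ~= v/t^2 still carries essentially full relative accuracy.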

Now to see if my Internet connection will allow me to commit these two changes or not...

- Pauli

Edited: 7 May 2011, 9:23 p.m.

