▼
Posts: 653
Threads: 26
Joined: Aug 2010
Hi all, hi Pauli. ;)
Over the last few weeks there has been some discussion of the normal distribution and its inverse (quantile). I think the 34s performs pretty well now: the CDF should be exact, and there also is a good first estimate for the quantile, after which the Halley method converges quickly to the exact value. So far, so good.
The second important distribution we should now take a closer look at is Student's t-distribution. I have been experimenting for a while with several approaches, and I finally got a result that I think is fine as a first estimate. Use at your own risk, expect any error you can imagine, but try this (n = degrees of freedom, p =< 0,5):

if -ln(p) < 1,7 n
   x = guess_normal(p)
   t = x + (x^3 + x)/(4 n) + (x^5/12 + x^3/4)/n^2
else
   u = 2 p n sqrt(pi/(2n - 1))
   t = sqrt(n)/u^(1/n)
Here guess_normal is the well-known first estimate of the normal quantile (with the slight modification 1/u^2 instead of 0,2/u, as mentioned earlier). According to my results this estimate for t typically differs from the exact value by something like +/- 1...2%, in some cases a bit more.
This single estimate can now be used for a Newton iteration - the first derivative simply is the Student PDF, so we get

            tCDF(t) - p
t_new = t - -----------
              tPDF(t)
If two guesses are required (e.g. for the 34s solver) we can simply use 1.1*t and 0.9*t (or something similar).
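For anyone who wants to experiment off-calculator, here is a sketch of the whole scheme in Python - initial estimate plus Newton step - using a standard continued-fraction evaluation of the regularized incomplete beta function for the t CDF. This is my own transcription for illustration, not the 34s code; all function names are made up:

```python
import math

def t_pdf(t, n):
    """Student's t density with n degrees of freedom."""
    ln_c = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(ln_c - (n + 1) / 2 * math.log1p(t * t / n))

def _betacf(a, b, x, eps=1e-15, max_iter=300):
    # Continued fraction for the incomplete beta (modified Lentz's method)
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c, d = 1.0, 1.0 - qab * x / qap
    if abs(d) < 1e-300: d = 1e-300
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even and odd continued-fraction coefficients for this m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            if abs(d) < 1e-300: d = 1e-300
            c = 1.0 + aa / c
            if abs(c) < 1e-300: c = 1e-300
            d = 1.0 / d
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0: return 0.0
    if x >= 1.0: return 1.0
    ln_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(ln_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def t_upper(t, n):
    """Upper-tail probability P(T > t) for t >= 0."""
    return 0.5 * betai(n / 2.0, 0.5, n / (n + t * t))

def t_quantile(n, prob):
    """t quantile, using the thread's sign convention: positive for prob < 0.5."""
    p = min(prob, 1.0 - prob)
    s = 1.0 if prob < 0.5 else -1.0
    if p == 0.5:
        return 0.0
    if -math.log(p) < 1.7 * n:   # centre: normal estimate plus correction terms
        if p < 0.2:
            u = -2.0 * math.log(p)
            x = math.sqrt(-2.0 * math.log(p * math.sqrt((u - 1.0) * 2.0 * math.pi))) + 1.0 / u**2
        else:
            u = (0.5 - p) * math.sqrt(2.0 * math.pi)
            x = u + u**3 / 6.0
        t = x + (x**3 + x) / (4.0 * n) + (x**5 / 12.0 + x**3 / 4.0) / n**2
    else:                        # far tail
        u = 2.0 * p * n * math.sqrt(math.pi / (2.0 * n - 1.0))
        t = math.sqrt(n) / u ** (1.0 / n)
    for _ in range(50):          # Newton: d(t_upper)/dt = -t_pdf
        d = (t_upper(t, n) - p) / t_pdf(t, n)
        t += d
        if abs(d) <= 1e-14 * abs(t):
            break
    return s * t
```

With n = 10 and p = 1E-5 or 1E-10 this reproduces the values discussed later in the thread after a handful of Newton steps, starting from an estimate that is only a percent or two off.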
And now: try it and let me know about your results. ;)
Dieter
Edit: tail formula now looks slightly more elegant ;)
Edited: 27 Apr 2011, 4:01 p.m. after one or more responses were posted
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
We don't currently implement the t PDF and it looks pretty complex, so I'd have to go with two bracketing estimates and use the built-in solver. This is pretty much what I'm doing now, but the bracket is the t value for v=2 and the upper normal estimate, which is wider than I like.
When can we expect Chi^2 and F estimates ;)
 Pauli
▼
Posts: 653
Threads: 26
Joined: Aug 2010
Hi Pauli,
Quote:
We don't currently implement the t PDF and it looks pretty complex
It looks a bit complex because of the two Gamma functions. But there is a nice trick that was also used in the t-quantile estimate: the required Gamma quotient can be approximated with sufficient precision. ;)
Gamma((n+1)/2)     sqrt(2n - 1)
--------------  ~= ------------
  Gamma(n/2)             2
I tried the Newton method both with the exact PDF and with a simplified substitute containing this approximation. The number of iterations required for two successive results agreeing in 14-15 significant digits was mostly 4 or 5 with the exact PDF, and 5 to 7 using the approximation (slightly tweaked by adding 1/n^2 under the square root).
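The quality of this Gamma-quotient approximation is easy to check numerically. A quick sketch (my own test harness, not 34s code; `lgamma` is used so the individual Gamma values cannot overflow for large n):

```python
import math

def gamma_ratio_exact(n):
    # Gamma((n+1)/2) / Gamma(n/2), computed via lgamma for stability
    return math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

def gamma_ratio_approx(n):
    return math.sqrt(2 * n - 1) / 2

for n in (3, 5, 10, 30, 100):
    rel_err = gamma_ratio_approx(n) / gamma_ratio_exact(n) - 1
    print(n, rel_err)
```

The relative error is below 1% already at n = 3 and falls off quickly with growing n, which is consistent with the modest penalty (5-7 instead of 4-5 iterations) reported above.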
Quote:
so I'd have to go with two bracketing estimates and use the built-in solver. This is pretty much what I'm doing now, but the bracket is the t value for v=2 and the upper normal estimate, which is wider than I like.
That's exactly why I proposed this new first estimate. ;) If you want to use the solver, try a = 0.95 t and b = 1.05 t. The error of the estimate usually is less than 5%, so the true t-quantile can be expected in this interval. Of course an exact t-CDF is essential here for a precise quantile result.
Quote:
When can we expect Chi^2 and F estimates ;)
First I need a benchmark algorithm that returns a sufficiently precise result. ;)
I admit I was finally surprised at how compact the formulas for the t-estimate turned out. They are basically simplified (though improved) versions of two series expansions. The results I got actually were so good that I even detected an error in my reference algorithm far out in the distribution tails. ;)
Dieter
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
I've tried out the better initial estimate using the built-in solver. It is better by a couple of iterations across the bulk of the range and a few more in the extremes. Unfortunately, I don't think it is enough better to justify the space. The additional code amounts to a bit under a kilobyte, which is too much - we've only got 3 kB clear in total at the moment.
Likewise, I don't think specially implementing the PDF so we can use Newton's method will net enough of a speed gain to justify the space cost.
What I have done is terminate the search when the relative error between estimates is 1e-24 or better. This saves a lot of iterations but should still give us decent accuracy when rounded to 16 digits. I've not timed this on a real unit yet but it should help most of the quantile functions.
 Pauli
▼
Posts: 653
Threads: 26
Joined: Aug 2010
What? Just a couple of iterations saved? With the previous method the two initial guesses could be apart by a factor of 10, 100 or even more, and now it's merely 10%. Strange.
I tried the method with the approximate PDF. After the first iteration the error is on the order of 1E-3, and then each step gives about three more valid digits, which matches the 5-7 iterations for 15 digits mentioned earlier. With the true PDF convergence is almost quadratic, as expected.
Pauli, I know you like results that are far, far more precise than required in the final 16-digit result, but a relative error of just 1E-24 in the t-approximation simply is too much. If two consecutive values differ by no more than 1E-24 this means that they match in 24 significant digits (+/- 1 ULP), and already the previous(!) approximation was that precise. In other words, the current approximation has 27 or 30 or even more valid digits. But we need just 16... ;)
Please do not forget: you can use as many digits as you want, but even a 100-digit result rounded to 16 digits has a potential error of 1 unit in the last place. It's only less likely. ;) I would suggest exiting the iteration as soon as the error drops below 1E-16. This way the last two approximations agree in all 16 digits (+/- 1 ULP) and the current result even has some more correct digits. Too risky? Okay, then use 1E-20. ;)
Dieter
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: What? Just a couple of iterations saved? With the previous method the two initial guesses could be apart by a factor of 10, 100 or even more, and now it's merely 10%. Strange.
I didn't exhaustively test thousands of cases but those I did showed modest improvement over most of the range and not a huge penalty at the extremes. Feel free to test lots more & let us know :)
I do very much appreciate the efforts you've put into debugging and assisting with the improvements in the statistical functions.
The far more important factor here is space. The improved method uses about one third of the remaining flash space. At this stage, I'm unwilling to commit that much space to one distribution's inverse, especially when we're already getting accurate results albeit slowly. I'm sure there are plenty of bugs that cause inaccuracies and functions that are simply incorrect some of the time waiting to be discovered. We will need to fix those problems so I've always planned to have a small amount of buffer in the flash reserved for this.
If we had the space, your improved estimates would be in immediately. The code is in subversion and if we figure out a way to recover some space (& we've a few ideas floating around), in it will definitely go.
Quote: With the true PDF convergence is almost quadratic, as expected.
The built-in solver will achieve quadratic convergence almost always, BTW. Brent's method is nice.
Quote:
Pauli, I know you like results that are far, far more precise than required in the final 16-digit result, but a relative error of just 1E-24 in the t-approximation simply is too much. If two consecutive values differ by no more than 1E-24 this means that they match in 24 significant digits (+/- 1 ULP), and already the previous(!) approximation was that precise. In other words, the current approximation has 27 or 30 or even more valid digits. But we need just 16... ;)
I like results that are likely to round correctly, and more digits make this more likely. For functions that are used by other functions (e.g. the normal QF) I like more digits again. I'm not a numerical analyst by profession, so I'm deliberately applying a sledgehammer where it probably isn't required. Which would people prefer: slower but more accurate functions, or faster but possibly incorrect ones?
Changing from 1e-24 to 1e-20 for the relative error will likely mean one iteration difference, and possibly none, for either method. I can live with that penalty for the extra security and comfort it brings :)
 Pauli
▼
Posts: 653
Threads: 26
Joined: Aug 2010
Okay, I have now tried an implementation of Brent's method in VBA, using two initial guesses the same way it's done now (a = the value for 2 d.o.f. and b = the normal quantile). These results were compared with the approximations obtained from initial guesses based on the proposed method.
The results are quite exactly as expected: the current method does not require many more iterations as long as the two initial guesses are sufficiently close, i.e. in the centre and not too far out in the tails. Values like p = 0.001 are still handled with only a few iterations more than required by the improved method.
Things change as soon as we get further out in the distribution tails. At n = 10, try p = 0.0001 or 1E-10 or even 1E-20. The number of iterations is two or three times as high as with an optimized initial guess. It simply makes a difference whether the solver can work with two similar initial estimates or if it only knows the result is somewhere between 6 and 70000. ;) If the calculation takes a second on a 2 GHz system, imagine how long it will take on a real 20s or 30s. Maybe someone can try the following calculations on a re-flashed unit.
I tried the t^-1(p) function on the emulator, version 1.17 673. Here are some results:
   p       n    exact result    WP34s result
----------------------------------------------------------
 1 E-5    10      7,526954         7,526954     so far
 1 E-10   10     25,466008        25,466008     so good
 1 E-15   10     81,040891         22360679
 1 E-20   10    256,434699       7071067812     oops...
 1 E-40   10    2,564 E+05       7,071 E+19
 1 E-100  10    2,564 E+10       "domain error"
I think it's obvious where this 7,071... mantissa comes from. ;)
The t quantile function works fine in the centre. Maybe not as fast as possible, but the results are okay. The further we get out into the tails of the t-distribution, the less reliable the results become. Below a certain limit they are far off.
I see the problem with the available RAM. But I think something has to be done to get at least correct results. Precision is not optional. ;)
Dieter
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: I tried the t^-1(p) function on the emulator, version 1.17 673.
That is a very old version of the software. Current is 793. The results I get aren't anything like what you're seeing - not in a good way: 1E-15 gives a domain error instead of a wrong result. I guess this is slightly better.
At these small values, t(x) seems like it might be wrong too.
Quote: I see the problem with the available RAM. But I think something has to be done to get at least correct results. Precision is not optional.
Agreed, I'll look into this. The lack of space is in flash, not RAM. Well, it is in both, but this isn't requiring much RAM.
If I can't fix it, the domain gets reduced or the function removed :(
 Pauli
▼
Posts: 1,545
Threads: 168
Joined: Jul 2005
If I can vote, please reduce the domain rather than remove!
:)
▼
Posts: 4,587
Threads: 105
Joined: Jul 2005
Please add my vote to Gene's d:)
▼
Posts: 3,283
Threads: 104
Joined: Jul 2005
I'm working on the space issue. Maybe a combination of GCC 4.6.0 with some options used by CodeSourcery G++ (based on GCC 4.5.1) may help. At the moment, the latter produces tighter code than GCC 4.6.0, which puzzles me and lets me hope that we can do even better.
▼
Posts: 2,309
Threads: 116
Joined: Jun 2005
FSF GCC for the ARM always lags behind the development at CodeSourcery. CodeSourcery does send their improvements to the FSF, but it takes a while for them to get integrated into an FSF release. For ARM, using the latest CodeSourcery release is generally better in every way than using the FSF release.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Being a contributor to GCC, I have a fair idea why this lag occurs... My changes were a couple of weeks' work; getting them accepted was far more painful.
 Pauli
Posts: 239
Threads: 9
Joined: Dec 2010
Marcus,
Are you compiling with -Os?
Have you guys considered/tried LLVM? In my experience (and following Apple's party line) you can expect both greater speed and smaller size for ARM (and IA) code.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: Are you compiling with -Os?
We are, but don't blindly trust -Os to reduce code size. On many platforms -O1 is small & often faster :)
For GCC 4 on ARM, -Os seems to be the winner. For GCC 3 on ARM it was -O1 almost always.
We've not tried llvm.
 Pauli
Posts: 3,229
Threads: 42
Joined: Jul 2006
I feel that domain reduction is the right way to go here.
Unlike the normal distribution, which has uses as a function in its own right, the t-distribution is only used for statistical tests (as far as I'm aware - let me know if I'm wrong here), and nobody in statistics really cares about probabilities less than 1e-10. Actually, a lot larger than this is sufficiently good :)
Still, I'd like to fix the underlying problem and have it work globally. I was out most of today so didn't make any significant progress.
 Pauli
▼
Posts: 653
Threads: 26
Joined: Aug 2010
I would not recommend a domain reduction for Student's distribution. There surely are other candidates that might get stripped somehow. Imagine a financial calculator accepting no values beyond 1E10 - "because you won't have that much money anyway". ;)))
Also, according to my results there is no fixed lowest probability for a cutoff. The limit depends on the degrees of freedom. For instance, the switch between the two formulas for the improved initial guess is done at e^(-1,7 df). Beyond that limit, the guess is so close to the true value that at least this (rounded) estimate could be returned. But, as mentioned before, I think neither the domain nor the precision of the results should be reduced as far as the t-distribution is concerned.
Dieter
Posts: 653
Threads: 26
Joined: Aug 2010
Quote:
[Version 1.17 673] is a very old version of the software. Current is 793.
I looked at http://sourceforge.net/projects/wp34s/files/ and there was no newer version of the emulator than this, dated April 17. Where can I find a newer one? The emulator, that is. ;)
Quote:
At these small values, t(x) seems like it might be wrong too.
That's my impression as well. Despite the extremely high internal precision there are lots of pitfalls as far as numeric accuracy is concerned. In my own programs for the 41C and 35s I often use different ways to evaluate a term, so that e.g. small and large values take different paths through the algorithm. Even a simple term like 4*p*(1-p) is not trivial.
Quote:
If I can't fix it, the domain gets reduced or the function removed :(
There are three distributions I would consider mandatory: Normal, Binomial and Student. Removing one of these, or even reducing their domain, is not an option, I think. There surely are other candidates that might get removed without major loss of functionality. If necessary we could have another poll here. ;)
Dieter
▼
Posts: 4,587
Threads: 105
Joined: Jul 2005
Hallo Dieter,
look here for the newest emulator build. But watch out: this is closer to development, so more error-prone than the "official" package at the address you mentioned.
Walter
▼
Posts: 3,283
Threads: 104
Joined: Jul 2005
We're going to assemble a new official release early this week, I hope.
Posts: 239
Threads: 9
Joined: Dec 2010
Dieter,
Any keisan code for this? :)
▼
Posts: 653
Threads: 26
Joined: Aug 2010
Sure. Try this:
p = min(probability, 1-probability);  /* make sure p <= 0.5 */
s = sign(0.5 - probability);          /* set correct sign */
if (-ln(p) < 1.7 * df)
  { /* first guess = normal quantile estimate */
  if (p < 0.2)
    {u = -2*ln(p); x = sqrt(-2*ln(p*sqrt((u-1)*2*pi))) + 1/u^2;}
  else
    {u = (0.5-p)*sqrt(2*pi); x = u + u^3/6;}
  u = (x^3 + x) / 4 / df;
  v = (x^5 / 12 + x^3 / 4) / df^2;
  t = x + u + v;}
else
  {u = 2*p*df*sqrt(pi/(2*df - 1));
  t = sqrt(df) / u^(1/df);}
println(t*s, 0);  /* print first guess (and a dummy zero) */
k = 0;
do {
  d = (tcd(t,df) - p) / tpd(t,df);
  t = t + d;
  err = d / t;
  r = int(log(abs(err)));
  err = round(10^r * err, 3) / 10^r;  /* round error */
  println(t*s, err);  /* print current approximation and error */
  k = k + 1;
}
while (1 + abs(err)^2 <> 1);  /* a somewhat different exit condition ;) */
/* more conservative users may omit the square */
println(ticd(probability,df), k);  /* print exact t-value and number of iterations */
This will generate a small table with the current approximation and the relative error, compared to the previous iteration. You can also nicely see the (well, almost) quadratic convergence.
However, the t-distribution will push Keisan close to its limits - and beyond. Try 1E-100 and 50 degrees of freedom. The iteration returns a result, while Keisan returns a "system error". ;) Also, too many digits may cause problems in Keisan because of insufficient internal precision. So better try 18, 30 or 38 digits.
A real-world implementation would require a bit more, for instance a check for p = 0,5 where t = 0 (leading to a division by zero here).
Dieter
▼
Posts: 239
Threads: 9
Joined: Dec 2010
Thank you very much, Dieter!
Here's my JavaScript "production" code, based on yours:
"StudentTQuantile": function(df, x) {
    if (df != Math.floor(df)) throw Error("bad arg");
    if (x <= 0) return +Infinity; else if (x >= 1) return -Infinity; else if (x == 0.5) return 0;
    var t, p = Math.min(x, 1-x), s = this.sign(0.5 - x);
    if (-this.ln(p) < 1.7 * df) {
        if (p < 0.2) {
            var u = -2*this.ln(p);
            x = Math.sqrt(-2*this.ln(p*Math.sqrt((u-1)*2*Math.PI))) + 1/(u*u);
        }
        else {
            var u = (0.5-p)*Math.sqrt(2*Math.PI);
            x = u + (u*u*u)/6;
        }
        var x3 = x*x*x;
        u = (x3 + x) / 4 / df;
        v = (x3*x*x / 12 + x3 / 4) / (df*df);
        t = x + u + v;
    }
    else {
        u = 2*p*df*Math.sqrt(Math.PI/(2*df - 1));
        t = Math.sqrt(df) / Math.pow(u, 1/df);
    }
    var d, k = 10;
    do {
        d = (this.StudentTDistribution(df, t) - p) / this.StudentTPDF(df, t);
        t += d;
    }
    while (d*d > 1e-30 && k--);
    return t*s;
}
You'll notice the change in the termination criterion to get the result within one ULP (for IEEE 754 doubles), but providing a hard iteration max to deal with too-slow convergence and oscillations.
Here the results for your extreme values test:
   p       n    exact result      ND1 (max 10 iters)   ND1 (max 100 iters)
---------------------------------------------------------------------------
 1 E-5    10    7.526954019339    7.526954019333       7.526954019334
 1 E-10   10    25.466008         25.466010            25.466008
 1 E-15   10    81.040891         80.3924              81.05831
 1 E-20   10    256.434699        Infinity             Infinity
 1 E-40   10    2.564 E+05        Infinity             Infinity
 1 E-100  10    2.564 E+10        19881486106.20       Infinity
I guess a PDF at the mercy of the gamma function may, in part, explain the slow convergence.
I'm going with the max of 10 iterations. For non-extreme values, results within +/- 1 ULP seem to be found in typically 3-5 iterations. The first estimate is great!
Do you have an iOS device? I'd love to give you a redeem code for a copy of ND1 (and CalcPad when it comes out).
Cheers.
▼
Posts: 653
Threads: 26
Joined: Aug 2010
There must be a problem in the Student PDF or (more likely) the CDF. Here are the results evaluated by Keisan, set to 18 digits. All results were obtained after just 2 to 4 (!) iterations. Ten or even more iterations were not required. That's why I think the results of the CDF you use are not exact. But an exact CDF is crucial for an exact quantile.
The initial estimate has a relative error on the order of 10^-2. So quadratic convergence should give a 15-digit result after three iterations, or let's say 4 or 5 under real-world conditions. And that's exactly what I got. ;)
   p       n    quantile      algorithm @ Keisan
---------------------------------------------------
 1 E-5    10    7.526954      7.52695401933965106
 1 E-10   10    25.466008     25.4660080216977262
 1 E-15   10    81.040891     81.040890880039345
 1 E-20   10    256.434699    256.434699318526186
 1 E-40   10    2.564 E+05    2.5645257010761472 E+05
 1 E-100  10    2.564 E+10    2.5645257189481978 E+10
The results seem to be okay. Maybe someone can check them with Mathematica or Maple. I'll see what can be done in Excel/VBA (15-digit floating point arithmetic). Here, decent distribution functions and quantiles are sorely missing. Excel seems to have a very... "special" reputation among statisticians. ;)
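As an independent sanity check on the far-tail branch, the closed-form tail estimate from the earlier post can be evaluated directly in double precision and compared with the 18-digit Keisan value quoted in the table above for p = 1E-100, n = 10 (a sketch, using the formula exactly as given in the Keisan listing):

```python
import math

p, n = 1e-100, 10
u = 2 * p * n * math.sqrt(math.pi / (2 * n - 1))
t_est = math.sqrt(n) / u ** (1 / n)
print(t_est)  # close to the Keisan value 2.5645257189481978e10
```

The estimate alone already agrees with the quoted Keisan result to roughly four significant digits, well within the 1-2% error band claimed for the first guess.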
BTW - no, I do not have any kind of i-device. I'm a bit conservative in these things. ;)
Dieter
▼
Posts: 239
Threads: 9
Joined: Dec 2010
The algorithm is great and not to be blamed for slow convergence.
You were quite right: I found the CDF to be the culprit. For t=25.4, df=10, I'm only getting 5 good digits after the point.
The PDF, conversely, is exact.
I'll see how I can improve this, but as results seem to be quite good already, and convergence does occur for most values with 2 (corrected from 3, after testing) to 5 iterations, I'm quite happy with it already.
The CDF isn't entirely flawed. For t=25.4, df=2 I'm getting 16 good digits.
It's obvious you can do real math. So, yes, I can see how you don't need a permanent math i-companion. ;)
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: Excel seems to have a very... "special" reputation among statisticians. ;)
It has a similar reputation amongst numerical analysts.
Hopefully, I'll get back to the statistical code this week.
 Pauli
Posts: 3,229
Threads: 42
Joined: Jul 2006
Good news, the better initial estimate is back in (or will be when we build next). Some flash space was recovered.
I'm not using Newton's method, however; the PDF would be large, and I expect we're going to need the space for estimates for the other distributions.
 Pauli
▼
Posts: 653
Threads: 26
Joined: Aug 2010
As far as I can see the max. error of the t quantile estimate is about 4%. So the initial interval could even be set to 0,95...1,05 t. ;)
I was about to write this and that concerning numeric accuracy in the distributions, e.g. using ln(1+x) and e^x-1, and avoiding terms like p-0.5 where values below a certain limit simply return 0,5 (cf. the Logistic quantile). But I see much of this work has already been done now, e.g. in the Exponential and Weibull CDF - but not in their QFs, which should be updated the same way. There also is an improvement in the t-quantile for 1 d.o.f. where tan(pi*(p-0,5)) was replaced by cot(p*pi), which I wanted to suggest as well. So it seems this time you were faster, Pauli. At least at some points. ;)
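Both points are easy to demonstrate in ordinary double precision. A generic Python sketch of the two effects (not the 34s code):

```python
import math

# 1 - e^-x for small x: the naive form cancels nearly all digits,
# while -expm1(-x) keeps full precision (true value is x - x^2/2 + ...)
x = 1e-12
naive = 1.0 - math.exp(-x)   # only the first 4-5 digits are correct
accurate = -math.expm1(-x)   # correct to full double precision
print(naive, accurate)

# t quantile for 1 d.o.f. (Cauchy): for tiny p, p - 0.5 rounds to exactly
# -0.5, so tan(pi*(p - 0.5)) returns garbage, while cot(p*pi) = 1/tan(p*pi)
# keeps the full information contained in p.
p = 1e-20
bad = math.tan(math.pi * (p - 0.5))  # ~ -1.6e16: off by orders of magnitude
good = -1.0 / math.tan(math.pi * p)  # ~ -3.18e19: correct
print(bad, good)
```

The first print shows the cancellation directly; the second shows why the cot form matters far out in the tail, where p - 0.5 has already collapsed to -0.5 in the working precision.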
The improved evaluation of the t-quantile (1 d.o.f.) should also be applied to the Cauchy distribution. The latter could even be omitted completely, since it's nothing but the t-distribution with 1 d.o.f. Without Cauchy there also is no need to rework its CDF - since the final step is adding 0,5 there is a point beyond which the result is a plain zero.
I did not look at the t CDF. Does it return 16 valid digits for any t-value, or is there something that can be improved?
By the way - how fast do these functions execute on a real 20s or 30s?
Dieter
Posts: 653
Threads: 26
Joined: Aug 2010
Pauli, I think there is a bug in the t-quantile routine: the solver seems to exit as soon as the result has reached display accuracy. Try p = 0,1 and 4 d.o.f. and you will get 1,5535 in FIX 2, while ALL returns the correct result 1,5532.
The Student CDF also does not work far out in the tails. Again, with four degrees of freedom, t = 1E10 returns 3E-40, and 1E20 should return 3E-80, but instead you get a plain zero. I assume the reason is some kind of digit cancellation: the limit is t = sqrt(1E39). This also leads to substantial execution times for the quantile in that area, probably because more and more significant digits get lost as p drops below something like 1E-70. Values below 1E-78 simply throw a domain error.
Dieter
Edited: 7 May 2011, 4:57 p.m.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: I think there is a bug in the t-quantile routine: the solver seems to exit as soon as the result has reached display accuracy. Try p = 0,1 and 4 d.o.f. and you will get 1,5535 in FIX 2, while ALL returns the correct result 1,5532.
Nice catch. Fixed soon.
 Pauli
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: The Student CDF also does not work far out in the tails. Again, with four degrees of freedom, t = 1E10 returns 3E-40, and 1E20 should return 3E-80, but instead you get a plain zero. I assume the reason is some kind of digit cancellation: the limit is t = sqrt(1E39). This also leads to substantial execution times for the quantile in that area, probably because more and more significant digits get lost as p drops below something like 1E-70. Values below 1E-78 simply throw a domain error.
Not so much a digit cancellation as digit loss. The transform I used from the t CDF to the incomplete beta function involved a sqrt(v + t^2) term which got uninteresting as t got large.
I've found an alternative transformation that looks like it works a lot better.
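The digit loss described here is easy to reproduce in ordinary double precision. A generic sketch of the effect (not the 34s code; the rearrangement shown is one standard workaround, not necessarily the transformation Pauli adopted):

```python
import math

v, t = 4.0, 1e10
s = math.sqrt(v + t * t)  # t*t = 1e20; adding v changes nothing at this scale
print(s == t)             # True: v has vanished from the sum entirely

# A tail-style term 1 - t/sqrt(v + t^2) therefore collapses to zero,
# while the algebraically equivalent v/(s*(s + t)) ~ v/(2 t^2) survives:
print(1.0 - t / s)        # 0.0
print(v / (s * (s + t)))  # 2e-20
```

Since doubles near 1e20 are spaced about 1.6e4 apart, adding v = 4 cannot change the sum, and every trace of v is lost before the square root is even taken; the rewritten form never subtracts nearly equal quantities.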
Now to see if my Internet connection will allow me to commit these two changes or not...
 Pauli
Edited: 7 May 2011, 9:23 p.m.
