# HP Forums

Full Version: SandMath routine of the week: Inverse Gamma Function
You're currently viewing a stripped down version of our content. View the full version with proper formatting.

As a belated celebration of pi-day, here's a short routine to calculate the Inverse Gamma function (positive arguments only) -

Simply using Newton's method and the SandMath, it doesn't get any shorter - and on the CL it quickly converges after a few iterations... worth seeing it in action!

```01	LBL "IGMMA"
02	STO  01
03	LN           initial guess
04	STO  00
05	LBL  00
06	RCL  01
07	RCL  00
08	GAMMA
09	/
10	CHS
11	1
12	+
13	RCL  00
14	PSI
15	/
16	ST-  00
17	VIEW 00
18	ABS
19	1 E-8
20	X<Y?
21	GTO  00
22	RCL  00
23	END
```

Example: in addition to x=3, which other argument also returns Gamma(x) = 2 ?

Answer: 2, XEQ "IGMMA" -> 0,4428773962

Check: GAMMA -> 2.000000001

Cheers,
ÁM

Note: it's a rather crude first-pass, go ahead and improve on it if you're interested...

Edited: 15 Mar 2013, 10:48 a.m.

Buenas tardes, Ángel,

what do you use PSI for in your function set? TIA for enlightenment.

d:-?

The above example shows one usage. PSI is a handy trick to obtain the derivative of Gamma, as well as being very useful in probability problems.

Very nice short routine!
Here is the HP-50g version:

```%%HP: T(3)A(D)F(.);
\<< DUP LN
DO DUP2 GAMMA / NEG 1. + OVER Psi / - LASTARG DROP OVER
UNTIL ==
END NIP
\>>
2. --> 0.442877396485 --> 2.

However, GAMMA overflows for x=3 and x=4, for instance.
This will work for x > 0.886, but it's not a small program anymore:

%%HP: T(3)A(D)F(.);
\<< DUP LN OVER INV 1. UNROT 3. \->ARRY OVER 1.E12 < { [ 3.793 .4539 -1.952 ] } { [ 12.84 .2618 -388400000000. ] } IFTE DOT
DO DUP2 GAMMA / NEG 1. + OVER Psi / - LASTARG DROP OVER - ABS
UNTIL .0000000001 <
END NIP
\>>
x                  INVGAMMA(x)     GAMMA(INVGAMMA(x))   t(s)
1/2*sqrt(pi)           1.50000000008       0.886226925455    0.51
1.           2.00000000001                   1.    0.41
2.                      3.                   2.    0.37
4.           3.66403279721       4.000000000002    0.40
120.                      6.                 120.    0.31
87178291200.                     15.         87178291200.    0.34
1.26416966261E12        15.9876543210       1.26416966261E12  0.89
1.E60        48.3499572940       9.99999999872E59  0.46
7.77777777777E200        121.991689534      7.77777777648E200  3.42
```

Cheers,

Gerson.

We should have included PSI in the 34S too...

- Pauli

Glad it piqued your curiosity, Gerson. Interestingly enough the SandMath implementation does converge for x=3 and x=4, albeit it takes a long time. It's somewhat of a miracle because the estimations become negative for a while and we all know what happens to Gamma in the negative plane.

I guess there must be a clever way to tweak the calculation to reduce the number of iterations, perhaps keeping the action in the positive plane. In a google search I found almost no references to the InverseGamma function, so I guess it's totally uninteresting (which makes it attractive to my eyes :-)

Best,'AM

Edited: 16 Mar 2013, 6:21 a.m.

Something left for the 43S ;-) Anyway it's included as a library function AFAICS.

d:-)

Yes, it is in the standard library. It was also coded as an internal function but it got left out :-( The 43S will definitely have this one. A native inverse gamma is tempting too :-)

- Pauli

¡Hola, Ángel!

Quote:
In a google search I found almost no references to the InverseGamma function,

References to InverseGamma are indeed hard to find. I think I had already stumbled onto your first link last year when I was trying to implement this function. The best I was able to to gave only three digits. The method you have presented, using Psi, is the best I've ever seen. Thanks again for posting!

Here is a shorter and faster RPL version. The number of iterations is 6 or 7 for most arguments. It will work in the range 1 < x <= 1.34830732701E488.

Best regards,

Gerson.

```InvGamma:
%%HP: T(3)A(D)F(.);
\<< DUP LN DUPDUP \v/ 2.21 * OVER .16 * + .194 + / / DUP 2. < { DROP 2. } IFT
DO DUP2 GAMMA / NEG 1. + OVER Psi / - LASTARG DROP OVER - ABS
UNTIL .0000000001 <
END NIP
\>>
x                  InvGamma(x)       Gamma(InvGamma(x)) t(s)
1.00000000001           2.00000000001        1.00000000001    0.07
1.2           2.34593737867                  1.2    0.43
1.5           2.66276634532                  1.5    0.51
2.                      3.                   2.    0.53
4.           3.66403279721        4.00000000002    0.53
24.                      5.                  24.    0.40
120.                      6.                 120.    0.35
12345678.           11.5153468559        12345678.0005    0.35
87178291200.                     15.         87178291200.    0.33
1.26416966261E12        15.9876543210       1.26416966261E12  0.35
2.E30        29.5598564460       2.00000000026E30  0.34
3.E50        42.4792187916       1.99999999970E50  0.31
1.E60        48.3499572940       9.99999999872E59  0.29
5.E70        54.6170625140       4.99999999915E70  0.34
6.E100        71.3783728942      6.00000000063E100  0.51
9.91677934871E149                  97.      9.91677934871E149  0.66
7.77777777777E200        121.991689534      7.77777777648E200  0.41
8.E300        168.326353359      7.99999999942E300  0.56
1.E357        193.196868201      9.99999998531E356  0.83
9.E400        212.260312687      9.00000000902E400  1.09
1.23456789012E450        233.199708617      1.23456789318E450  1.47
1.34830732701E488        249.172968355      1.34830732373E488  1.79
```

Hi Gerson, I really have trouble to read RPL, can you provide the RPN equivalent for the calculation of the initial guess? Looks like a substantial improvement to the initial idea!

This is what I figured out:

```01 	LBL "IGMMA"
02	STO 01
03	ENTER^
04	ENTER^
05	LN
06	ENTER^
07	2.21
08	*
09	1/X
10	,16
11	*
12	+
13	,194
14	+
15	 /
16	 /
17	STO 00
18	LBL 00
19	RCL 01
20	RCL 00
21	GAMMA
22	 /
23	CHS
24	1
25	+
26	RCL 00
27	PSI
28	 /
29	ST- 00
30	VIEW 00
31	ABS
32	1E-8
33	X<Y?
34	GTO 00
35	RCL 00
36	END
```

Best, 'AM

Edited: 16 Mar 2013, 6:39 p.m.

Hi Ángel,

OVER is equivalent to RCL Y, but it's easier if you know where these constants and expression come from (see below *).

This can be a possible RPN equivalent for the first part of the program:

```01   LBL "IGMMA"
02   STO 01
03   ENTER^
04   LN
05   ENTER
06   SQRT
07   2,21
08   *
09   RCL Y
10   ,16
11   *
12   +
13   ,194
14   +
15   /
16   /
17   2
18   X<=Y?
19   X<>Y
20   STO 00
21   LBL 00
...
```

Cheers,

Gerson.

----------------

(*) The following is part of a curve fitting made with help of an old version of
DataFit.

```
Model Definition:
Y = x/(a+b*x+c*sqr(x))
Number of observations = 172
Number of missing observations = 0
Solver type: Nonlinear
Nonlinear iteration limit = 250
Diverging nonlinear iteration limit =10
Number of nonlinear iterations performed = 23
Residual tolerance = 0,0000000001
Sum of Residuals = -1,97982012453611E-02
Average Residual = -1,1510582119396E-04
Residual Sum of Squares (Absolute) = 0,071461272245453
Residual Sum of Squares (Relative) = 0,071461272245453
Standard Error of the Estimate = 2,05632625029686E-02
Coefficient of Multiple Determination (R^2) = 0,9994754674
Proportion of Variance Explained = 99,94754674%
Adjusted coefficient of multiple determination (Ra^2) = 0,99946926
Durbin-Watson statistic = 0,181006272327966
Regression Variable Results
Variable Value	                Standard Error	        t-ratio	        Prob(t)
a	 0,194180061078165	6,75779102290044E-02	2,873425065	0,00458
b	 0,160140574611016	5,79167114529964E-04	276,5014977	0,0
c	 2,20860298743044	1,42205819324635E-02	155,3103099	0,0
-------------------------------------------------------------------------------------------------------------------------------------
X Value            Y Value          Calc Y          Residual       % Error         Abs Residual    Min Residual   Max Residual
1             0	           0	           0	             0	           0	0	        -0,09225268066	0,09044327801
2  0,6931471804	0,2310490602	0,3233017409	-0,09225268066	-39,92774547	0,09225268066
3  1,791759469	0,4479398673	0,5212429458	-0,07330307854	-16,36449084	0,07330307854
4  3,17805383	        0,635610766	0,6848643811	-0,0492536151	-7,749021529	0,0492536151
5  4,787491743	0,7979152904	0,8263771459	-0,02846185547	-3,567027204	0,02846185547
6  6,579251212	0,9398930303	0,9517404203	-0,01184739003	-1,260504084	0,01184739003
7  8,525161361	1,06564517	1,064573069	0,001072101061	0,10060582	0,001072101061
...
167  691,1834011	4,114186911	4,091157824	0,02302908714	0,5597481991	0,02302908714
168  696,3073651	4,120161924	4,096382192	0,02377973199	0,5771552776	0,02377973199
169  701,4372638	4,126101552	4,101568196	0,02453335547	0,5945892306	0,02453335547
170  706,5730622	4,132006212	4,106716318	0,02528989421	0,6120487945	0,02528989421
171  857,9336698	4,289668349	4,241416568	0,04825178142	1,124837109	0,04825178142
172  1128,523771	4,514095084	4,423651805	0,09044327801	2,003574943	0,09044327801
-------------------------------------------------------------------------------------------------------------------------------------
```

In the X column, ln(InvGamma(x)); in the Y column, ln(InvGamma(x))/x. This was the best fit out of 57 models.

Edited: 16 Mar 2013, 9:52 p.m.

Hi Gerson, many thanks for the RPL help and the additional details on the data fit - yes, that's a good idea that has prompted me to look for other methods to get the initial estimation, with the goal to reduce the number of iterations even further (albeit your method is already very good, perfect for fast machines like the 50G and the CL :-)

I thought about some value based on the Stirling approximation, and that took me straight to the Lambert function. Now I can see how things fit in the first URL, so I'm using David Cantrell's formula as initial guess.

This works for x greater than x0=1.461632, the zero of PSI (aka the local minimum of Gamma for x>0). I've added a branch to deal with x<x0, also including arguments up to x=2 - somehow I like the "alternative" result more than the standard x=3. This branch goes to the negative values for the solution and takes a lot of iterations, but still it's better that nothing.

Below I've listed both versions, the one on the right side is using your approach. The other is two lines longer but uses the same room in memory. Indeed the number of iterations is reduced, saving precious time in slow machines - despite the initial front-loaded calculation of WL0.

Somehow is nice to have these three functions (Lambert, PSI and Gamma) in the same code, a nice trinity working together. Needless to say this will make it in the next revision of the SandMath, which BTW will have two bank-switched banks for each page - four in total.

```01 	LBL "IGMMA" 	LBL "IGMMA"
02	STO 01  	STO 01
03	2       	CF 01
04	X<>Y    	2
05	X>Y?            X<>Y
06	GTO 02  	X<=Y?
07	LN      	SF 01
08	X=0?    	ENTER^
09	1       	LN
10	STO 00  	FS? 01
11	GTO 00  	GTO 02
12	LBL 02  	ENTER^
13	,036534 	SQRT
14	+       	2.21
15	PI      	*
16	ST+ X   	RCL Y
17	SQRT    	,16
18	 /      	*
19	LN      	+
20	ENTER^  	,194
21	ENTER^  	+
22	1               /
23	E^X	        /
24	 /      	LBL 02
25	WL0     	X=0?
26	 /      	1
27	,5      	STO 00
28	+       	LBL 00
29	STO 00   	RCL 01
30	LBL 00  	RCL 00
31	RCL 01   	GAMMA
32	RCL 00	        /
33	GAMMA    	CHS
34	 /      	1
35	CHS     	+
36	1       	RCL 00
37	+       	PSI
38	RCL 00   	 /
39	PSI     	ST- 00
40	 /      	VIEW 00
41	ST- 00  	ABS
42	VIEW 00  	1E-8
43	ABS     	X<Y?
44	1E-8    	GTO 00
45	X<Y?    	RCL 00
46	GTO 00   	END
47	RCL 00
48	END
```

Cheers,
'AM.

Edited: 17 Mar 2013, 6:16 a.m.

Hi Ángel,

Quote:
somehow I like the "alternative" result more than the standard x=3. This branch goes to the negative values for the solution and takes a lot of iterations, but still it's better that nothing.

Wouldn't it be better to write s specific program suited for that branch? Perhaps only the initial estimates should be different. It is interesting to notice that PSI can be replaced with LN for the positive branch (arguments > 1, returning values > 2), since Psi(x) get closer to ln(x) as x grows. Thus, an HP-42S version is possible. There will be occasional differences of one unit it the last significant digits for arguments < 30 or so.

There is a couple of unnecessary operations in the RPL program above (also in the RPN version). Fix below.

Cheers,

Gerson.

----------------------------------------------------------

```InvGamma:
%%HP: T(3)A(D)F(.);
\<< DUP LN DUP \v/ 2.21 * SWAP .16 * + .194 + DUP 2. < { DROP 2. } IFT
DO DUP2 GAMMA / NEG 1. + OVER Psi / - LASTARG DROP OVER - ABS
UNTIL .0000000001 <
END NIP
\>>
```

```01   LBL "IGMMA"
02   STO 01
03   ENTER^
04   LN
05   ,16
06   X<>Y
07   *
08   LASTX
09   SQRT
10   2,21
11   *
12   +
13   ,194
14   +
15   2
16   X<Y?
17   X<>Y
18   STO 00
19   LBL 00
...
HP-42S version:
00 {65-Byte Prgm }       19 LBL 00
01 LBL "IGMM"            20 RCL 01
02 STO 01                21 RCL 00
03 ENTER                 22 GAMMA
04 LN                    23 /
05 0,16                  24 +/-
06 X<>Y                  25 1
07 *                     26 +
08 LASTX                 27 RCL 00
09 SQRT                  28 LN
10 2,21                  29 /
11 *                     30 STO- 00
12 +                     31 ABS
13 0,194                 32 1E-10
14 +                     33 X<Y?
15 2                     34 GTO 00
16 X<Y?                  35 RCL 00
17 X<>Y                  36 END
18 STO 00
2  XEQ IGGM   -->  3  ( ~6 s)
5040  XEQ IGGM   -->  8  ( ~3 s)
248 GAMMA  -->  2.09409007702E485 XEQ IGGM --> 248 ( ~12 s)
```

P.S.: This HP-42S version uses only the stack. There are no noticeable differences in the running times however. The conversion to the wp34s should be straightforward.

```00 {74-Byte Prgm }       19 Rv
01 LBL "IGMM"            20 Rv
02 ENTER                 21 STO ST Z
03 LN                    22 X<>Y
04 SQRT                  23 STO ST T
05 0.16                  24 GAMMA
06 X<>Y                  25 /
07 *                     26 +/-
08 X<> ST L              27 1
09 2,21                  28 +
10 RCL+ ST L             29 RCL ST Z
11 *                     30 LN
12 0,194                 31 /
13 +                     32 STO- ST Z
14 2                     33 ABS
15 X<Y?                  34 1E-10
16 X<>Y                  35 X<Y?
17 X<> ST T              36 GTO 00
18 LBL 00                37 R^
38 END
```

--------------------

Edited to remove my comment about your use of Lambert W. Somehow I imagined it was inside of loop. I think I hadn't woken up yet :-)

Edited: 18 Mar 2013, 12:52 a.m.

Very nice Gerson, I think we've nailed it - save the "left" branch of course.To be continued... :)

I went ahead and timed the execution times for both versions, see the table below. Using the more complex initial estimation yields consistently shorter times as you can see.

Times in seconds.Never mind the actual values (on V41, standard speed) but the deltas tell the story:

```x	D. Contrell	DataFit
1	2.370024	2.9339976
1.5	15,4800000	17,6000040
2	17,96998	17.219989
2.5	11.85998	17.469972
3	10.98   	17.66
3.5	10.36008	15.289992
4	10.47996	14.72004
4.5	10.179972	15.17004
5	10.110024	14.7900024
10	9.34992 	14.230008
15	8.740008	13.86
20	9.36    	14.349996
```

Cheers,
ÁM

Hi Ángel,

Significantly faster! Better use the curve fit equation only in programs for calculators that lack the Lambert W then.
Here is a WP 34S version. It's very for arguments near 1, probably because I am using LN instead of Psi. Lambert W, Gamma, pi and e are all there, only Psi is missing. (Psi is available in a library, but I have yet to read the WP 34S Blue Book and find out how to use it in a program).

Cheers,

Gerson.

```WP 34S version:
001 LBL A                 017 # 1/2
002 # pi                  018 +
003 # 086                 019 <> XZZX
004 /                     020 GAMMA
005 RCL+ Y                021 /
006 # pi                  022 +/-
007 STO+ X                023 INC X
008 SQRT                  024 RCL Z
009 /                     025 LN
010 LN                    026 /
011 ENTER^                027 y<> T
012 ENTER^                028 -
013 # eE                  029 CNVG? 00
014 /                     030 RTN
015 Wp                    031 BACK 012
016 /                     032 END

1  A   -->  2.000000000000012  ( 5.5 s)
2  A   -->  3.000000000000002  (3.2 s)

5040  A   -->  7.999999999999998  (1.8 s)

205  GAMMA  -->  1.326057243693621E384 A --> 205  ( 1.0 s)

x	 time(s)
0.89   20.6
0.90   12.5
1.00    5.5
1.50    3.4
2.00    3.2
2.50    2.7
3.00    2.8
3.50    2.8
4.00    2.8
4.50    2.8
5.00    2.9
10.00    2.2
15.00    2.2
20.00    2.3
100.00    2.2
500.00    2.0
5000.00    1.8
1.0E006    2.0
1.0E020    1.5
1.0E050    1.3
1.0E100    1.0
1.0E200    1.3
1.0E350    1.1
```

Quote:
Psi is available in a library, but I have yet to read the WP 34S Blue Book and find out how to use it in a program.

```        XEQ'[PSI]'
```

This can be entered via the keyboard and the g-shift Greek letters or via CAT and navigating there and pressing XEQ.

- Pauli

Edited: 19 Mar 2013, 2:27 a.m.

Nice use of some of the 34S's new features by the way.

Can I add this routine to the library?

- Pauli

Of course! Again, thanks Ángel for presenting us the method :-)

Gerson.

Gerson you're a gentleman - my pleasure of course, and thanks to you for the improvements.

Hello all,

nevertheless it is a interesting algorithm, but the hp50g can solve the problem with it's own built in function "ROOT", a very nice and short RPL proggie:

```
\<< \-> W              @Input y-value
\<<
'tX' GAMMA W =  @Equation which should be solved
'tX' 2 ROOT     @Command for solving with initial value 2
'tX' PURGE      @Delete temp. variable
\>>
\>>
```

Interesting feature: If you use a 1 as a initial guess, you get the negative solutions.

Greetings
peacecalc

Compatible with SSIZE8 and one step shorter:

```+	LBL'[GAMMA][^-1]'
+		# [pi]
+		# 86
+		/
+		RCL+ Y
+		# [pi]
+		STO+ X
+		SQRT
+		/
+		LN
+		[<->] XXYY
+		# eE
+		/
+		W[sub-p]
+		/
+		# 1/2
+		+
+		LBL 00
+			[<->] XZZX
+			GAMMA
+			/
+			+/-
+			INC X
+			RCL Z
+			LN
+			/
+			[<->] XZZY
+			-
+			CNVG? 00
+				RTN
+		GTO 00
+	END
```

I guess [<->] XZZY takes longer than y[<->] T, but I haven't noticed any difference when the argument is 0.89 (21 second with a chronometer, about the same time I obtained with TICK in the previous version).

DBLON would require the CVNG?02, but I don't see a way to do this without extra instructions inside to loop. Simply changing the parameter to 02 would cause endless loop for certain arguments.

Gerson.

Hi,

Interesting technique for using the numeric solver in a program.

Although the generic built-in solver can be used to compute this and other functions, like Lambert's W, for instance, a specific solver will usually be faster.

Cheers,

Gerson.

Gerson, thanks for this update.

I suspect there isn't much time difference between the shuffle command and the swap. The other overheads of interpreting instructions will likely far exceed the time to do either.

What about using CNVG? 03. This picks CNVG? 00 in single precision mode and CNVG? 02 in double.

- Pauli

Thanks! I'd found this information in page 91 of the Blue Book:

`"3 = Choose the best for the mode set, resulting in taking 0 for SP and 2 for DP (see Appendix H)."`

This is plain clear, but the reference to Appendix H somehow diverted me. I was expecting to find more information about CNVG?, but Walter was meaning more information about SP and DP, I realize now. So, let it be CNVG? 03 then.

Gerson.

Hello Gershon,

```Original from Gerson W. Barbosa: ... a specific solver will usually be faster
```

Of course that is right, but it is easy to use (without a great bunch of knowledge in numeric analysis). And I've stated an interesting behavior of my "brutal force" method: With huge arguments my 50g needs let's say some seconds for the solution, and with small arguments in the scale of 10, it is very fast compared to the values of post #15 that posted Gershon.

Certainley the duration of computation depends on the displaying mode because I use the built in function "ROOT".

Greetings
peaceglue

Thanks!

As I suspected, the use of LN instead of Psi is responsible for the overall slowness, especially for arguments in the beginning of the range. Calling the library digamma program would take up only one step, but it doesn't preserve the stack. This would have to be taken care of, which would require extra steps.

Since Psi need not be exact in this application, I decided to use an approximation:

```Psi(x) ~ ln(x) - 1/(2*x)
For the sake of crediting, the first character in Ángel's name is unreadable in the documentation, perhaps because of the diacritic.
Would 7 extra steps compensate for the faster execution?
Gerson.
[pre]
----------------------------------------------
LBL'[GAMMA][^-1]'
# [pi]
# 86
/
RCL+ Y
# [pi]
STO+ X
SQRT
/
LN
[<->] XXYY
# eE
/
W[sub-p]
/
# 1/2
+
LBL 00
[<->] XZZX
GAMMA
/
+/-
INC X
# 1/2
[<->] TXYZ
/
RCL L
LN
STO- Y
x[<->] L
[<->] YZTX
+/-
/
[<->] XZZY
-
CNVG? 03
RTN
GTO 00
END
----------------------------------------------
x	        time(s)
first    updated
version   version
0.89   20.6       4.4
0.90   12.5       3.5
1.00    5.5	      2.3
1.50    3.4       1.7
2.00    3.2       1.6
2.50    2.7       1.4
3.00    2.8       1.5
3.50    2.8       1.5
4.00    2.8       1.5
4.50    2.8       1.5
5.00    2.9       1.4
10.00    2.2       1.3
15.00    2.2       1.4
20.00    2.3       1.4
100.00    2.2       1.3
500.00    2.0	      1.3
5000.00    1.8       1.2
1.0E006    2.0       1.4
1.0E020    1.5       1.1
1.0E050    1.3       1.1
1.0E100    1.0       0.9
1.0E200    1.3       1.1
1.0E350    1.1       0.9
----------------------------------------------
```

Edited: 25 Mar 2013, 12:20 a.m.

Quote:
For the sake of crediting, the first character in Ángel's name is unreadable in the documentation, perhaps because of the diacritic.

The diacritic is gone now.

Quote:
Would 7 extra steps compensate for the faster execution?

In a library routine, I think so. If anyone objects, they can always key in the earlier version or delete the routine completely.

- Pauli

Here is another HP-42S version. Psi(x) is approximated as ln(x) - 1/(2*x) instead of simply ln(x) to speed execution up, especially for small arguments.

```00 {86-Byte Prgm }       23 STO ST Z
01 LBL "InvGam"          24 GAMMA
02 ENTER                 25 /
03 LN                    26 +/-
04 SQRT                  27 1
05 0.16                  28 +
06 X<>Y                  29 R^
07 *                     30 RCL+ ST T
08 X<> ST L              31 1/x
09 2.21                  32 RCL ST T
10 RCL+ ST L             33 LN
11 *                     34 STO- ST Y
12 0.194                 35 X<> ST L
13 +                     36 Rv
14 2                     37 +/-
15 X<Y?                  38 /
16 X<>Y                  39 STO- ST Z
17 X<> ST T              40 ABS
18 LBL 00                41 1E-9
19 Rv                    42 X<Y?
20 Rv                    43 GTO 00
21 STO ST Z              44 R^
22 X<>Y                  45 END
x                 InvGam(x)         GAMMA(InvGam(x)) t(s)
1                       2                    1     1.0
1.2           2.34593737867                  1.2     4.1
1.5           2.66276634533        1.50000000001     4.2
2                       3                    2     4.4
4           3.66403279721        4.00000000002     4.2
24                       5                   24     3.2
120                       6                  120     3.1
5040                       8                 5040     2.2
12345678           11.5153468559        12345678.0005     3.1
87178291200                      15          87178291200     3.0
1.26416966261E12        15.9876543210       1.26416966261E12   3.1
2.E30        29.5598564460       2.00000000026E30   3.1
2.E50        42.4792187916       1.99999999970E50   2.1
1.E60        48.3499572940       9.99999999872E59   2.5
5.E70        54.6170625140       4.99999999915E70   2.9
6.E100        71.3783728942      6.00000000063E100   4.2
9.91677934871E149                  97.      9.91677934871E149   5.9
7.77777777777E200        121.991689534      7.77777777648E200   3.5
8.E300        168.326353359      7.99999999942E300   4.8
1.E357        193.196868201      9.99999998531E356   6.9
9.E400        212.260312687      9.00000000901E400   9.0
1.23456789012E450        233.199708617      1.23456789318E450  11.9
1.34830732701E488        249.172968355      1.34830732373E488  14.6
```

Saving two instructions now, so five extra steps actually:

```	LBL'[GAMMA][^-1]'
# [pi]
# 86
/
RCL+ Y
# [pi]
STO+ X
SQRT
/
LN
[<->] XXYY
# eE
/
W[sub-p]
/
# 1/2
+
LBL 00
[<->] XZZX
GAMMA
/
+/-
INC X
RCL T
LN
RCL* L
STO+ X
DEC X
RCL/ T
/
STO+ X
[<->] XZZY
-
CNVG? 03
RTN
GTO 00
END

Running time comparion among versions from the
older to the more recent one, measured with TICKS
x           t (s)
0.89   20.6   4.4   4.5
0.90   12.5   3.5   3.6
1.00    5.5	  2.3   2.3
1.50    3.4   1.7   1.7
2.00    3.2   1.6   1.7
2.50    2.7   1.4   1.4
3.00    2.8   1.5   1.5
3.50    2.8   1.5   1.5
4.00    2.8   1.5   1.5
4.50    2.8   1.5   1.5
5.00    2.9   1.4   1.5
10.00    2.2   1.3   1.3
15.00    2.2   1.4   1.4
20.00    2.3   1.4   1.5
100.00    2.2   1.3   1.3
500.00    2.0	  1.3   1.3
5000.00    1.8   1.2   1.3
1.0E006    2.0   1.4   1.4
1.0E020    1.5   1.1   1.2
1.0E050    1.3   1.1   1.1
1.0E100    1.0   0.9   0.9
1.0E200    1.3   1.1   1.0
1.0E350    1.1   0.9   1.0
```

Gerson.

Edited: 28 Mar 2013, 7:19 p.m.

Fabulous work, Gerson!

Not quite!

I completely forgot to test it in SSIZE8 mode. When I finally did it this afternoon, I noticed it was taking much longer than when in SSIZE4 mode. Fortunately the fix is very easy: just replace the only RCL T instruction with RCL Z:

```                        ...
+/-
INC X
RCL Z
LN
RCL* L
...
```

Occasional stack-size-wise incompatibilities are a side effect of having two stack sizes. I only change to SSIZE8 when doing many calculations involving complex number (which lately I have no need to). I would prefer a parallel four-level stack for doing this, though, as in the HP-15C. But there's a reason this couldn't be implemented on the wp34s, so the 8-level stack for such situations is a blessing (but only in this case, IMHO). I wonder what would be the consequences of user-settable stack size, as some have suggested...

A really fabulous work was yours, adapting matrices routines to work on the wp34s!

Regards,

Gerson.

Obrigado, Gerson, por a sua programa!

Quote:
I only change to SSIZE8 when doing many calculations involving complex number (which lately I have no need to). I would prefer a parallel four-level stack for doing this, though, as in the HP-15C. But there's a reason this couldn't be implemented on the wp34s, so the 8-level stack for such situations is a blessing (but only in this case, IMHO).

All a matter of habitude: I run on SSIZE8 all the time. I love RPN calculating free of stack-overflow worries.

d:-)

Quote:
Obrigado, Gerson, por a sua programa!

Just a couple of corrections: por + a = pela. But is this case it should be "pelo seu programa" (por + o = pelo) because "programa" is a masculine noun. The gender of words ending in -a are most always feminine, as you know, but many nouns ending in -a derived from Greek are masculine, like programa, drama, diagrama, anagrama, for instance.

Quote:
All a matter of habitude: I run on SSIZE8 all the time.

Yes, I know there is a number of wp34s user who prefer the 8-level stack. That's why I realized all library programs should ideally run under both stack sizes, without changing the user setting. It would be easier just to save the user settings in the beginning of the program, set the proper stack size and restore the user settings when the program returns, but I don't know whether this is possible. At least I haven't found a way to do it (but I haven't read all the book yet).

Regards,

Gerson.

Making a program behave like a built in command is fairly easy. It is much easier in XROM but that isn't available to users.

```    LBL'ABC'
// Allocate local variables, save the stack and the current mode settings
LocR 09
STOS .01
STOM .00
// Do what ever you want to the machine's state and produce a result in X
.
.
.
// Recall the system stack, move the result into the saved stack and set up Last X
RCLM .00
x<> .01
STO L
RCLS 01
END
```

And despite Walter's comment an eight level stack can still run out, especially when programming :-)

- Pauli

Edited: 30 Mar 2013, 2:35 a.m.

Pauli,

Quote:
And despite Walter's comment an eight level stack can still run out, especially when programming

If you want to sabotage a system you can always, of course ;-) As stated above, I was talking about calculating not programming. Some minimum intelligence given, you cannot overload an eight-level stack in real calculations IMHO.

d:-)

Thanks! I would use this technique in complex and long programs. Preserving the stack only is a nice bonus. However, in a relatively short program like Gamma-1 seven extra steps might be too many.

Quote:
And despite Walter's comment an eight level stack can still run out, especially when programming :-)

I don't remember ever getting out of stack when doing calculations, but then again mine usually are not complicated. The 8-level stack should be hand for calculating this one, though:-)

Gerson.

Quote:
The 8-level stack should be hand for calculating this one, though:-)

Hmmmh - let me see:
```input         x            y         z         t
350 ENTER     350          350
661.5         661.5        350
/ x^2  .2     0.2          [...]^2
*
1 +           (...)
3.5 y^x
1 -           [...]                                 = factor 1
25500 ENTER   25500        25500     [f1]
6.875e-6      6.875e-6     25500     [f1]
*             ...          [f1]
1 x<>y        ...          1         [f1]
-             [..]         [f1]
-5.2656 y^x   [..]^(-5.3)  [f1]
*             {...}
1 +           (...)
0.286 y^x
1 -           [...]
5 *
SQRT
```
Unless I'm mistaken, three levels are sufficient here.

d:-?

Quote:
Unless I'm mistaken, three levels are sufficient here.

`0.835724531752514`
I remember there is an expression somewhere in the archives which is not solvable within four stack registers, unless intermediate results are stored in a numbered register, but I cannot find it right now.

This example is from one of many discussions about RPN versus AOS:

Edited: 30 Mar 2013, 6:16 p.m.

Quote:
I remember there is an expression somewhere in the archives which is not solvable within four stack registers, unless intermediate results are stored in a numbered register, but I cannot find it right now.

But there was an earlier thread: It Can Be Done!

Kind regards

Thomas

Thanks, Thomas!

Yes, that was the expression you presented in message #14 in the first link. Of course no problem with it on the WP-34S with SSIZE8 mode on.

Your comment is very impressive: "Whichever solution you might choose you will always run into a stack-overflow and what's worse you probably won't even notice."

Walter does have a point :-)

Cheers,

Gerson.