▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Since the conference is well and truly over, I thought I'd post my solutions and see what other people have come up with and what other improvements are possible.
Let the discussions and optimisations begin :-)
- Pauli
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
For the WP 34S.
A basic routine that runs over each value of X in the positive quadrant and
solves the circle equation to get the number of pixels set in that column.
001 [cmplx]DROP Ignore centre
002 0 Accumulator
003 RCL Y
004 STO[times] Z Keep r^2
005 LBL 00
006 DEC X
007 x<0?
008 GTO 01 Check for loop finish
009 ENTER[^]
010 x[^2]
011 RCL- T
012 +/-
013 [sqrt] sqrt(r^2 - x^2)
014 CEIL
015 STO+ Z Accumulated
016 DROP
017 GTO 00
018 LBL 01 Exit code -- multiply by 4
019 DROP
020 4
021 [times]
This program takes 6.5 seconds for a radius of 999. For a radius
of 5,000 it takes 32.7 seconds. Execution time is is linear in the
radius. It might be possible to tweak this a bit more and save a
few fractions of a second, however the execution time will remain
linear in the radius.
- Pauli
Edited: 26 Sept 2011, 2:17 a.m.
Posts: 3,229
Threads: 42
Joined: Jul 2006
For the WP 34S.
Rather than solving the circle equation each time, an incremental
approach is also possible. Bresenham's circle algorithm manages
to generate the outer points on a circle using only integer shifts
and additions. The following program implements Bresenham's circle
algorithm, however the results are slightly off because it is using
the pixel centre points rather than their extremes. This error is
almost certainly correctable without introducing a performance loss.
001 [cmplx]DROP Ignore centre
002 BASE 10 Integer mode
003 STO 00 Save radius as X coordinate
004 +/-
005 STO I Error accumulator
006 0
007 STO 01 Y coordinate
008 STO J Area accumulator
009 LBL 00
010 RCL 00
011 x<? 01 Check for loop termination
012 GTO 01
013 STO+ J Accumulate column
014 RCL 01 Update Y coordinate
015 SL 01
016 STO+ I Error = Error + 2 Y + 1
017 INC I
018 INC 01 Y = Y + 1
019 RCL I
020 x<0? Check for change in X
021 GTO 00
022 DEC 00 X = X - 1
023 RCL 00
024 SL 01 Error = Error – 2 X
025 STO- I
026 GTO 00
027 LBL 01 Finish up
028 RCL J Area in octant
029 SL 01 Area in quadrant plus duplicated overlap
030 RCL 01
031 x[^2] Correct the duplicated overlap
032 -
033 SL 02 Quadruple to full circle
034 DECM Switch mode back
Running time for a radius of 999 is 0.9 seconds, for a radius of
5,000 it is 4.7 seconds. Again this approach will give time linear
in the radius but each iteration is very fast.
- Pauli
Edited: 26 Sept 2011, 2:18 a.m.
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
Paul,
I was aware of this method. It was used in the Microsoft BASIC in my old MSX 8-bit computer. Great to see an implementation, even if the results are slightly diffent.
Gerson.
From the MSX Red Book:
----------------------------------------------------------------------------
Address... 5BBDH
This is the circle mainloop. Because of the high degree of
symmetry in a circle it is only necessary to compute the
coordinates of the arc from zero to forty-five degrees. The
other seven segments are produced by rotation and reflection of
these points. The parametric equation for a unit circle, with T
the angle from zero to PI/4, is:
X=COS(T)
Y=SIN(T)
Direct computation using this equation, or the corresponding
functional form X=SQR(1-Y^2), is too slow, instead the first
derivative is used:
dx
---- = -Y/X
dy
Given that the starting position is known (X=RADIUS,Y=0), the X
coordinate change for each unit Y coordinate change may be
computed using the derivative. Furthermore, because graphics
resolution is limited to one pixel, it is only necessary to
know when the sum of the X coordinate changes reaches unity and
then to decrement the X coordinate. Therefore:
Decrement X when (Y1/X)+(Y2/X)+(Y3/X)+... => 1
Therefore decrement when (Y1+Y2+Y3+...)/X => 1
Therefore decrement when Y1+Y2+Y3+... => X
All that is required to identify an X coordinate change is to
totalize the Y coordinate values from each step until the X
- 150 -
5. ROM BASIC INTERPRETER
coordinate value is exceeded. The circle mainloop holds the X
coordinate in register pair HL, the Y coordinate in register
pair DE and the running total in CRCSUM. An equivalent BASIC
program for a circle of arbitrary radius 160 pixels is:
10 SCREEN 2
20 X=160:Y=0:CRCSUM=0
30 PSET(X,191-Y)
40 CRCSUM=CRCSUM+Y :Y=Y+1
50 IF CRCSUM<X THEN 30
60 CRCSUM=CRCSUM-X:X=X-1
70 IF X>Y THEN 30
80 CIRCLE(0,191),155
90 GOTO 90
The coordinate pairs generated by the mainloop are those of a
"virtual" circle, such tasks as axial reflection, elliptic
squash and centre translation are handled at a lower level
(5C06H).
----------------------------------------------------------------------------
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: Great to see an implementation, even if the results are slightly diffent.
I'm sure the algorithm can be corrected to give the correct results. It is a matter of reworking the maths and changing the update part of the loop. I might do this if I get really keen, but I've already spent more time than I can afford on this interesting little challenge.
- Pauli
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
That was not a criticism. Actually, the way the circles are drawn have been simplified to make the challenge easier, as Walter has noticed. I guess both algorithms produce smoother circles, therefore no need to change them. The point is the existence of such algorithms, which were created in a time when speed really mattered.
Gerson.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
No criticism was taken :-)
Definitely stick to integer mode and avoiding expensive operations is a win although a lot less so that when these algorithms were first discovered.
- Pauli
Posts: 3,229
Threads: 42
Joined: Jul 2006
For the WP 34S.
Again running of the value of x and solving the circle equation but in integer mode results in a faster program.
001 [cmplx]DROP Ignore centre
002 BASE 10
003 0 Accumulator
004 RCL Y
005 STO[times] Z Keep r^2
006 LBL 00
007 DEC X
008 x<0?
009 GTO 01 Check for loop finish
010 ENTER[^]
011 x[^2]
012 RCL- T
013 +/-
014 [sqrt] sqrt(r^2 - x^2)
015 FS? C
016 INC X
017 STO+ Z Accumulated
018 DROP
019 GTO 00
020 LBL 01 Exit code -- multiply by 4
021 DROP
022 SL 02
023 DECM
Timings in this case are 7.6 seconds for a radius of 5,000 and 1.4
seconds for a radius 999. Again, these timings are linear in the
radius of the circle.
One problem here. The integer square root code is incorrectly
rounding for some input values. Since a new firmware image was not
permitted by the rules of the contest, this solution is invalid. A fixed firmware image has been committed but not yet moved into a release package.
It should be possible to rectify the problem with the square root
by squaring the result and comparing against the initial value.
The error only occurs when the square root isn't exact and it only
seems to round the result up. Even with the extra steps involved,
this method would likely be faster than the floating point equivalent.
- Pauli
Edited: 26 Sept 2011, 2:29 a.m.
Posts: 3,229
Threads: 42
Joined: Jul 2006
Some other semi-random comments.
- I couldn't identify an explicit formula for the result. Such a formula will almost certainly be fastest.
- I tried using the integration command. It isn't very accurate on the 34S but it is very very fast. On a 15C LE, it took ages.
- Pauli
▼
Posts: 260
Threads: 0
Joined: Oct 2008
HI,
I come first to an algorithm real close to your code.
Here is a version for HP-41C:
01 LBL"HHC2011 13 - 25 +
02 0 14 SQRT 26 ST+ 00
03 * 15 RCL Z 27 CLx
04 * 16 * 28 1
05 STO 00 17 INT 29 +
06 LBL 00 18 LASTx 30 x<y?
07 ENTER^ 19 FRC 31 GTO 00
08 R^ 20 x=0? 32 4
09 / 21 GTO 01 33 RCL 00
10 x^2 22 CLx 34 *
11 1 23 1 35 END
12 x<->y 24 LBL 01
Some detail perhaps needs an optimization.
For RPL user, here a version for HP-28S calculator, which is basic enough to be used with out modification on any RPL system.
« -> r x y @ input three arguments as specified
« 0 @ initiate number of pixel in circle
0 r 1 - FOR k @ main loop k=0 to r-1
1 @ 1^2 is 1 !
k r / SQ - SQRT @ x = sqrt(1 -(k/r)^2)
r * CEIL @ n = CEIL(x.r) at row where y=k/r
+ @ add to number of pixel
NEXT
4 * @ multiply by 4 to get full cricle
»
»
'HHC2011' STO
Only one quadrant of the circle is investigated, total number of pixels is 4 times the quadrant count.
I may suggest that a version using only an half-quadrant may speed up the process and is coherent with symmetry. The total number of pixels will be 8 times one half-quadrant count.
@ version 2.0
« -> r x y @ input three arguments as specified
« 0 @ initiate number of pixel in circle
0 r 2 SQRT / FOR k @ main loop k=0 to r/SQRT(2)
1 @ 1^2 is 1 !
k r / SQ - SQRT @ x = sqrt(1 -(k/r)^2)
r * CEIL @ n = CEIL(x.r) at row where y=k/r
.5 k + - @ remove (k+0.5) pixels
+ @ add to number of pixel
NEXT
8 * @ multiply by 8 to get full cricle
»
»
'HHC2011b' STO
Here a proposed code for HP-15C and any other great classic HP calculators.
# --------------------------------------------
# Tcl/Tk HEWLETT·PACKARD 15C Simulator program
# Created with version 1.2.13
# --------------------------------------------
000 { }
001 { 42 21 11 } f LBL A
002 { 0 } 0
003 { 20 } ×
004 { 20 } ×
005 { 34 } x<->y
006 { 2 } 2 _
007 { 11 } \/x
008 { 10 } ÷
009 { 43 44 } g INT
010 { 44 0 } STO 0
011 { 42 21 0 } f LBL 0
012 { 43 33 } g R^
013 { 10 } ÷
014 { 43 11 } g x²
015 { 1 } 1
016 { 34 } x<->y
017 { 30 } - _
018 { 11 } \/x
019 { 43 33 } g R^
020 { 20 } ×
021 { 43 44 } g INT
022 { 43 36 } g LSTx
023 { 42 44 } f FRAC
024 { 43 20 } g x=0
025 { 22 1 } GTO 1
026 { 43 35 } g CLx
027 { 1 } 1
028 { 42 21 1 } f LBL 1
029 { 40 } +
030 { 2 } 2
031 { 15 } 1/x
032 { 30 } -
033 { 45 0 } RCL 0
034 { 30 } -
035 { 40 } +
036 { 42 5 0 } f DSE 0
037 { 43 5 0 } g CF 0
038 { 45 0 } RCL 0
039 { 43 30 3 } g TEST 3 (x >= 0 ?)
040 { 22 0 } GTO 0
041 { 33 } R_dwn
042 { 8 } 8
043 { 20 } ×
044 { 43 32 } g RTN
# --------------------------------------------
Comments and suggestions are welcome.
Edited: 26 Sept 2011, 6:28 a.m. after one or more responses were posted
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Interesting. I did do just an octant for the Bresenham approach. This algorithm generates just an octant of a circle and relies on symmetry to get the rest of the points on the circumference.
Rather than keeping track of the diagonal, I accumulated the total area under the octant, doubled this and removed the duplicate area which is a perfect square. This allowed everything to be kept in integer mode.
- Pauli
Posts: 653
Threads: 26
Joined: Aug 2010
Hi C.Ret,
Quote:
I come first to an algorithm real close to your code.
Here is a version for HP-41C:
(... 35 step program ...)
Some detail perhaps needs an optimization.
At least it can be done shorter and faster. ;-)
Here's basically the same algorithm with some improvements. R was moved under the root and a more elegant CEIL emulation was used. For positive arguments, ceil(x) equals int(x + sign(frc(x)). Due to the special sign-implemenation in the 41-series an additional test for non-zero x was required. All this makes the program shorter and faster. Add a label and rtn if you prefer. ;-)
01 RDN
02 RDN
03 STO 00
04 STO 01
05 DSE 00
06 GTO 01
07 GTO 02
08 LBL 01
09 RCL 01
10 x^2
11 RCL 00
12 x^2
13 -
14 SQRT
15 ENTER
16 FRC
17 x>0?
18 SIGN
19 +
20 INT
21 +
22 DSE 00
23 GTO 01
24 LBL 02
25 4
26 *
For r > 1 steps 06, 07 and 24 can be omitted.
The next improvement is an update of the user instructions, telling the user not to enter the center coordinates but just the radius. This would save two more steps. ;-)
Dieter
▼
Posts: 260
Threads: 0
Joined: Oct 2008
Hello
Thank you for the much more elegant CEIL way :
n INT Lastx FRAC SIGN + (will run whith any classic except special cas of SIGN of HP-41). I was trying to adapt my code for Hp-15c, but I was unable to lacate any SIGN function. May the sequence be used instead :
FIX 0 n RND ?
I will try it later and spare steps in HP-15 listing as you demonstre it to me.
Considering only a quadrant was a good idea.
Considering only an octrant will be a good one toot, if I had not made thinks the worst way. By scaning from 0 to r/sqrt(2), I spare about 30% of running time.
But, I only realize that it is possible to spare 70% of the loops by scanning k from (r-1) to r/sqrt(2) !
And using symetry to count this small octrand part two time and adding the "common" square.
A picture better than words :
« -> r x y @ input three arguments
« r 2 SQRT / CEIL -> p @ compute upper limit p
« 0 @ initiate number of pixel in octant
p r 1 - FOR k @ main loop k from r-1 downto p
1 @ 1^2 is 1 !
k r / SQ - SQRT @ x = sqrt(1 -(k/r)^2)
r * CEIL @ n = CEIL(x.r) at row where y=k/r
+ @ add to number of pixels (in region A)
NEXT
2 * @ add symetric/transpose part (region A')
p SQ + @ add inside square of p.p pixels (region B)
4 * @ multiply by 4 to get full cricle
»
»
»
'HHC2011c' STO
I hope this last approache will be the more efficient, especially for large circles, concerning number of loops.
Now 5000,0,0 HHC2011c returns result in less than 1'25" on HP-28S
I still using a complex computation at each step, the solution propose by Paul Dale based on a Bresemhan method trigger me. Have to found a way to spare operations and tests. Using an 'error' or 'indicator' counter is already tracked.
Edited: 27 Sept 2011, 12:17 p.m. after one or more responses were posted
▼
Posts: 653
Threads: 26
Joined: Aug 2010
Quote:
Thank you for the much more elegant CEIL way :
n INT Lastx FRAC SIGN +
Or also
ENTER
FRC
SIGN
+
INT
Quote:
(will run whith any classic except special cas of SIGN of HP-41).
Well, the 41 was the first HP (pocket) calculator I know of that featured a SIGN function. So none of the classics (and many later models as well) offered that function and this approach cannot be used there.
Quote:
I was trying to adapt my code for Hp-15c, but I was unable to lacate any SIGN function.
There you are. ;-)
Quote:
May the sequence be used instead :
FIX 0 n RND
?
No, this won't work. It simply rounds the argument to the next higher or lower (!) integer. CEIL always rounds up.
But since sign(x) = x / abs(x) for x<>0, it could be accomplished this way:
ENTER
FRC
ENTER
ABS
x=0?
+
x<>0?
/
+
INT
This works for x >= 0.
And finally, a more tricky version:
ENTER
FRC
x<>0?
e^x
SQRT
INT
+
INT
This also works for negative x.
Ah, those were the good old days where calculators were so slow that using transcendental functions didn't matter much. ;-)
Dieter
Posts: 68
Threads: 1
Joined: Jul 2005
Hi C.Ret,
My solution for the HP-15C also used the optimization that you posted, where it is only necessary to find the number of pixels in ~30% of the lines. The total number of pixels is then 4 * (s1 + 2 * s2) where s1 = (INT(r/sqrt(2)+1))^2 and s2 is the number sum of the number of pixels on each line above the square on the main diagonal as shown in your picture.
Lines 29 to 34 are yet another variant on the CEIL function.
For the radius=5000 case on the HP-15C LE the program below computes the number of pixels in 16 seconds.
Best regards,
Eamonn.
001 LBL C
002 Rv
003 Rv
004 STO 0 // Radius (r) -> Reg 0
005 x^2
006 STO 3 // r^2 -> Reg 3
007 RCL 0
008 2
009 SQRT
010 /
011 1
012 STO -0 // r-1 -> Reg 0
013 +
014 INT
015 STO 1 // y1 -> Reg 1
016 X^2
017 STO 2 // s1 -> Reg 2
018 CLX
019 STO 4 // 0 -> s2
020 LBL A
021 RCL 0 // (r-1)
022 RCL 1 // y
023 TEST 7 // x<y?
024 GTO B
025 x^2
026 CHS
027 RCL +3 // r^2-y^2
028 SQRT
029 FRAC
030 TEST 0 // x<>0
031 1
032 LAST X
033 +
034 INT // X = ceil(r^2-y^2)
035 STO +4 // s2 = s2 + x
036 1
037 STO +1
038 GTO A
039 LBL B
040 2
041 RCL *4
042 RCL +2
043 4
044 *
045 RTN
Posts: 275
Threads: 41
Joined: Mar 2010
Here is an HP41 program with your idea (take R in X)
01 *LBL'Y
02 ENTER^
03 X^2
04 STO 00
05 2
06 /
07 STO 01
08 X<>Y
09 CF 00 ***
10 0
11 DSE Y
12 GTO 00 ***
13 1 ***
14 GTO 01 ***
15 *LBL 00
16 RCL 00
17 RCL Z
18 X^2
19 -
20 RCL 01
21 X<=Y?
22 SF 00
23 FS? 00
24 X<>Y
25 RDN
26 SQRT
27 INT
28 LASTX
29 FRC
30 X#0?
31 SIGN
32 +
33 FS?C 00
34 GTO 01
35 +
36 DSE Y
37 GTO 00
38 *LBL 01
39 X^2
40 X<>Y
41 ENTER^
42 +
43 +
44 4
45 *
46 END
*** step are for R=1 case (can be removed)
for the rounding in 'old' classic you can use
INT
LASTX
FRAC
X#0? *** SIGN calculate
ENTER^ *** duplicate if not zero
X#0? *** still the same to test at
/ *** can divide as not zero an give a real SIGN function :)
+
as a plus it does not perturb the stack too much
I use this on an HP97 :)
Olivier
Edited: 27 Sept 2011, 2:51 a.m.
▼
Posts: 275
Threads: 41
Joined: Mar 2010
A faster one (shorter loop no R=1 special case, R still in X)
01 *LBL'Y (53 bytes)
02 ENTER^
03 STO 00
04 STO 01
05 STOx 01
06 2
07 SQRT
08 /
09 INT
10 LASTX
11 FRC
12 X#0?
13 SIGN
14 +
15 X^2
16 LASTX
17 RCL Z
18 -
19 X<>Y
20 *LBL 00
21 RCL 00
22 RCL Z
23 +
24 X^2
25 RCL 01
26 -
27 CHS
28 SQRT
29 INT
30 LASTX
31 FRC
32 X#0?
33 SIGN
34 +
35 STO+ X
36 +
37 ISG Y
38 GTO 00
39 4
40 *
41 END
Some timings 50 -> 8024 in 8 sec
100 -> 31796 in 16 sec
540 -> 918168 in 82 sec
1000 -> 3145520 in 150 sec
5000 -> 78559640 in 750 sec (should be 30 sec with a CL ...)
HP42S one, better use of stack and recall arithmetic
01 *LBL"Y"
02 ENTER
03 STO 00
04 STO 01
05 STOx 01
06 2
07 SQRT
08 /
09 IP
10 LASTX
11 FP
12 X#0?
13 SIGN
14 +
15 X^2
16 LASTX
17 RCL- ST Z
18 X<>Y
19 *LBL 00
20 RCL 01
21 RCL 00
22 RCL+ ST T
23 X^2
24 -
25 SQRT
26 INT
27 LASTX
28 FRC
29 X#0?
30 SIGN
31 +
32 RCL+ ST X
33 +
34 ISG ST Y
35 GTO 00
36 4
37 *
38 END
sorry, no timing ...
Edited: 28 Sept 2011, 6:58 a.m. after one or more responses were posted
▼
Posts: 275
Threads: 41
Joined: Mar 2010
A stack only version for hp42s
00 { 58-Byte Prgm }
01>LBL "X"
02 ENTER
03 ENTER
04 STO+ ST Y
05 2
06 SQRT
07 ÷
08 IP
09 LASTX
10 FP
11 X!=0?
12 SIGN
13 +
14 X^2
15 LASTX
16 RCL- ST T
17 +/-
18 X<>Y
19>LBL 00
20 RCL ST Z
21 RCL- ST Z
22 RCL× ST Z
23 SQRT
24 FP
25 X!=0?
26 SF 00
27 X<> ST L
28 IP
29 FS?C 00
30 ISG ST X
31>LBL 01 *** as NOP
32 STO+ ST X
33 +
34 DSE ST Y
35 GTO 00
36 4
37 ×
38 END
start with a default state (as said) so flag 0 reset
A faster one for HP41
01>LBL "W"
02 ENTER^
03 STO 00
04 STO+ 00
05 2
06 SQRT
07 ÷
08 INT
09 LASTX
10 FRC
11 X!=0?
12 SIGN
13 +
14 X^2
15 STO 01
16 X<>Y
17 LASTX
18 -
19 0
20>LBL 00
21 RCL 00
22 RCL Z
23 -
24 LASTX
25 *
26 SQRT
27 INT
28 LASTX
29 FRC
30 X!=0?
31 SIGN
32 +
33 +
34 DSE Y
35 GTO 00
36 ST+ X
37 RCL 01
38 +
39 4
40 ×
41 END
timing (not the same machine, here a 41C, before a 41CX)
50 in 7 sec
540 in 68 sec
1000 in 126 sec
5000 in 620 sec
Edited: 28 Sept 2011, 6:08 a.m.
▼
Posts: 163
Threads: 7
Joined: Jul 2007
I did it slightly differently, with about the same timings:
41C 34 bytes, 5000: 10m 11.56s 78'559'640
Original with explanation (still at 37 byes) here
Posts: 275
Threads: 41
Joined: Mar 2010
A better one for 42S (shorter loop)
01 >LBL "X"
02 STO 00
03 STO+ 00
04 STO 01
05 2
06 SQRT
07 ÷
08 IP
09 LASTX
10 FP
11 X!=0?
12 SIGN
13 +
14 X^2
15 X<> 01
16 RCL- ST L
17 0
18 >LBL 00
19 RCL 00
20 RCL- ST Z
21 RCL× ST Z
22 SQRT
23 IP
24 LASTX
25 FRAC
26 X!=00?
27 SIGN
28 +
29 DSE ST Y
30 GTO 00
31 RCL+ ST X
32 RCL+ 01
33 4
34 ×
35 END
Posts: 25
Threads: 1
Joined: Sep 2011
I haven't written a program for the 41 since 1987 when I got my 28C, but I can't pass up a good contest. I dusted off my calculator and manual and here's what I came up with:
01 LBL"PC Program Contest
02 RDN drop center values
03 RDN
04 STO 00 radius -> 00
04 STO 01 loop index -> 01
06 X^2
07 STO 02 r^2 -> 02 for faster loop
08 0 sum on stack
09 LBL 00
10 RCL 02 r^2
11 RCL 01 r^2, i+1
12 1 r^2, i+1, 1
13 - r^2, i
14 X^2 r^2, i^2
15 - r^2-i^2
16 SQRT sqrt(r^2-i^2)
17 INT 17-24 is my CEIL function
18 LASTX
19 FRC
20 CHS
21 SIGN
22 X>0?
23 CLX
24 - end of CEIL function
25 + add to running total
26 DSE 01
27 GTO 00
28 4 times 4 quadrants
29 *
30 END
(Surely there's a better way to do the CEILING function. ???)
This program would have been even more interesting if we did not assume that the circle fit on the display panel. Also, about the idea of only counting pixels that are at least 50% covered, is there an easy way to do that without integrals?
-Wes L
▼
Posts: 653
Threads: 26
Joined: Aug 2010
Quote:
(Surely there's a better way to do the CEILING function. ???)
At least there's another option (as well as that using SIGN mentioned in other posts). ;-)
The '41 features a MOD function, so a true CEIL function can be setup like this:
ENTER
ENTER
-1
MOD
-
That -1 may be stored somewhere beforehand to increase execution speed.
Once again, tricks like these are not required on the 35s since it features an INTG function (equivalent to FLOOR elsewhere):
CEIL(x) = -INTG(-x).
+/-
INTG
+/-
In our case, instead of adding CEIL(x) we can simply subtract INTG(-x):
...
+/-
INTG
-
...
Dieter
Posts: 468
Threads: 17
Joined: May 2011
Another RPL solution :
« DROP2 -> r '4*Sum(n=0,r-1,CEIL(Sqrt(1-(n/r)^2)*r))' »
DROP2 is just to ignore x and y. Sum and Sqrt are the special symbols for
Very simple but takes ~ 75s. for 5000 radius in aprox mode
Edited: 28 Sept 2011, 5:49 p.m.
▼
Posts: 260
Threads: 0
Joined: Oct 2008
Salut Gilles,
A better solution :
« DROP2 -> r
« r 2 SQRT / CEIL -> p '4*Sum(k=p,r-1,p*p+2*CEIL(Sqrt(1-(k/r)^2)*r))'
»
»
This version spare execution time for large radious r.
For explanations see posts HHC 2011 Programming Pulzze - Spoiler -
Posts: 2,761
Threads: 100
Joined: Jul 2005
Quote:
I couldn't identify an explicit formula for the result. Such a formula will almost certainly be fastest.
I've found this approximation empirically from my results:
n ~ r*(pi*r + 4)
Gerson.
Edited: 26 Sept 2011, 8:37 a.m.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
I also figured out a couple of approximations but they aren't good enough :-(
On the 34S using integrate results in a very fast and approximate answer:
001 [cmplx]DROP
002 x^2
003 STO 00
004 0
005 RCL L
006 [integrate] 00
007 4
008 *
009 RTN
010 LBL 00
011 x^2
012 RCL- 00
013 +/-
014 [sqrt]
015 CEIL
An equivalent program on the 15C LE caused the integrator to run for ages -- I gave up after quite a few minutes.
- Pauli
Posts: 2,761
Threads: 100
Joined: Jul 2005
Here's mine. I hope there are no other HP-12C program so I can win in the fastest HP-12C program category :-)
Cheers,
Gerson.
▼
Posts: 275
Threads: 41
Joined: Mar 2010
40 bytes for Hp 42S (or HP 41) and 2 registers, rouding ok
01 * LBL "Z"
02 RCL ST Z
03 STO 00
04 STO 01
05 STOx 01
06 DSE 00
07 GTO 00 ***
08 GTO 01 ***
09 *LBL 00
10 RCL 01
11 RCL 00
12 X^2
13 -
14 SQRT
15 INT
16 LAST X
17 FRC
18 X#0? new rounding
19 SIGN
20 +
21 +
22 DSE 00
23 GTO 00
24 *LBL 01 ***
25 4
26 x
27 END
Steps with *** can be removed for r > 1
Less register for 42 bytes, HP 42 or 41, rounding ok
01 * LBL "Z"
02 RCL ST Z
03 ENTER
04 ENTER
05 STO 00
06 STOx 00
07 DSE ST Y
08 GTO 00 ***
09 GTO 01 ***
10 *LBL 00
11 RCL 00
12 RCL ST Z
13 X^2
14 -
15 SQRT
16 INT
17 LAST X
18 FRC
19 X#0? new rounding
20 SIGN
21 +
22 +
23 DSE ST Y
24 GTO 00
25 *LBL 01 ***
26 4
27 x
28 END
Only stack !!! only for HP 42 and 44 bytes (if we count a reg as 7 bytes it's a winner: now the rounding is correct :) )
01 * LBL "Z"
02 Rv
03 Rv
04 ENTER
05 ENTER
06 ENTER
07 STOx ST Z
08 DSE ST Y
09 GTO 00 ***
10 GTO 01 ***
11 *LBL 00
12 RCL ST Y
13 RCLx ST Z
14 RCL- ST T
15 +/-
16 SQRT
17 +
18 IP
19 LAST X
20 FP
21 X#0? new rounding
22 SIGN
23 +
24 DSE ST Y
25 GTO 00
26 *LBL 01 ***
27 4
28 x
29 END
( FIX 00 RND doesn't work :( need to work a bit more harder.
But I have no stack left ... see later )
Rounding for the 2 first programs is ok now :)
And for the third also :)
Olivier
Edited: 26 Sept 2011, 3:21 p.m. after one or more responses were posted
▼
Posts: 653
Threads: 26
Joined: Aug 2010
The first two versions will not work correctly. The CEIL function here was implemented by adding 0,5 and rounding to the next integer. This is basically the same as a simple 1 + INT. That's why the routine causes as error if the argument of the CEIL function is an integer. For instance, for r = 5 there are two points where the sqrt is exactly 3 resp. 4. At that point adding 0,5 will cause the value to get rounded up to 4 resp. 5 (instead of leaving it at 3 resp. 4). So the returned result is not 88 (correct), but 96 instead.
Dieter
▼
Posts: 275
Threads: 41
Joined: Mar 2010
You were right, I see the bug just after posting.
The correction is now done for the three programs :)
Olivier
Posts: 653
Threads: 26
Joined: Aug 2010
The basic approach is similar to my suggestion for the '41, so both can be improved this way: Instead of starting the loop with x = r - 1 (which is done by the first DSE) let's begin with x = r. This will not change the result since the first loop will add sqrt(r^2 - r^2), i.e. a plain zero. But it can handle the case r = 1 without any additional code, thus saving four steps, while on the other hand one more loop has no perceivable effect on execution time. It even handles r = 0 correctly. ;-)
Here are two versions of this algorithm. It's very basic and for sure not optimized for speed, but I like it for its compact code and readability. Especially the 35s version. ;-)
HP-41/42 HP35s
01 LBL "HHC" H001 LBL H
02 STO 00 H002 STO R
03 x^2 H003 STOx R
04 STO Z (42s: STO ST Z) H004 STO X
05 LastX H005 RCL R
06 LBL 01 H006 RCL X
07 R^ H007 x^2
08 RCL 00 H008 -
09 x^2 H009 sqrt
10 - H010 +/-
11 SQRT H011 INTG
12 ENTER^ H012 -
13 FRC H013 DSE X
14 X#0? H014 GTO H005
15 SIGN H015 4
16 + H016 x
17 INT H017 RTN
18 +
19 DSE 00
20 GTO 01
21 4
22 *
23 END
R is assumed in to be entered X. Add 2x RDN if you want to do it according to the original rules.
Dieter
Posts: 2,761
Threads: 100
Joined: Jul 2005
Here's the second version of my 12C program. No attempt to optimize it for least numbered registers usage has been made. Actually, I should be doing my homework instead :-)
-----------------------------------------------
Keystrokes |Display |
[f][P/R] | |
[f]CLEAR[PRGM] |00- |
[ENTER] |01- 36 |
[STO]0 |02- 44 0 |
[ENTER] |03- 36 |
[ENTER] |04- 36 |
[x] |05- 20 |
[STO]1 |06- 44 1 |
[CLx] |07- 35 |
[STO]2 |08- 44 2 |
[RDN] |09- 33 |
2 |10- 2 |
[g][SQRT] |11- 43 21 |
[/] |12- 10 |
[-] |13- 30 |
[g][INTG] |14- 43 25 |
[-] |15- 30 |
[STO]4 |16- 44 4 |
[x<>y] |17- 34 |
[STO]3 |18- 44 3 |
[g][x<=y] |19- 43 34 |
[g][GTO]36 |20- 43,33 36 |
1 |21- 1 |
[STO][-]3 |22- 44 30 3 |
[RCL]0 |23- 45 0 |
[RCL]1 |24- 45 1 |
[RCL]3 |25- 45 3 |
[ENTER] |26- 36 |
[x] |27- 20 |
[-] |28- 30 |
[g][SQRT] |29- 43 21 |
[-] |30- 30 |
[g][INTG] |31- 43 25 |
[STO][-]2 |32- 44 30 2 |
[RCL]4 |33- 45 4 |
[RCL]3 |34- 45 3 |
[g][GTO]19 |35- 43,33 19 |
[RCL]2 |36- 45 2 |
[ENTER] |37- 36 |
[+] |38- 40 |
[RCL]1 |39- 45 1 |
[+] |40- 40 |
[RCL]4 |41- 45 4 |
[RCL]0 |42- 45 0 |
[-] |43- 30 |
[ENTER] |44- 36 |
[x] |45- 20 |
[+] |46- 40 |
4 |47- 4 |
[x] |48- 20 |
[g][GTO]00 |49- 43,33 00 |
[f][P/R] | |
1 R/S -> 4 ( - )
5 R/S -> 88 ( - )
540 R/S -> 918,168 ( 3.1 s)
5000 R/S -> 78,559,640 ( 25.5 s)
(timings on the 12C+, firmware date 2008-06-28)
-----------------------------------------------
5 CLS
10 INPUT R
15 R2 = R * R
20 L = R - INT(R * (1 - 1 / SQR(2)))
30 I = R: S = 0
35 IF I <= L THEN 70
40 I = I - 1
50 S = S - INT(R - SQR(R2 - I * I))
60 GOTO 35
70 N = 4 * ((L - R) ^ 2 + R * R + 2 * S)
80 PRINT R, N
90 GOTO 10
-----------------------------------------------
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
HP-41CX, HP-42S and wp-34s versions
------------------------------------------------------------
HP-41CX:
01>LBL'CR
02 STO 00
03 ENTER^
04 ENTER^
05 ENTER^
06 2
07 SQRT
08 /
09 -
10 INT
11 STO 01
12 STO 02
13 CLX
14 STO 03
15 CLX
16 RCL Y
17 *
18>LBL 00
19 DSE 00
20 RCL X
21 RCL 00
22 X^2
23 -
24 SQRT
25 RCL Z
26 -
27 IP
28 STO+ 03
29 RDN
30 DSE 01
31 GTO 00
32 RCL 03
33 STO+ X
34 +
35 RCL 02
36 X^2
37 +
38 4
39 *
40 .END.
1 XEQ CR -> 4 ( 1.7 s )
5 R/S -> 88 ( 1.9 s )
540 R/S -> 918,168 ( 1 min 18.7 s )
5000 R/S -> 78,559,640 ( 12 min 12.1 s )
------------------------------------------------------------
HP-42S:
00 { 57-Byte Prgm }
01>LBL "CR"
02 STO 00
03 ENTER
04 ENTER
05 ENTER
06 2
07 SQRT
08 ÷
09 -
10 IP
11 STO 01
12 STO 02
13 CLX
14 STO 03
15 RCL+ ST Y
16 ×
17>LBL 00
18 DSE 00
19 RCL 00
20 X^2
21 RCL- ST Y
22 +/-
23 SQRT
24 RCL- ST Z
25 IP
26 STO+ 03
27 Rv
28 DSE 01
29 GTO 00
30 RCL 03
31 STO+ ST X
32 +
33 RCL 02
34 X^2
35 +
36 4
37 ×
38 X<0?
39 +/-
40 .END.
1 XEQ CR -> 4 ( 1.0 s )
5 R/S -> 88 ( 1.1 s )
540 R/S -> 918,168 ( 45.6 s )
5000 R/S -> 78,559,640 ( 6 min 58.3 s )
------------------------------------------------------------
wp-34s:
001 LBL C
002 STO 00
003 FILL
004 2
005 SQRT
006 /
007 -
008 IP
009 STO 01
010 STO 02
011 CLx
012 STO 03
013 RCL+ Y
014 ×
015 DEC 00
016 RCL 00
017 x^2
018 RCL- Y
019 +/-
020 SQRT
021 RCL- ST Z
022 IP
023 STO+ 03
024 Rv
025 DSE 01
026 BACK 11
027 RCL 03
028 STO+ X
029 +
030 RCL 02
031 x^2
032 +
033 4
034 ×
035 RTN
1 XEQ C -> 4 ( - )
5 R/S -> 88 ( - )
540 R/S -> 918,168 ( 1.2 s )
5000 R/S -> 78,559,640 ( 8.7 s )
------------------------------------------------------------
▼
Posts: 556
Threads: 9
Joined: Jul 2007
42S on iPhone 3.1.3 brings:
5000 -> 78,559,640 -> 0.5 sec
;)
Edited: 29 Sept 2011, 2:03 a.m.
Posts: 163
Threads: 7
Joined: Jul 2007
41C 37 bytes
r= 100 8.75s 31'796
r=5000 6m 58.66s 78'559'640
*LBL"PX"
RDN
RDN
STO 01
CLST
*LBL 02 X Y Z T
INT n x s
ST+ Z
SIGN
+ x s
RCL 01
ST+ X 2r x s
RCL Y x 2r x s
-
LASTX
*
SQRT
CHS
RCL 01
+
X>Y? can never be equal
GTO 02
SIGN 1 x s s
- x-1 s s s
X^2 x-1^2
-
+ 2*s-(x-1)^2
RCL 01
X^2
-
-4
*
END
I move the center of the circle to (r,r) and consider only the quadrant where x,y <= r.
Then a pixel (x,y) is *dark* when it is below the arc, or:
y <= r - sqrt(x*(2*r-x))
The number of dark pixels below the arc for a given x is:
n(x) = INT(r - sqrt(x*(2*r-x)))
And I'm finally rid of that pesky CEIL ;-)
I determine s = sum of all n(x) till n<x, then the total number of dark pixels in the quadrant is:
2*s - (x-1)^2
Well Gene, that was a good one. Endless fun.
Cheers, Werner
▼
Posts: 275
Threads: 41
Joined: Mar 2010
Very nice idea to count the non lit pixels ...
PS: Werner, you have a strange 41C, timing your program on my 41CX give me 13.6 sec for 100, 1min10 for 540 and 10min52 for 5000 ??
Some variation
HP41
{54 BYTES}
01>LBL'V
02 STO 00
03 STO 01
04 ST+ 01
05 RCL 00
06 2
07 SQRT
08 /
09 -
10 INT add after 10: X=0?
11 RCL 00 SF 00
12 X^2
13 RCL Y
14 X^2
15 +
16 2
17 / add after 17: FS?C 00
18>LBL 00 GTO 01
19 RCL 01
20 RCL Z
21 -
22 LASTX
23 *
24 SQRT
25 RCL 00
26 -
27 INT
28 +
29 DSE Y
30 GTO 00 add after 30: LBL 01
31 8 to have a good result for R=1,2,3
32 *
33 END
a bit faster : 540 in 1min02, 5000 in 9min26
For HP42, not timed but the inner loop is shorter :)
00 {47 BYTES}
01>LBL'V
02 STO 00
03 ST+ ST X
04 RCL 00
05 RCL 00
06 2
07 SQRT
08 /
09 -
10 INT
11 RCL 00
12 X^2
13 RCL Y
14 X^2
15 +
16 2
17 /
18>LBL 00
19 RCL ST Z
20 RCL- ST Z
21 RCL* ST Z
22 SQRT
23 RCL- 00
24 INT
25 +
26 DSE ST Y
27 GTO 00
28 8
29 *
30 END
apply the same correction to get good result for R=1,2,3
Edited: 29 Sept 2011, 4:49 p.m.
▼
Posts: 163
Threads: 7
Joined: Jul 2007
My CV died a few years ago, so I rely on the excellent simulator found on TOS. I had no idea the timings were off, though.
Cheers, Werner
Posts: 1,619
Threads: 147
Joined: May 2006
Quote:
wp-34s:
001 LBL C
002 STO 00
003 FILL
004 2
005 SQRT
006 /
007 -
008 IP
009 STO 01
010 STO 02
011 CLx
012 STO 03
013 RCL+ Y
014 ×
015 DEC 00
016 RCL 00
017 x^2
018 RCL- Y
019 +/-
020 SQRT
021 RCL- ST Z
022 IP
023 STO+ 03
024 Rv
025 DSE 01
026 BACK 11
027 RCL 03
028 STO+ X
029 +
030 RCL 02
031 x^2
032 +
033 4
034 ×
035 RTN
1 XEQ C -> 4 ( - )
5 R/S -> 88 ( - )
540 R/S -> 918,168 ( 1.2 s )
5000 R/S -> 78,559,640 ( 8.7 s )
------------------------------------------------------------
Same idea, but less steps, and all stack (and I think a bit faster):
001 RCL Z
002 FILL
003 STO+ Z
004 2
005 SQRT
006 /
007 CEIL
008 STO Z
009 STOx Z
010 -
011 RCL T
012 RCL- Y
013 RCLx Y
014 SQRT
015 CEIL
016 STO+ Z
017 STO+ Z
018 DROP
019 DSE X
020 BACK 09
021 4
022 RCLx Z
Instead of using y=sqrt(r^2-x^2), I moved the center of the circle to 0,r and used y=sqrt(x*(2r-x)). This made it easier to use DSE to catch the edge between the y-axis and the r/sqrt(2) square. I have not read all the other solutions, so this may have been covered.
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
Quote:
Same idea, but less steps, and all stack (and I think a bit faster)
Very nice! It's a bit faster indeed:
540 R/S -> 918,168 (* 1.1 s )
5000 R/S -> 78,559,640 (* 8.2 s )
(*) with a chronometer
Or, using the built-in TICKS command, 8 and 71 ticks, respectively, which multiplied by my unit's correction factor 1/9 gives 0.9 s and 7.9 s. Mine takes 8 and 75 ticks, respectively. A built-in quartz would be handy.
001 LBL A
002 TICKS
003 STO 00
004 x<>y
005 XEQ C (your program)
006 TICKS
007 RCL- 00
008 x<>y
009 RTN
Quote:
I have not read all the other solutions,
Same here, except for Werner's solution, from which I borrowed the idea of checking r*(1 - 1/sqrt(2)) columns instead of r/sqrt(2). I wrote my first draft as soon as I saw Gene's post and stuck to my first idea. I've computed the dark pixels too, so I never needed the CEIL function.
Gerson.
Edited: 29 Sept 2011, 9:30 p.m.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: Or, using the built-in TICKS command, 8 and 71 ticks, respectively, which multiplied by my unit's correction factor 1/9 gives 0.9 s and 7.9 s
On a crystal unit, it would be 7.1 s.
TICKS remains constant regardless of the correction factor.
The time, however, doesn't unless a crystal is installed.
- Pauli
▼
Posts: 1,619
Threads: 147
Joined: May 2006
I tried to create a BASE 10 version, but got some errors, perhaps it is the SQRT bug you mentioned. Perhaps you can give it a try. It should run in about 1s.
Edited: 29 Sept 2011, 9:44 p.m.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
I've not reflashed my 34S for ages so it still has the sqrt bug.
The bug, when it occurs, gives a result 1 too high and only when carry is set. So a sequence that checks carry and if it is set squares the number and compares against the original...
Or reflash with the latest from the sourceforge site and get a working integer SQRT.
- Pauli
Posts: 3,283
Threads: 104
Joined: Jul 2005
Quote:
TICKS remains constant regardless of the correction factor. The time, however, doesn't unless a crystal is installed.
RTFS (Read The Fine Sources): In a non crystal equipped calculator, the PLL runs at a higher multiplication factor and the interrupts are scaled up slightly to compensate for a slower R/C clock:
if ( Xtal ) {
/*
* Schedule the next one so that there are 50 calls in 5 seconds
* We need to skip 3 ticks in 128 cycles
*/
if ( Heartbeats == 40 || Heartbeats == 81 || Heartbeats == 122 ) {
UserHeartbeatCountDown = 6;
}
else {
UserHeartbeatCountDown = 5;
}
}
else {
/*
* Without a crystal a less accurate timing is good enough
*/
UserHeartbeatCountDown = Heartbeats & 1 ? 4 : 5;
}
The code is executed from the LCD refresh interrupt occurring every 640 slow clock cycles (roughly 50 Hz). TICKS are counted when the UserHeartbeatCountDown value reaches zero. In a R/C environment, the frequency is assumed to be about 15% slower.
Why all this? The slow clock outputs a 32KHz nominal clock but the rate is typically 15 to 20% less then that without the crystal. The PLL multiplies this value to derive the processor clock. I use higher values for the PLL when no crystal is installed. The periodic interrupt is not controlled by the PLL but directly by the 32KHz oscillator through the LCD controller and needs a separate correction algorithm.
In short: The TICKS results should be similar with or without a crystal but they typically don't exactly match. Also the resulting execution speed should be roughly the same between devices with or without the crystal.
Posts: 1,619
Threads: 147
Joined: May 2006
Quote:
... dark pixels ...
Sounds ominous. :-) Kinda like dark matter--you cannot see it, but you can measure it, or so it is argued.
Posts: 2,761
Threads: 100
Joined: Jul 2005
This is my third (and last) attempt. More steps, one register and slightly slower than yours.
Center of the circle at (0,r) and y=sqrt[(r+(r+x))*(r-(r-x))], with no simplication.
001 LBL C
002 FILL
003 2
004 SQRT
005 /
006 CEIL
007 STO 00
008 -
009 0
010 Rv
011 Rv
012 RCL- T
013 RCL+ Y
014 RCL* T
015 SQRT
016 CEIL
017 STO+ Z
018 DROP
019 ENTER
020 DSE T
021 BACK 09
022 RCL 00
023 x^2
024 R^
025 STO+ X
026 +
027 4
028 *
029 RTN
1 XEQ C -> 4 ( - )
5 R/S -> 88 ( - )
540 R/S -> 918,168 ( 1.1 s *)
5000 R/S -> 78,559,640 ( 8.3 s *)
(*) Or, using TICKS, 0.9 s and 8.0 seconds (8 and 72 TICKS respectively)
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
Quote:
my third (and last) attempt
Next to last, I mean :-)
Nothing but the stack. Well, it looks like Egan's, only somewhat less efficient. Currently I cannot experiment with CLx (which perhaps could be useful inside the loop) because it does not disable stack lift (outdated firmware here).
001 LBL C
002 FILL
003 2
004 SQRT
005 /
006 CEIL
007 -
008 RCL L
009 x^2
010 Rv
011 Rv
012 RCL- T
013 RCL+ Y
014 RCL* T
015 SQRT
016 CEIL
017 STO+ X
018 STO+ Z
019 DROP
020 ENTER
021 DSE T
022 BACK 10
023 RCL Z
024 4
025 *
026 RTN
8 and 73 TICKS for r=540 and r=5000, respectively.
Thanks, Egan, for the free RPN programming lesson!
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
Quote:
Next to last, I mean :-)
Antepenultimate attempt, that was!
A little trick to save a sum inside the loop (steps 10, 11, then final multiplication by 8 instead of 4). It's possible this has already been used, however.
001 LBL C
002 FILL
003 2
004 SQRT
005 /
006 CEIL
007 -
008 RCL L
009 x^2
010 2
011 /
012 Rv
013 Rv
014 RCL- T
015 RCL+ Y
016 RCL* T
017 SQRT
018 CEIL
019 STO+ Z
020 DROP
021 ENTER
022 DSE T
023 BACK 09
024 8
025 RCL* T
026 RTN
Now 8 and 72 TICKS for r=540 and r=5000, respectively.
▼
Posts: 275
Threads: 41
Joined: Mar 2010
Stack only but for a real HP made by HP (software too) ;)
00 {48 BYTES}
01>LBL'Y
02 RCL ST X
03 RCL ST X
04 2
05 SQRT
06 /
07 -
08 IP
X=0?
SF 00
09 ENTER
10 X^2
11 R^
12 X^2
13 +
14 2
15 /
FS?C 00
GTO 01
16>LBL 00
17 RCL ST Z
18 RCL+ ST T
19 RCL- ST Z
20 RCL* ST Z
21 SQRT
22 RCL- ST T
23 IP
24 +
25 DSE ST Y
26 GTO 00
LBL 01
27 8
28 *
29 END
non working for R=1,2,3 correction in steps not numbered (add some step, but don't slow the inner loop)
The trick was not to use CEIL but INT (see other posts, counting 'black' dots)
Edited: 4 Oct 2011, 2:58 a.m.
▼
Posts: 163
Threads: 7
Joined: Jul 2007
some remarks:
1.replace
RCL ST X
RCL ST X
by
ENTER
STO ST Z
saving a byte
2.you need a CF 00 instruction at the beginning, of course, bringing the byte count to 10 to cater for the exceptions.
3.replace
RCL ST Z
RCL+ ST T
with (shorter by one byte):
RCL ST Z
STO+ ST X
Here's my 41C version, with only 8 bytes overhead (but a bit more stack handling as the 41 does not have register recall arithmetic)
*LBL"PIXELS"
ENTER
STO Z r r r
2
SQRT
/
-
INT x r r r
ENTER
X^2
RUP
X^2
+
2
/ s x r r
X<>Y x s r r
X=0?
GTO 00
STO T
RDN
L X Y Z T
*LBL 02 s r x
RCL Y r s r x
ST+ X 2r s r x
RUP x 2r s r
ST- Y x 2r-x s r
ST* Y x x(2r-x) s r
X<>Y x(2r-x) x s r
SQRT
RUP
ST- Y r -n x s
X<>Y
INT [-n] r x s
RUP
+
DSE Z
GTO 02
0
*LBL 00
+
8
*
END
Cheers, Werner
▼
Posts: 275
Threads: 41
Joined: Mar 2010
Thanks for the optimization, but no need for CF 00, you start the program in a default state (see the puzzle post) so it's useless ;)
Olivier
Posts: 2,761
Threads: 100
Joined: Jul 2005
RPL stack only (HP-50g):
%%HP: T(3)A(D)F(,);
\<< DUP DUP 2, \v/ / CEIL - DUP2 - SQ 2, / UNROT
WHILE DUP
REPEAT DUP2 - PICK3 SWAP + LASTARG - * \v/ CEIL 4, ROLL + UNROT 1, -
END DROP2 8, *
\>>
22.76 seconds, for r=5000 (timed with TEVAL)
It uses the same formula in my last wp34s program above, probably not the best one in this case, both speedwise and sizewise.
Quote:
non working for R=1,2,3 correction in steps not numbered (add some step, but don't slow the inner loop)
In order to fix the same problem in the first version of the RPL program above, I simply moved the test to the beginning of the loop, by replacing DO UNTIL with WHILE REPEAT.
"Endless fun", as Werner already said.
Gerson.
▼
Posts: 875
Threads: 37
Joined: Jul 2005
I have been fooling around with the challenge since Sept. 25. I think I have reached the point of diminishing returns, or at least should stop spending time on it. So I'll present some results, maybe that will stop the obsessive tweaking.
Using the inscribed square reduction depicted here (which I did develop on my own on Sept. 26 before looking at any of the solutions), the following listing is about the best I can come up with for wp34s. It returns 78,528,204 for a radius of 4999 in about 8.20 seconds.
001 LBL B
002 RCL Z Recall stack Z, since radius is there per rules
003 ENTER Get another copy of the radius on the stack
004 STO 08 Store radius for future use
005 STO + 08 double the stored value of the radius for future use
006 x^2 square the radius
007 2 enter 2 to compute area of square inscribed in a quarter circle
008 / divide 2 into radius squared to determine area of square inscribed in quarter circle
009 SQRT take square root to determine side length of square inscribed in quarter circle
010 CEIL go up to next whole pixel
011 STO 06 store side length in register that will accumulate area
012 STO x 06 multiply side length by stored side length to compute area of inscribed square in whole pixels
013 - subtract size of square from radius to determine how many remaining columns must be summed
014 STO I store in register I for looping to compute height of remaining columns
015 RCL 8 Recall twice the radius
016 RCL - I Recall index and subtract from twice the radius
017 RCL x I Recall I, multiply times above to form 2xRadiusxIndex-Index^2
018 SQRT square root to compute square root of 2xRadiusxIndex-Index^2
019 CEIL compute height of row in full pixels if any part is touched by a circle of the given radius
020 STO + 06 add to summation of area
021 STO + 06 add again for area of upper wedge
022 DSE I decrement index, skip when less than zero
023 BACK 8 loop back if more columns
024 4 enter 4
025 RCL x 06 multiply times 4 to compute the area of the full circle in whole
pixels if any part of a pixel is within the circle of the given radius
026 STOP done
Summing the area in the stack appears to be a popular option, so I created the listing below. It is one step longer, and also seems to run ever-so-slightly slower, returning the result for 4999 in about 8.35 seconds.
001 LBL B
002 RCL Z Recall stack Z, since radius is there per rules
003 ENTER Get another copy of the radius on the stack
004 STO 08 Store radius for future use
005 STO + 08 double the stored value of the radius for future use
006 x^2 square the radius
007 2 enter 2 to compute area of inscribed square
008 / divide 2 into radius squared to determine area of square inscribed in quarter circle
009 SQRT take square root to determine side length of square inscribed in quarter circle
010 CEIL go up to next whole pixel
011 x^2 square side length to compute area of inscribed square in whole pixels
012 x<>y swap to get radius
013 RCL - L subtract size of square from radius to determine how many remaining columns must be summed
014 STO I store in register I for looping to compute height of remaining columns
015 x<>y swap to get area of square to stack X for area sum start
016 RCL 8 Recall twice the radius
017 RCL - I Recall index and subtract from twice the radius, forms 2xRadius-Index
018 RCL x I Recall I, multiply times above to form 2xRadiusxIndex-Index^2
019 SQRT square root to compute square root of 2xRadiusxIndex-Index^2
020 CEIL compute height of row in full pixels if any part is touched by a circle of the given radius
021 + add to summation of column heights
022 RCL + L add again for area of upper wedge
023 DSE I decrement index, skip when less than zero
024 BACK 08 loop back if more columns
025 4 enter 4
026 x multiply times 4 to compute the area of the full circle in whole
pixels if any part of a pixel is within the circle of the given radius
027 STOP done
Observations - I don't claim the above to be the best possible, just likely the best I can do with reasonable effort.
I have of course also worked on versions for the 15c LE, the latest of which returns the answer for 4999 in about 19 seconds. I may have to keep working on that one...
...
Edited: 6 Oct 2011, 8:42 a.m. after one or more responses were posted
▼
Posts: 1,619
Threads: 147
Joined: May 2006
I posted this a while back, but it has be archived:
001 RCL Z
002 FILL
003 STO+ Z
004 2
005 SQRT
006 /
007 CEIL
008 STO Z
009 STOx Z
010 -
011 RCL T
012 RCL- Y
013 RCLx Y
014 SQRT
015 CEIL
016 STO+ Z
017 STO+ Z
018 DROP
019 DSE X
020 BACK 09
021 4
022 RCLx Z
Gerson has been building on it and timing it, but I think it runs in less than 8 s for r=5000. To make the programing smaller and to also deal with small r I moved the circle origin to r,0 and then used y=sqrt(x*(2r-x)). That with a bit of gym stacktics help reduce the number of instructions and the use of x^2. I would like to get it down to 1 sec with the use of integer mode, however my firmware still has the isqrt bug and I cannot flash (tried 3 different machines and 3 different OSes) until I get a machine with a real serial port.
Edited: 7 Oct 2011, 10:12 a.m.: Changed 0,r to r,0 in description
Edited: 7 Oct 2011, 10:12 a.m. after one or more responses were posted
▼
Posts: 875
Threads: 37
Joined: Jul 2005
Quote:
I posted this a while back, but it has be archived:
I recall seeing it, but did not study it closely. Not much fun to use somebody else's idea(s) to improve your own program, after all.
Quote:
Gerson has been building on it and timing it, but I think it runs in less than 8 s for r=5000
I get about 8.2 seconds, which is virtually the same as my versions. Not too surprising, since even though your method is more elegant, it still requires the same number of loops.
Quote:
I moved the circle origin to 0,r and then used y=sqrt(x*(2r-x)).
Are you sure that you did not move the origin to r,0? Either way, a clever approach.
Quote:
I would like to get it down to 1 sec with the use of integer mode,
I am running v2.2 1674. In BASE10 integer mode, when I take the square root of a non-perfect square, it returns the square root of the largest perfect square less than the original argument. Is this the behavior you are after? I tried a kludgy modification to your code to use this, and it returns 78,528,204 for a radius of 4999 in about 1.9 seconds. Unfortunately, it also returns the wrong answers for radii that produce perfect squares at certain points in the process, so it needs work. But it is fast!
...
Edited: 6 Oct 2011, 8:55 a.m. after one or more responses were posted
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: Unfortunately, it also returns the wrong answers for radii that produce perfect squares at certain points in the process, so it needs work. But it is fast!
Try checking the carry flag (C). This is cleared if the square root was of a perfect square and set otherwise -- a very handy feature of integer mode.
- Pauli
▼
Posts: 875
Threads: 37
Joined: Jul 2005
I figured there must be a feature to make this work. I'll give it a try.
▼
Posts: 1,619
Threads: 147
Joined: May 2006
Change:
SQRT
CEIL
to:
SQRT
FS? C
INC X
▼
Posts: 3,283
Threads: 104
Joined: Jul 2005
I have a useful modification in mind: CEIL in integer mode should add the carry and clear it thereafter. Pauli, any ideas?
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
An interesting idea. CEIL is already defined in integer mode but doesn't do anything very interesting.
I'm not sure CEIL is the right command for this. A pair of add carry and subtract carry instructions would be more general and meaningful I think.
We'd really do best by having add with carry and subtract with borrow instructions.
- Pauli
▼
Posts: 3,283
Threads: 104
Joined: Jul 2005
My idea was to make CEIL behaving similarly in DECM and integer modes. If all functions that would normally return a non integer result (such as 1 ENTER 2 /) set the carry, CEIL would be working the same in both environments.
ADC and SBB seem natural for integer mode. I will have to learn ARM assembly to port this stuff to native ARM code.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: My idea was to make CEIL behaving similarly in DECM and integer modes. If all functions that would normally return a non integer result (such as 1 ENTER 2 /) set the carry, CEIL would be working the same in both environments.
I believe most operations that can return a fractional result do set carry appropriately. If any don't (and the 16C did), please let me know.
The problem here is that there are plenty of other ways to set it. Addition and subtraction set it for overflow & borrow and shifts and rotates set it as well.
As things stand CEIL does work the same in both integer and floating point modes -- it rounds up to the nearest integer. In integer mode this means the number doesn't change which is correct.
Quote: ADC and SBB seem natural for integer mode. I will have to learn ARM assembly to port this stuff to native ARM code.
Do this and we'll save more than a bit of space in the integer support -- simulating the CPU flags in C is a huge consumer of code space. My ARM assembly is passable but not great.
- Pauli
Edited: 6 Oct 2011, 3:47 a.m.
Posts: 875
Threads: 37
Joined: Jul 2005
Yes, that is what I did for the second use of SQRT. For the first, I think you need to change:
2
SQRT
/
CEIL
to:
RCLx X (or x^2)
2
/
SQRT
INC X
Your program, with the above changes, now looks like this:
001 BASE 10
002 RCL Z
003 FILL
004 STO+ Z
005 RCLx X
006 2
007 /
008 SQRT
009 INC X
010 STO Z
011 STOx Z
012 -
013 RCL T
014 RCL- Y
015 RCLx Y
016 SQRT
017 FS? C
018 INC X
019 STO+ Z
020 STO+ Z
021 DROP
022 DSE X
023 BACK 10
024 4
025 RCLx Z
026 DECM
The above clocks in at right about 2 seconds for 4999.
Edited: 6 Oct 2011, 8:07 a.m.
▼
Posts: 1,619
Threads: 147
Joined: May 2006
If
019 STO+ Z
is changed to
019 SL 1
Does it help?
Edited: 6 Oct 2011, 11:28 a.m.
▼
Posts: 875
Threads: 37
Joined: Jul 2005
It is getting hard to say since I am timing things with my running watch, but it seems to get the time for 4999 down to about 1.9 seconds.
...
Posts: 3,229
Threads: 42
Joined: Jul 2006
Shifts are faster than addition or multiplication in integer mode. Dealing with carry and overflow consumes lots of cycles.
The 2 / sequence could be replaced with SR 01 which is shorter and faster.
- Pauli
Edited: 6 Oct 2011, 5:49 p.m.
▼
Posts: 875
Threads: 37
Joined: Jul 2005
Pauli,
Although I would not expect a measurable improvement in execution time in this particular instance (since the 2 / sequence is executed only once), shorter is better so I made the change. But, since the SR 01 function divides by 2 in-situ, it messes with Egan’s clever gymstacktics, so the program returns incorrect answers if only that change is made. Some other change(s) would have to be made, for example following the SR 01 with ENTER DROP. Of course that takes more steps than are saved by the SR 01 instruction. Probably possible to use SR 01 and save the step, but I will leave that as an exercise for the reader.
...
Posts: 1,619
Threads: 147
Joined: May 2006
Quote:
Are you sure that you did not move the origin to r,0? Either way, a clever approach.
Typo, yes, I moved it to r,0.
Posts: 2,761
Threads: 100
Joined: Jul 2005
You can use the built-in TICKS command instead of a chronometer. The latter adds about a 200 ms delay, no matter how fast you think you are (most likely your program takes 8 seconds or less for r=5000 as well). For times in seconds, you have to find the correction factor unless your wp34s has a built-in crystal. Or simply compare the number of TICKS returned by program A to determine which program is faster.
001 LBL C 001 LBL D
002 FILL 002 FILL
003 STO+ Z 003 STO+ Z
004 2 004 2
005 SQRT 005 SQRT
006 / 006 /
007 CEIL 007 CEIL
008 STO Z 008 STO Z
009 STOx Z 009 STO* Z
010 - 010 -
011 RCL T 011 .
012 RCL- Y 012 5
013 RCLx Y 013 STO* Z
014 SQRT 014 DROP
015 CEIL 015 RCL T
016 STO+ Z 016 RCL- Y
017 STO+ Z 017 RCL* Y
018 DROP 018 SQRT
019 DSE X 019 CEIL
020 BACK 09 020 STO+ Z
021 4 021 DROP
022 RCLx Z 022 DSE X
023 RTN 023 BACK 08
024 8
025 RCL* Z
026 RTN
027 LBL A
028 TICKS
029 STO 10
030 X<>Y
031 XEQ B
032 TICKS
033 RCL- 10
034 RTN
TICKS for 5000 XEQ A
C: 71 (7.9 seconds )
D: 70 (7.8 seconds )
C is Egan Ford's original program, except for the first step, which has been suppressed.
D is a slightly modified version of his program, no significant gain however.
I wasn't able to time your programs because my wp34s lacks DSL (it needs a firmware update).
Gerson.
▼
Posts: 3,229
Threads: 42
Joined: Jul 2006
With a crystal installed, taking the difference of two TICKS calls should be fairly accurate.
Without a crystal, the difference of two TICKS calls will still returns the same duration, it just won't necessarily match with real time so well.
- Pauli
▼
Posts: 3,283
Threads: 104
Joined: Jul 2005
Quote:
Without a crystal, the difference of two TICKS calls will still returns the same duration, it just won't necessarily match with real time so well.
The way it's implemented the two scenarios might differ. I tried to get TICKS right in either environment but the accuracy is dependent on the actual R/C frequency.
Edited: 6 Oct 2011, 2:20 a.m.
Posts: 1,545
Threads: 168
Joined: Jul 2005
These are all great, and I thank all of you for your interest in this problem.
I plan to write-up the winning entries from the conference as well as including the entries here. I will need a bit of time to key them in and time them with my sample input values for comparison to the conference results.
For what it is worth, this contest was open to all RPN machines.
I now have a solution running on the HP 67. :-)
It is a bit slow...
▼
Posts: 1,545
Threads: 168
Joined: Jul 2005
to a radius of 4999 in about 1.4 hours. :-)
78,528,204.
Edited: 27 Sept 2011, 12:28 p.m. after one or more responses were posted
▼
Posts: 275
Threads: 41
Joined: Mar 2010
To reduce time, this one can be faster ...
(I don't have my 97 at hand)
001 STOI *** I Index register
002 STO0
003 STO1 *** one register
004 STx1 *** one register
005 2
006 SQRT
007 /
008 INT
009 LSTX
010 FRC
011 X#0?
012 ENT^
013 X#0?
014 /
015 +
016 X^2
017 LSTX
018 RCLI *** I Index register
019 -
020 STOI *** I Index register
021 Rv *** R down
022 *LBL0
023 RCL1 *** one register
024 RCL0
025 RCLI *** I Index register
026 +
027 X^2
028 -
029 SQRT
030 INT
031 LSTX
032 FRC
033 X#0?
034 ENT^
035 X#0?
036 /
037 +
038 ENT^
039 +
040 +
041 ISZI
042 GTO0
043 4
044 x
on a 97 it should be around 50 minutes for 4999
Just timed it it is more like 30 minutes :)
(an hp41 is 12 minutes, so it's not too bad)
(but a 32SII is 3 minutes with the same program ...)
BUG on HP97: don't work for R < 3 : ISZI peculiarity : changes
change old 21 by
CF0
X=0?
SF0
Rv
F?0
GTO1
add before old 43
LBL1
it is not needed for HP32SII (use ISG instead of ISZ)
Edited: 28 Sept 2011, 4:17 a.m. after one or more responses were posted
▼
Posts: 275
Threads: 41
Joined: Mar 2010
Faster one ? (shorter loop, need timing)
001 ENT^
002 STO0
003 ST+0
004 2
005 SQRT
006 /
007 INT
008 LSTX
009 FRC
010 X#0?
011 ENT^
012 X#0?
013 /
014 +
015 X^2
016 X<>Y
017 LSTX
018 -
019 STOI *** I Index register
020 X=0?
021 SF2
022 CLX
023 0
024 F2?
025 GTO2
026 *LBL0
027 RCL0
028 RCLI *** I Index register
029 -
030 LSTX
031 x
032 SQRT
033 INT
034 LSTX
035 FRC
036 X=0?
037 GTO1
038 CLX
039 EEX
040 *LBL1
041 +
042 +
043 DSZI
044 GTO0
045 ENT^
046 +
047 *LBL2
048 +
049 4
050 x
Edited: 28 Sept 2011, 11:51 a.m.
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: I now have a solution running on the HP 67. :-)
It is a bit slow...
Anyone going to try on the HP 25 or 10C :-)
I might see if my Elektronika MK-61 is still functioning and up to the task. It will win the slowness stakes I'm sure.
- Pauli
▼
Posts: 2,761
Threads: 100
Joined: Jul 2005
Quote:
Anyone going to try on the HP 25 or 10C :-)
What about the HP-33C? Under half an hour, even less if the loop is optimized to use the stack whenever possible.
Gerson.
-----------------------------------------------
Keystrokes |Display |
| |
[f]CLEAR[PRGM] |00- |
[ENTER] |01- 31 |
[STO]0 |02- 23 0 |
[ENTER] |03- 31 |
[ENTER] |04- 31 |
[x] |05- 61 |
[STO]1 |06- 23 1 |
[CLx] |07- 34 |
[STO]2 |08- 23 2 |
[RDN] |09- 22 |
2 |10- 2 |
[f][SQRT] |11- 14 0 |
[/] |12- 71 |
[-] |13- 41 |
[g][INTG] |14- 15 32 |
[-] |15- 41 |
[STO]4 |16- 23 4 |
[x<>y] |17- 21 |
[STO]3 |18- 23 3 |
[f][x<=y] |19- 14 41 |
[GTO]35 |20- 13 35 |
1 |21- 1 |
[STO][-]3 |22- 23 41 3 |
[RCL]0 |23- 24 0 |
[RCL]1 |24- 24 1 |
[RCL]3 |25- 24 3 |
[g][x^2] |26- 15 0 |
[-] |27- 41 |
[f][SQRT] |28- 14 0 |
[-] |29- 41 |
[g][INTG] |30- 15 32 |
[STO][-]2 |31- 23 41 2 |
[RCL]4 |32- 24 4 |
[RCL]3 |33- 24 3 |
[GTO]19 |34- 13 19 |
[RCL]2 |35- 24 2 |
[ENTER] |36- 31 |
[+] |37- 51 |
[RCL]1 |38- 24 1 |
[+] |39- 51 |
[RCL]4 |40- 24 4 |
[RCL]0 |41- 24 0 |
[-] |42- 41 |
[g][x^2] |43- 15 0 |
[+] |44- 51 |
4 |45- 4 |
[x] |46- 61 |
[GTO]00 |47- 13 00 |
1 R/S -> 4 ( 2.7 s )
5 R/S -> 88 ( 3.8 s )
540 R/S -> 918,168 ( 3 min 14.1 s )
4999 R/S -> 78,528,204 ( 29 min 36.7 s )
-----------------------------------------------
Posts: 1,619
Threads: 147
Joined: May 2006
My solutions:
15C:
001 - 33 Rv
002 - 33 Rv
003 - 42, 7, 0 FIX 0
004 - 44 1 STO 1
005 - 42, 5, 1 DSE 1
006 - 22 2 GTO 2
007 - 22 3 GTO 3
008 - 42,21, 2 LBL 2
009 - 44 0 STO 0
010 - 43 11 x^2
011 - 44 2 STO 2
012 - 42,21, 1 LBL 1
013 - 45 2 RCL 2
014 - 45 1 RCL 1
015 - 43 11 x^2
016 - 30 -
017 - 11 SQRT
018 - 36 ENTER
019 - 43 44 INT
020 - 44,40, 0 STO+ 0
021 - 43,30, 6 TEST 6
022 - 42, 6, 0 ISG 0
023 - 43 7 DEG
024 - 42, 5, 1 DSE 1
025 - 22 1 GTO 1
026 - 45 0 RCL 0
027 - 42,21, 3 LBL 3
028 - 4 4
029 - 20 x
34S (as submitted at HHC2011. NOTE: I was in a hurry and directly ported over my 15C program with the exception of using CEIL.):
001 Rv
002 Rv
003 STO 01
004 DSE 01
005 GTO 02
006 GTO 03
007 LBL 02
008 STO 00
009 X^2
010 STO 02
011 LBL 01
012 RCL 02
013 RCL 01
014 X^2
015 -
016 SQRT
017 CEIL
018 STO+00
019 DSE 01
020 GTO 01
021 RCL 00
022 LBL 03
023 4
024 x
34S do-over after talking to Marcus, Ari, and reading the manual a bit more:
001 RCL Z
002 STO 00
003 STO 01
004 x^2
005 STO 02
006 SKIP 07
007 RCL 02
008 RCL 01
009 x^2
010 -
011 SQRT
012 CEIL
013 STO+ 00
014 DSE 01
015 BACK 08
016 RCL 00
017 4
018 x
The 34S is a remarkably fun machine. The SKIP and BACK functions rock.
Edited: 28 Sept 2011, 7:09 p.m.
▼
Posts: 875
Threads: 37
Joined: Jul 2005
Egan's 34s program was the winner, in the "pure RPN" category. I don't want to steal Gene's thunder, I'll let him give full results and explain the category issue. (I don't think he has posted the results yet, but I could have missed it. If so, ignore this message.)
|