HHC 2011 Programming Puzzle SPOILERS

09262011, 02:14 AM
Since the conference is well and truly over, I thought I'd post my solutions and see what other people have come up with and what other improvements are possible. Let the discussions and optimisations begin :)  Pauli
09262011, 02:15 AM
For the WP 34S.
A basic routine that runs over each value of X in the positive quadrant and
001 [cmplx]DROP Ignore centre
This program takes 6.5 seconds for a radius of 999. For a radius
 Pauli Edited: 26 Sept 2011, 2:17 a.m.
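The listing itself did not survive here, so the following is only a minimal Python sketch of the approach described: run over each x in the positive quadrant, solve the circle equation for the column height, round up, and multiply the quadrant total by four. The function name `lit_pixels` is made up for illustration, and the counting rule is assumed from the formula quoted later in the thread.

```python
import math

def lit_pixels(r: int) -> int:
    """Quadrant sum of ceil(sqrt(r^2 - x^2)) for x = 0 .. r-1, times four."""
    quadrant = 0
    for x in range(r):
        # Height of column x, rounded up to whole pixels.
        quadrant += math.ceil(math.sqrt(r * r - x * x))
    return 4 * quadrant

print(lit_pixels(5))  # 88, the value quoted further down for r = 5
```

Floating point is good enough at the radii discussed in this thread, which is also why the calculator versions get away with a plain SQRT.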
09262011, 02:15 AM
For the WP 34S. Again running over the values of x and solving the circle equation, but doing so in integer mode results in a faster program.
001 [cmplx]DROP Ignore centre
Timings in this case are 7.6 seconds for a radius of 5,000 and 1.4
One problem here. The integer square root code is incorrectly
It should be possible to rectify the problem with the square root  Pauli
Edited: 26 Sept 2011, 2:29 a.m.
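As a rough illustration of the integer-mode idea (not the 34S listing, which is missing here), the same quadrant sum can be written with an integer square root only. The helper name `ceil_isqrt` is hypothetical; it relies on the fact that for n >= 1 the ceiling of sqrt(n) equals the floor of sqrt(n - 1) plus one.

```python
import math

def ceil_isqrt(n: int) -> int:
    """Ceiling of sqrt(n) for n >= 1, using integer arithmetic only."""
    return math.isqrt(n - 1) + 1

def lit_pixels_int(r: int) -> int:
    # r*r - x*x is at least 1 for every x in 0 .. r-1, so ceil_isqrt is safe.
    return 4 * sum(ceil_isqrt(r * r - x * x) for x in range(r))

print(lit_pixels_int(5))  # 88
```

Because no floating point is involved, this form cannot be thrown off by rounding at a perfect square, which is exactly where a faulty integer square root would show up.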
09262011, 02:15 AM
For the WP 34S.
Rather than solving the circle equation each time, an incremental
001 [cmplx]DROP Ignore centre
Running time for a radius of 999 is 0.9 seconds, for a radius of
 Pauli Edited: 26 Sept 2011, 2:18 a.m.
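The incremental listing is also missing from this copy. The sketch below is therefore not Pauli's program (which, as noted later, gives slightly different results) but one possible incremental variant that stays exact: it tracks the column height y and the quantity d = x^2 + y^2 - r^2 using only additions and comparisons. All names are illustrative.

```python
def lit_pixels_incremental(r: int) -> int:
    total = 0
    y = r   # height of column x = 0
    d = 0   # invariant: d == x*x + y*y - r*r and d >= 0
    for x in range(r):
        total += y
        d += 2 * x + 1                      # advance from x to x + 1
        # Lower y while (x + 1)^2 + (y - 1)^2 still reaches r^2.
        while y > 0 and d - (2 * y - 1) >= 0:
            d -= 2 * y - 1
            y -= 1
    return 4 * total

print(lit_pixels_incremental(5))  # 88
```

The inner loop runs at most r times in total, so the whole count needs on the order of 2r cheap steps and no square roots.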
09262011, 02:17 AM
Some other semirandom comments.
 Pauli
09262011, 03:38 AM
Hi, I first came up with an algorithm very close to your code. Here is a version for the HP41C:
01 LBL"HHC2011 13  25 +
Some details could perhaps be optimized. For RPL users, here is a version for the HP28S calculator, which is basic enough to be used without modification on any RPL system.
« > r x y @ input three arguments as specified
Only one quadrant of the circle is investigated; the total number of pixels is 4 times the quadrant count. I would suggest that a version using only a half-quadrant may speed up the process and is consistent with the symmetry. The total number of pixels will be 8 times one half-quadrant count.
@ version 2.0
Here is a proposed code for the HP15C and any other great classic HP calculators.
#
Comments and suggestions are welcome.
Edited: 26 Sept 2011, 6:28 a.m. after one or more responses were posted
09262011, 03:49 AM
Interesting. I did do just an octant for the Bresenham approach. This algorithm generates just an octant of a circle and relies on symmetry to get the rest of the points on the circumference. Rather than keeping track of the diagonal, I accumulated the total area under the octant, doubled this and removed the duplicate area which is a perfect square. This allowed everything to be kept in integer mode.
09262011, 08:11 AM
Hi C.Ret, Quote:At least it can be done shorter and faster. ;)
Here's basically the same algorithm with some improvements. R was moved under the root and a more elegant CEIL emulation was used. For positive arguments, ceil(x) equals int(x) + sign(frc(x)). Due to the special SIGN implementation in the 41 series an additional test for nonzero x was required. All this makes the program shorter and faster. Add a label and RTN if you prefer. ;)
01 RDN
For r > 1 steps 06, 07 and 24 can be omitted. The next improvement is an update of the user instructions, telling the user not to enter the center coordinates but just the radius. This would save two more steps. ;) Dieter
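The CEIL emulation mentioned above can be checked quickly in Python. This is only a sketch with made-up names; `sign` here is the conventional signum that returns 0 for 0, which is presumably where the 41 series differs and why the extra nonzero test was needed there.

```python
import math

def sign(v: float) -> int:
    """Conventional signum: -1, 0 or +1."""
    return (v > 0) - (v < 0)

def ceil_via_sign(x: float) -> int:
    """ceil(x) for x >= 0 as INT(x) + SIGN(FRC(x))."""
    ip = math.trunc(x)       # INT
    frac = x - ip            # FRC
    return ip + sign(frac)   # adds 1 only if there is a fractional part

for value in (4.0, 4.25, 0.0, 24 ** 0.5):
    print(value, ceil_via_sign(value), math.ceil(value))
```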
09262011, 08:22 AM
Here's mine. I hope there are no other HP12C programs so I can win in the fastest HP12C program category :) Cheers, Gerson.
09262011, 08:23 AM
I haven't written a program for the 41 since 1987 when I got my 28C, but I can't pass up a good contest. I dusted off my calculator and manual and here's what I came up with:
01 LBL"PC Program Contest (Surely there's a better way to do the CEILING function. ???) This program would have been even more interesting if we did not assume that the circle fit on the display panel. Also, about the idea of only counting pixels that are at least 50% covered, is there an easy way to do that without integrals? Wes L
09262011, 08:35 AM
Quote:I've found this approximation empirically from my results: n ~ r*(pi*r + 4) Gerson.
Edited: 26 Sept 2011, 8:37 a.m.
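To get a feel for how close the empirical approximation n ~ r*(pi*r + 4) is, it can be compared with an exact count. The following is a small sketch; `exact_count` and `gerson_estimate` are names chosen here, and the exact count assumes the quadrant-sum formula used elsewhere in this thread.

```python
import math

def exact_count(r: int) -> int:
    total = 0
    for x in range(r):
        n = r * r - x * x
        s = math.isqrt(n)
        total += s if s * s == n else s + 1   # ceil(sqrt(n))
    return 4 * total

def gerson_estimate(r: int) -> float:
    return r * (math.pi * r + 4)

for r in (5, 50, 540):
    exact = exact_count(r)
    est = gerson_estimate(r)
    print(r, exact, round(est), f"{(est - exact) / exact:+.3%}")
```

The relative error shrinks as r grows, since the correction term is only of order r while the count itself grows with r squared.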
09262011, 09:45 AM
Hello
Thank you for the much more elegant CEIL method: n INT LASTx FRAC SIGN + (this will run on any classic, except for the special case of SIGN on the HP41). I was trying to adapt my code for the HP15C, but I was unable to locate any SIGN function. Could the sequence FIX 0 n RND be used instead? I will try it later and save steps in the HP15C listing, as you demonstrated to me. Considering only a quadrant was a good idea. Considering only an octant would have been a good one too, if I had not done things the worst way. By scanning from 0 to r/sqrt(2), I save about 30% of the running time. But I only now realize that it is possible to save 70% of the loops by scanning k from (r-1) down to r/sqrt(2), using symmetry to count this small octant part twice and adding the "common" square. A picture is better than words:
« > r x y @ input three arguments
I hope this last approach will be the most efficient, especially for large circles, as far as the number of loops is concerned. Now 5000,0,0 HHC2011c returns its result in less than 1'25" on the HP28S.
Edited: 27 Sept 2011, 12:17 p.m. after one or more responses were posted
09262011, 10:34 AM
Quote:
At least there's another option (as well as that using SIGN mentioned in other posts). ;) The '41 features a MOD function, so a true CEIL function can be set up like this:
ENTER
That 1 may be stored somewhere beforehand to increase execution speed.
Once again, tricks like these are not required on the 35s since it features an INTG function (equivalent to FLOOR elsewhere):
+/-
In our case, instead of adding CEIL(x) we can simply subtract INTG(-x):
...
Dieter
09262011, 10:58 AM
Quote:
Or also
ENTER
Quote:
Well, the 41 was the first HP (pocket) calculator I know of that featured a SIGN function. So none of the classics (and many later models as well) offered that function and this approach cannot be used there.
Quote:
There you are. ;)
Quote:
No, this won't work. It simply rounds the argument to the next higher or lower (!) integer. CEIL always rounds up. But since sign(x) = x / abs(x) for x<>0, it could be accomplished this way:
ENTER
This works for x >= 0.
And finally, a more tricky version:
ENTER
This also works for negative x. Ah, those were the good old days when calculators were so slow that using transcendental functions didn't matter much. ;) Dieter
09262011, 12:30 PM
40 bytes for HP 42S (or HP 41) and 2 registers, rounding OK
01 * LBL "Z"
Steps with *** can be removed for r > 1
Fewer registers, 42 bytes, HP 42 or 41, rounding OK
01 * LBL "Z"
Stack only !!! Only for HP 42, 44 bytes (if we count a register as 7 bytes it's a winner; now the rounding is correct :) )
01 * LBL "Z"
( FIX 00 RND doesn't work :( I need to work a bit harder. Rounding for the first 2 programs is OK now :) And for the third also :) Olivier
Edited: 26 Sept 2011, 3:21 p.m. after one or more responses were posted
09262011, 02:21 PM
The first two versions will not work correctly. The CEIL function here was implemented by adding 0.5 and rounding to the next integer. This is basically the same as a simple 1 + INT. That's why the routine causes an error if the argument of the CEIL function is an integer. For instance, for r = 5 there are two points where the sqrt is exactly 3 and 4, respectively. At those points adding 0.5 will cause the value to get rounded up to 4 and 5 (instead of leaving it at 3 and 4). So the returned result is not 88 (correct), but 96 instead. Dieter
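The r = 5 case described above can be reproduced with a short sketch. The loop shape is an assumption made here to match the quoted numbers (column x = 0 contributes r directly, the loop covers x = 1 .. r-1), and all function names are hypothetical.

```python
import math

def ceil_sqrt(n: int) -> int:
    """True ceiling of sqrt(n)."""
    s = math.isqrt(n)
    return s if s * s == n else s + 1

def one_plus_int_sqrt(n: int) -> int:
    """The faulty emulation: 1 + INT(sqrt(n)), wrong when n is a perfect square."""
    return math.isqrt(n) + 1

def count(r: int, ceil_fn) -> int:
    # Assumed structure: r for column 0, then the loop over x = 1 .. r-1.
    return 4 * (r + sum(ceil_fn(r * r - x * x) for x in range(1, r)))

print(count(5, ceil_sqrt))          # 88
print(count(5, one_plus_int_sqrt))  # 96
```

The two extra quadrant pixels come from x = 3 and x = 4, where the root is exactly 4 and 3.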
09262011, 03:21 PM
You were right, I saw the bug just after posting. The correction is now done for the three programs :) Olivier
09262011, 03:37 PM
Paul, I was aware of this method. It was used in the Microsoft BASIC in my old MSX 8-bit computer. Great to see an implementation, even if the results are slightly different. Gerson.
From the MSX Red Book:
Address... 5BBDH
This is the circle mainloop. Because of the high degree of symmetry in a circle it is only necessary to compute the coordinates of the arc from zero to forty-five degrees. The other seven segments are produced by rotation and reflection of these points. The parametric equation for a unit circle, with T the angle from zero to PI/4, is:
X=COS(T)
Y=SIN(T)
Direct computation using this equation, or the corresponding functional form X=SQR(1-Y^2), is too slow, instead the first derivative is used:
dx/dy = -Y/X
Given that the starting position is known (X=RADIUS,Y=0), the X coordinate change for each unit Y coordinate change may be computed using the derivative. Furthermore, because graphics resolution is limited to one pixel, it is only necessary to know when the sum of the X coordinate changes reaches unity and then to decrement the X coordinate. Therefore:
Decrement X when (Y1/X)+(Y2/X)+(Y3/X)+... => 1
Therefore decrement when (Y1+Y2+Y3+...)/X => 1
Therefore decrement when Y1+Y2+Y3+... => X
All that is required to identify an X coordinate change is to totalize the Y coordinate values from each step until the X coordinate value is exceeded. The circle mainloop holds the X coordinate in register pair HL, the Y coordinate in register pair DE and the running total in CRCSUM. An equivalent BASIC program for a circle of arbitrary radius 160 pixels is:
10 SCREEN 2
20 X=160:Y=0:CRCSUM=0
30 PSET(X,191-Y)
40 CRCSUM=CRCSUM+Y :Y=Y+1
50 IF CRCSUM<X THEN 30
60 CRCSUM=CRCSUM-X:X=X-1
70 IF X>Y THEN 30
80 CIRCLE(0,191),155
90 GOTO 90
The coordinate pairs generated by the mainloop are those of a "virtual" circle, such tasks as axial reflection, elliptic squash and centre translation are handled at a lower level (5C06H).
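For readers without an MSX at hand, the BASIC mainloop quoted above translates almost line for line into Python. This is a sketch of that listing only (lines 20 to 70), returning the octant points instead of plotting them; the function name is invented here.

```python
def msx_circle_octant(radius: int) -> list[tuple[int, int]]:
    """Port of BASIC lines 20-70: points of the 0..45 degree arc."""
    x, y, crcsum = radius, 0, 0      # line 20
    points = []
    while True:
        points.append((x, y))        # line 30: PSET
        crcsum += y                  # line 40
        y += 1
        if crcsum < x:               # line 50: no X change yet
            continue
        crcsum -= x                  # line 60: decrement X
        x -= 1
        if x <= y:                   # line 70 loops only while X > Y
            break
    return points

print(msx_circle_octant(5))  # [(5, 0), (5, 1), (5, 2), (5, 3)]
```

Only additions, subtractions and comparisons are used, which is the point Gerson makes about algorithms from a time when speed really mattered.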
09262011, 05:03 PM
I also figured out a couple of approximations but they aren't good enough :( On the 34S using integrate results in a very fast and approximate answer:
001 [cmplx]DROP
An equivalent program on the 15C LE caused the integrator to run for ages; I gave up after quite a few minutes.
09262011, 05:13 PM
Quote: I'm sure the algorithm can be corrected to give the correct results. It is a matter of reworking the maths and changing the update part of the loop. I might do this if I get really keen, but I've already spent more time than I can afford on this interesting little challenge.
09262011, 05:46 PM
That was not a criticism. Actually, the way the circles are drawn has been simplified to make the challenge easier, as Walter has noticed. I guess both algorithms produce smoother circles, therefore no need to change them. The point is the existence of such algorithms, which were created in a time when speed really mattered. Gerson.
09262011, 06:16 PM
No criticism was taken :) Sticking to integer mode and avoiding expensive operations is definitely a win, although a lot less so than when these algorithms were first discovered.
09262011, 09:25 PM
These are all great, and I thank all of you for your interest in this problem. I plan to write up the winning entries from the conference as well as including the entries here. I will need a bit of time to key them in and time them with my sample input values for comparison to the conference results. For what it is worth, this contest was open to all RPN machines. I now have a solution running on the HP 67. :) It is a bit slow...
09262011, 11:12 PM
to a radius of 4999 in about 1.4 hours. :)
78,528,204. Edited: 27 Sept 2011, 12:28 p.m. after one or more responses were posted
09272011, 01:32 AM
Hi C.Ret, My solution for the HP15C also used the optimization that you posted, where it is only necessary to find the number of pixels in ~30% of the lines. The total number of pixels is then 4 * (s1 + 2 * s2) where s1 = (INT(r/sqrt(2)+1))^2 and s2 is the sum of the number of pixels on each line above the square on the main diagonal, as shown in your picture. Lines 29 to 34 are yet another variant on the CEIL function. For the radius=5000 case on the HP15C LE the program below computes the number of pixels in 16 seconds. Best regards, Eamonn.
001 LBL C
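The formula 4 * (s1 + 2 * s2) described above can be sketched in Python as follows. This is not the 15C listing; the names are illustrative, and the side of the square is computed as INT(r/sqrt(2)) + 1 using an integer square root of r^2 // 2.

```python
import math

def ceil_isqrt(n: int) -> int:
    s = math.isqrt(n)
    return s if s * s == n else s + 1

def lit_pixels_square(r: int) -> int:
    m = math.isqrt(r * r // 2) + 1   # INT(r/sqrt(2)) + 1, side of the lit square
    s1 = m * m                       # the fully lit square on the diagonal
    # Lines beyond the square; by symmetry they are counted twice.
    s2 = sum(ceil_isqrt(r * r - x * x) for x in range(m, r))
    return 4 * (s1 + 2 * s2)

print(lit_pixels_square(5))  # 88
```

Only the lines from m to r-1 are visited, roughly 30% of the radius, which matches the saving mentioned in the post.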
09272011, 02:48 AM
Here is an HP41 program with your idea (take R in X)
01 *LBL'Y *** step are for R=1 case (can be removed) for the rounding in 'old' classic you can use
INT as a plus it does not perturb the stack too much I use this on an HP97 :)
Olivier Edited: 27 Sept 2011, 2:51 a.m.
09272011, 03:35 AM
A faster one (shorter loop no R=1 special case, R still in X)
01 *LBL'Y (53 bytes)
Some timings 50 > 8024 in 8 sec HP42S one, better use of stack and recall arithmetic
01 *LBL"Y" sorry, no timing ...
Edited: 28 Sept 2011, 6:58 a.m. after one or more responses were posted
09272011, 07:42 AM
Quote: Anyone going to try on the HP 25 or 10C :)
09272011, 08:02 AM
The basic approach is similar to my suggestion for the '41, so both can be improved this way: Instead of starting the loop with x = r - 1 (which is done by the first DSE) let's begin with x = r. This will not change the result since the first loop will add sqrt(r^2 - r^2), i.e. a plain zero. But it can handle the case r = 1 without any additional code, thus saving four steps, while on the other hand one more loop has no perceivable effect on execution time. It even handles r = 0 correctly. ;)
Here are two versions of this algorithm. It's very basic and for sure not optimized for speed, but I like it for its compact code and readability. Especially the 35s version. ;)
HP41/42 HP35s
R is assumed to be entered in X. Add 2x RDN if you want to do it according to the original rules. Dieter
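The effect of starting the countdown at x = r instead of x = r - 1 is easy to see in a do-while style loop, which is how a DSE loop behaves. The sketch below is not Dieter's listing; it assumes column x = 0 is accounted for by adding r outside the loop, and the parameter `start_at_r` is invented purely for the comparison.

```python
import math

def ceil_isqrt(n: int) -> int:
    s = math.isqrt(n)
    return s if s * s == n else s + 1

def lit_pixels_countdown(r: int, start_at_r: bool = True) -> int:
    quadrant = r                       # assumed: column x = 0 added separately
    x = r if start_at_r else r - 1
    while True:                        # body always runs once, like a DSE loop
        quadrant += ceil_isqrt(r * r - x * x)
        x -= 1
        if x <= 0:                     # DSE: decrement, leave the loop at zero
            break
    return 4 * quadrant

print(lit_pixels_countdown(1))                    # 4
print(lit_pixels_countdown(1, start_at_r=False))  # 8, the r = 1 problem
```

With the start at x = r, the first pass adds a plain zero and the cases r = 1 and r = 0 need no special handling.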
09272011, 11:41 AM
To reduce time, this one can be faster ...
001 STOI *** I Index register on a 97 it should be around 50 minutes for 4999
Just timed it it is more like 30 minutes :) (but a 32SII is 3 minutes with the same program ...) BUG on HP97: don't work for R < 3 : ISZI peculiarity : changes
change old 21 by it is not needed for HP32SII (use ISG instead of ISZ)
Edited: 28 Sept 2011, 4:17 a.m. after one or more responses were posted
09272011, 09:37 PM
Here's the second version of my 12C program. No attempt to optimize it for least numbered registers usage has been made. Actually, I should be doing my homework instead :)

09272011, 09:47 PM
Quote: What about the HP33C? Under half an hour, even less if the loop is optimized to use the stack whenever possible. Gerson.

09282011, 04:09 AM
A stack only version for hp42s
00 { 58Byte Prgm } start with a default state (as said) so flag 0 reset A faster one for HP41
01>LBL "W"
Edited: 28 Sept 2011, 6:08 a.m.
09282011, 05:24 AM
Faster one ? (shorter loop, need timing)
001 ENT^
Edited: 28 Sept 2011, 11:51 a.m.
09282011, 03:04 PM
I did it slightly differently, with about the same timings:
09282011, 03:25 PM
A better one for 42S (shorter loop)
01 >LBL "X"
09282011, 05:37 PM
Another RPL solution :
« DROP2 > r '4*Sum(n=0,r-1,CEIL(Sqrt(1-(n/r)^2)*r))' »
DROP2 is just to ignore x and y. Sum and Sqrt stand for the corresponding special symbols. Very simple, but it takes ~75 s for a radius of 5000 in approx mode.
Edited: 28 Sept 2011, 5:49 p.m.
09282011, 07:03 PM
My solutions: 15C:
001  33 Rv
34S (as submitted at HHC2011. NOTE: I was in a hurry and directly ported over my 15C program with the exception of using CEIL.):
001 Rv
34S do-over after talking to Marcus, Ari, and reading the manual a bit more:
001 RCL Z
The 34S is a remarkably fun machine. The SKIP and BACK functions rock. Edited: 28 Sept 2011, 7:09 p.m.
09282011, 09:41 PM
Egan's 34s program was the winner, in the "pure RPN" category. I don't want to steal Gene's thunder, I'll let him give full results and explain the category issue. (I don't think he has posted the results yet, but I could have missed it. If so, ignore this message.)
09292011, 12:29 AM
HP41CX, HP42S and wp34s versions

09292011, 02:01 AM
42S on iPhone 3.1.3 brings:
5000 > 78,559,640 > 0.5 sec
;) Edited: 29 Sept 2011, 2:03 a.m.
09292011, 07:38 AM
41C 37 bytes
Well Gene, that was a good one. Endless fun.
09292011, 04:21 PM
Very nice idea to count the non-lit pixels ... PS: Werner, you have a strange 41C; timing your program on my 41CX gives me 13.6 sec for 100, 1min10 for 540 and 10min52 for 5000 ?? Some variations for the HP41:
{54 BYTES} a bit faster : 540 in 1min02, 5000 in 9min26 For HP42, not timed but the inner loop is shorter :)
00 {47 BYTES} apply the same correction to get good result for R=1,2,3
Edited: 29 Sept 2011, 4:49 p.m.
09292011, 05:31 PM
Salut Gilles, A better solution :
« DROP2 > r
This version saves execution time for large radii r.
09292011, 08:03 PM
Quote:
Same idea, but fewer steps, and all stack (and I think a bit faster):
001 RCL Z
Instead of using y=sqrt(r^2-x^2), I moved the center of the circle to r,0 and used y=sqrt(x*(2r-x)). This made it easier to use DSE to catch the edge between the y-axis and the r/sqrt(2) square. I have not read all the other solutions, so this may have been covered.
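The shifted-centre form can be checked against the usual one with a few lines of Python. This is only a sketch of the identity, not the 34S program: substituting u = r - x turns r^2 - u^2 into x*(2r - x), so counting x from 1 to r covers the same columns. Names are illustrative.

```python
import math

def ceil_isqrt(n: int) -> int:
    s = math.isqrt(n)
    return s if s * s == n else s + 1

def lit_pixels_direct(r: int) -> int:
    return 4 * sum(ceil_isqrt(r * r - u * u) for u in range(r))

def lit_pixels_shifted(r: int) -> int:
    # Circle centred at (r, 0): column height is sqrt(x * (2r - x)), x = 1 .. r.
    return 4 * sum(ceil_isqrt(x * (2 * r - x)) for x in range(1, r + 1))

print(lit_pixels_shifted(5), lit_pixels_direct(5))  # 88 88
```

A counter running from r down to 1 is exactly what DSE provides, and the form needs one multiplication per column instead of two squarings.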
09292011, 09:26 PM
Quote: Very nice! It's a bit faster indeed:
540 R/S > 918,168 (* 1.1 s ) (*) with a chronometer Or, using the builtin TICKS command, 8 and 71 ticks, respectively, which multiplied by my unit's correction factor 1/9 gives 0.9 s and 7.9 s. Mine takes 8 and 75 ticks, respectively. A builtin quartz would be handy.
001 LBL A
Quote: Same here, except for Werner's solution, from which I borrowed the idea of checking r*(1  1/sqrt(2)) columns instead of r/sqrt(2). I wrote my first draft as soon as I saw Gene's post and stuck to my first idea. I've computed the dark pixels too, so I never needed the CEIL function.
Gerson. Edited: 29 Sept 2011, 9:30 p.m.
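Counting the dark pixels instead of the lit ones removes the need for CEIL, because r - ceil(t) equals the integer part of r - t for non-negative values. A minimal sketch under that assumption (names invented here, and without the further restriction to r*(1 - 1/sqrt(2)) columns mentioned above):

```python
import math

def lit_pixels_dark(r: int) -> int:
    dark = 0
    for x in range(r):
        # Unlit pixels in column x of the r-by-r square: INT(r - sqrt(r^2 - x^2)).
        dark += int(r - math.sqrt(r * r - x * x))
    return 4 * (r * r - dark)

print(lit_pixels_dark(5))  # 88
```

On a calculator this means a plain INT (or IP) replaces the CEIL emulation, at the cost of one subtraction from r squared at the end.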
09292011, 09:36 PM
Quote: On a crystal unit, it would be 7.1 s.
TICKS remains constant regardless of the correction factor.
09292011, 09:39 PM
Quote:Sounds ominous. :) Kinda like dark matter: you cannot see it, but you can measure it, or so it is argued.
09292011, 09:43 PM
I tried to create a BASE 10 version, but got some errors, perhaps it is the SQRT bug you mentioned. Perhaps you can give it a try. It should run in about 1s. Edited: 29 Sept 2011, 9:44 p.m.
09292011, 10:37 PM
I've not reflashed my 34S for ages so it still has the sqrt bug. The bug, when it occurs, gives a result 1 too high and only when carry is set. So a sequence that checks carry and if it is set squares the number and compares against the original... Or reflash with the latest from the sourceforge site and get a working integer SQRT.
09302011, 09:33 AM
Quote:RTFS (Read The Fine Sources): In a non crystal equipped calculator, the PLL runs at a higher multiplication factor and the interrupts are scaled up slightly to compensate for a slower R/C clock:
if ( Xtal ) {
The code is executed from the LCD refresh interrupt occurring every 640 slow clock cycles (roughly 50 Hz). TICKS are counted when the UserHeartbeatCountDown value reaches zero. In a R/C environment, the frequency is assumed to be about 15% slower. Why all this? The slow clock outputs a 32KHz nominal clock but the rate is typically 15 to 20% less than that without the crystal. The PLL multiplies this value to derive the processor clock. I use higher values for the PLL when no crystal is installed. The periodic interrupt is not controlled by the PLL but directly by the 32KHz oscillator through the LCD controller and needs a separate correction algorithm. In short: The TICKS results should be similar with or without a crystal but they typically don't exactly match. Also the resulting execution speed should be roughly the same between devices with or without the crystal.
10012011, 02:15 PM
My CV died a few years ago, so I rely on the excellent simulator found on TOS. I had no idea the timings were off, though. Cheers, Werner
10022011, 10:17 PM
This is my third (and last) attempt. More steps, one register and slightly slower than yours.
001 LBL C
10032011, 12:58 AM
Quote: Next to last, I mean :) Nothing but the stack. Well, it looks like Egan's, only somewhat less efficient. Currently I cannot experiment with CLx (which perhaps could be useful inside the loop) because it does not disable stack lift (outdated firmware here).
Thanks, Egan, for the free RPN programming lesson!
10032011, 02:25 AM
Quote: Antepenultimate attempt, that was! A little trick to save a sum inside the loop (steps 10, 11, then final multiplication by 8 instead of 4). It's possible this has already been used, however.
001 LBL C
10042011, 02:50 AM
Stack only but for a real HP made by HP (software too) ;)
00 {48 BYTES} non working for R=1,2,3 correction in steps not numbered (add some step, but don't slow the inner loop) The trick was not to use CEIL but INT (see other posts, counting 'black' dots)
Edited: 4 Oct 2011, 2:58 a.m.
10042011, 03:42 AM
some remarks:
1. replace RCL ST X by ENTER, saving a byte
2. you need a CF 00 instruction at the beginning, of course, bringing the byte count to 10 to cater for the exceptions.
3. replace RCL ST Z with (shorter by one byte): RCL ST Z
Here's my 41C version, with only 8 bytes overhead (but a bit more stack handling as the 41 does not have register recall arithmetic)
*LBL"PIXELS"
Cheers, Werner
10042011, 08:08 AM
Thanks for the optimization, but no need for CF 00, you start the program in a default state (see the puzzle post) so it's useless ;) Olivier
10042011, 01:10 PM
RPL stack only (HP50g):
%%HP: T(3)A(D)F(,);
It uses the same formula as my last wp34s program above, probably not the best one in this case, both speedwise and sizewise.
Quote: In order to fix the same problem in the first version of the RPL program above, I simply moved the test to the beginning of the loop, by replacing DO UNTIL with WHILE REPEAT. "Endless fun", as Werner already said. Gerson.
10052011, 01:16 PM
I have been fooling around with the challenge since Sept. 25. I think I have reached the point of diminishing returns, or at least should stop spending time on it. So I'll present some results, maybe that will stop the obsessive tweaking. Using the inscribed square reduction depicted here (which I did develop on my own on Sept. 26 before looking at any of the solutions), the following listing is about the best I can come up with for wp34s. It returns 78,528,204 for a radius of 4999 in about 8.20 seconds.
001 LBL B
Summing the area in the stack appears to be a popular option, so I created the listing below. It is one step longer, and also seems to run ever-so-slightly slower, returning the result for 4999 in about 8.35 seconds.
001 LBL B
Observations: I don't claim the above to be the best possible, just likely the best I can do with reasonable effort. I have of course also worked on versions for the 15c LE, the latest of which returns the answer for 4999 in about 19 seconds. I may have to keep working on that one... ...
Edited: 6 Oct 2011, 8:42 a.m. after one or more responses were posted
10052011, 03:05 PM
I posted this a while back, but it has been archived:
001 RCL Z
Gerson has been building on it and timing it, but I think it runs in less than 8 s for r=5000. To make the program smaller and to also deal with small r I moved the circle origin to r,0 and then used y=sqrt(x*(2r-x)). That, with a bit of gymstacktics, helped reduce the number of instructions and the use of x^2. I would like to get it down to 1 sec with the use of integer mode; however, my firmware still has the isqrt bug and I cannot flash (tried 3 different machines and 3 different OSes) until I get a machine with a real serial port. Edited: 7 Oct 2011, 10:12 a.m.: Changed 0,r to r,0 in description
Edited: 7 Oct 2011, 10:12 a.m. after one or more responses were posted
10052011, 08:05 PM
You can use the builtin TICKS command instead of a chronometer. The latter adds about a 200 ms delay, no matter how fast you think you are (most likely your program takes 8 seconds or less for r=5000 as well). For times in seconds, you have to find the correction factor unless your wp34s has a builtin crystal. Or simply compare the number of TICKS returned by program A to determine which program is faster.
001 LBL C
001 LBL D
I wasn't able to time your programs because my wp34s lacks DSL (it needs a firmware update). Gerson.
10052011, 08:32 PM
With a crystal installed, taking the difference of two TICKS calls should be fairly accurate. Without a crystal, the difference of two TICKS calls will still return the same duration, it just won't necessarily match real time so well.
10052011, 09:21 PM
Quote: I recall seeing it, but did not study it closely. Not much fun to use somebody else's idea(s) to improve your own program, after all.
Quote: I get about 8.2 seconds, which is virtually the same as my versions. Not too surprising, since even though your method is more elegant, it still requires the same number of loops.
Quote: Are you sure that you did not move the origin to r,0? Either way, a clever approach.
Quote: I am running v2.2 1674. In BASE10 integer mode, when I take the square root of a nonperfect square, it returns the square root of the largest perfect square less than the original argument. Is this the behavior you are after? I tried a kludgy modification to your code to use this, and it returns 78,528,204 for a radius of 4999 in about 1.9 seconds. Unfortunately, it also returns the wrong answers for radii that produce perfect squares at certain points in the process, so it needs work. But it is fast! ...
Edited: 6 Oct 2011, 8:55 a.m. after one or more responses were posted
10052011, 09:55 PM
Quote: Try checking the carry flag (C). This is cleared if the square root was of a perfect square and set otherwise, a very handy feature of integer mode.
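The carry-flag behaviour described above can be imitated in Python to show why it makes a ceiling square root almost free in integer mode. This is an emulation written for illustration, not wp34s code; `isqrt_with_carry` is a made-up helper that returns the truncated root together with a flag that is 0 for perfect squares and 1 otherwise.

```python
import math

def isqrt_with_carry(n: int) -> tuple[int, int]:
    """Truncated integer square root plus an emulated carry flag."""
    s = math.isqrt(n)
    carry = 0 if s * s == n else 1   # clear for perfect squares, set otherwise
    return s, carry

def lit_pixels_carry(r: int) -> int:
    quadrant = 0
    for x in range(r):
        s, carry = isqrt_with_carry(r * r - x * x)
        quadrant += s + carry        # root plus carry is the ceiling
    return 4 * quadrant

print(lit_pixels_carry(5))  # 88
```

Adding the flag to the truncated root is the step that fixes the wrong answers reported for radii that hit perfect squares along the way.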
10052011, 10:13 PM
I figured there must be a feature to make this work. I'll give it a try.
10062011, 12:03 AM
Quote:Typo, yes, I moved it to r,0.
10062011, 12:04 AM
Change: SQRT
to: SQRT
10062011, 02:17 AM
I have a useful modification in mind: CEIL in integer mode should add the carry and clear it thereafter. Pauli, any ideas?
10062011, 02:20 AM
Quote:
The way it's implemented the two scenarios might differ. I tried to get TICKS right in either environment but the accuracy is dependent on the actual R/C frequency. Edited: 6 Oct 2011, 2:20 a.m.
10062011, 02:51 AM
An interesting idea. CEIL is already defined in integer mode but doesn't do anything very interesting. I'm not sure CEIL is the right command for this. A pair of add carry and subtract carry instructions would be more general and meaningful I think. We'd really do best by having add with carry and subtract with borrow instructions.
10062011, 03:37 AM
My idea was to make CEIL behave similarly in DECM and integer modes. If all functions that would normally return a non-integer result (such as 1 ENTER 2 /) set the carry, CEIL would work the same in both environments. ADC and SBB seem natural for integer mode. I will have to learn ARM assembly to port this stuff to native ARM code.
10062011, 03:46 AM
Quote: I believe most operations that can return a fractional result do set carry appropriately. If any don't (and the 16C did), please let me know. The problem here is that there are plenty of other ways to set it. Addition and subtraction set it for overflow & borrow and shifts and rotates set it as well. As things stand CEIL does work the same in both integer and floating point modes: it rounds up to the nearest integer. In integer mode this means the number doesn't change, which is correct.
Quote: Do this and we'll save more than a bit of space in the integer support; simulating the CPU flags in C is a huge consumer of code space. My ARM assembly is passable but not great.  Pauli
Edited: 6 Oct 2011, 3:47 a.m.
10062011, 07:45 AM
Yes, that is what I did for the second use of SQRT. For the first, I think you need to change: 2
Your program, with the above changes, now looks like this: 001 BASE 10
The above clocks in at right about 2 seconds for 4999. Edited: 6 Oct 2011, 8:07 a.m.
10062011, 11:28 AM
If 019 STO+ Z is changed to 019 SL 1, does it help? Edited: 6 Oct 2011, 11:28 a.m.
10062011, 12:27 PM
It is getting hard to say since I am timing things with my running watch, but it seems to get the time for 4999 down to about 1.9 seconds. ...
10062011, 05:49 PM
Shifts are faster than addition or multiplication in integer mode. Dealing with carry and overflow consumes lots of cycles. The 2 / sequence could be replaced with SR 01 which is shorter and faster.
Edited: 6 Oct 2011, 5:49 p.m.
10072011, 10:07 AM
Pauli, ... 
