Inspired by another member here to write the 8-Queens problem for the HP41 in MCODE and add it to the Benchmark, I used a sprained ankle to do just that. Below you will find the MCODE, which models the BASIC code as closely as possible. Run time is 11.05 seconds on my normal-speed CX. For reference, the RPN version runs 17+ minutes...

I used a very simple program and the 41CX internal clock to time this, so the 11.05 s includes a tiny bit of RPN overhead from the timing program. If there is a better way to time the actual code, please let me know.
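The overhead point can be illustrated off the calculator. The following is only a sketch in Python, not the RPN timing program used here: it times an empty routine with the same harness and subtracts that from the measurement. The name `timed` and the `repeats` parameter are made up for the example.

```python
import time

def timed(func, repeats=5):
    """Best-of-N wall time of func() minus the harness overhead (hypothetical helper)."""
    def empty():
        return None

    def best(f):
        best_t = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            f()
            t1 = time.perf_counter()
            best_t = min(best_t, t1 - t0)
        return best_t

    overhead = best(empty)                   # cost of calling and timing nothing
    return max(best(func) - overhead, 0.0)   # never report a negative time
```

On a run of about 11 seconds the correction is small either way, which matches the "tiny bit" of overhead mentioned above.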

It would be great if this could be added to the great Benchmark article on calculator speed.



; 8 queens problem - calc benchmark
; modelled after the BASIC version printed
; X = M
; Y = N
; A() is RAM-Regs 1-8 (Stack + Alpha)
; S = Q (RAM-Reg 9)
*0078 0B2 #0B2 ; "2"
*0079 011 #011 ; "Q"
*007A 003 #003 ; "C"
007B 1A0 [CQ2] A=B=C=0 ;
007C 158 M=C ;initialize
007D 070 N=C ;initialize
007E 268 WRIT 9(Q) ;initialize
007F 228 WRIT 8(P) ;initialize
0080 1E8 WRIT 7(O) ;initialize
0081 1A8 WRIT 6(N) ;initialize
0082 168 WRIT 5(M) ;initialize
0083 128 WRIT 4(L) ;initialize
0084 0E8 WRIT 3(X) ;initialize
0085 0A8 WRIT 2(Y) ;initialize
0086 068 WRIT 1(Z) ;initialize
0087 2A0 SETDEC
0088 198 [L40_2] C=M
0089 0A6 A<>C S&X
008A 130008 LDIS&X 008 ;
008C 366 ?A#C S&X
008D 093 JNC [SL180_2] +18 009F ;If X=8 Then 180
008E 198 [L50_2] C=M ;
008F 226 C=C+1 S&X
0090 158 M=C ;X=X+1
0091 198 [L60_2] C=M
0092 270 RAMSLCT
0093 130008 LDIS&X 008
0095 2F0 WRITDATA ;A(x) = 8
0096 278 [L70_2] READ 9(Q)
0097 22E C=C+1 ALL
0098 268 WRIT 9(Q) ;S=S+1
0099 198 [L80_2] C=M ;
009A 070 N=C ; Y=X
009B 0B0 [L90_2] C=N
009C 266 C=C-1 S&X
009D 070 N=C ;Y=Y-1
;----------------------------- Stepping Stone
009E 013 JNC [L100_2] +2 00A0
009F 12B [SL180_2] JNC [L180_2] +37 00C4
;----------------------------- Stepping Stone
00A0 0B0 [L100_2] C=N
00A1 2E6 ?C#0 S&X
00A2 333 JNC [L40_2] -26 0088 ;if Y=0 then GOTO 40
00A3 198 [L110_2] C=M ;get A(x)
00A4 270 RAMSLCT
00A5 038 READDATA ;
00A6 0A6 A<>C S&X ;A(x)
00A7 0B0 C=N
00A8 270 RAMSLCT
00A9 038 READDATA ;A(y)
00AA 246 C=A-C S&X ;T = A(x) - A(y)
00AB 013 JNC (c6_2) +2 00AD
00AC 286 C=0-C S&X ;complement if underflow
00AD 2E6 (c6_2) ?C#0 S&X ;if T=0
00AE 04B JNC [L140_2] +9 00B7 ;then goto 140
00AF 0E6 C<>B S&X ;save abs(T)
00B0 198 [L130_2] C=M
00B1 0A6 A<>C S&X
00B2 0B0 C=N
00B3 1C6 A=A-C S&X ;X-Y
00B4 0E6 C<>B S&X ;get abs(T)
00B5 366 ?A#C S&X ;if X-Y <> abs(T)
00B6 32F JC [L90_2] -27 009B ;then goto 90
00B7 198 [L140_2] C=M
00B8 270 RAMSLCT
00B9 038 READDATA ;get A(x)
00BA 266 C=C-1 S&X
00BB 2F0 WRITDATA ;A(x) = A(X) -1
00BC 2E6 [L150_2] ?C#0 S&X ;if A(x) <>0
00BD 2CF JC [L70_2] -39 0096 ;then goto 70
00BE 198 [L160_2] C=M
00BF 266 C=C-1 S&X
00C0 158 M=C ;X=X-1
00C1 198 [L170_2] C=M
00C2 2E6 ?C#0 S&X ;if x <>0
00C3 3A7 JC [L140_2] -12 00B7 ;then goto 140
00C4 35C [L180_2] R= 12
00C5 278 READ 9(Q)
00C6 2FC (c8_2) RCR 13 ;=LSHFT 1
00C7 2E2 ?C#0 @R
00C8 3F3 JNC (c8_2) -2 00C6
00C9 0E8 WRIT 3(x)
00CA 3E0 RTN
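
For readers who do not read MCODE, the control flow annotated in the listing's comments (the BASIC line numbers 40 to 180) can be sketched in Python. This is a reading of those comments, not the original BASIC program; the function name, the `n` parameter and the returned board are additions for illustration.

```python
def eight_queens_first_solution(n=8):
    """Return (s, board): the step count S and the first placement found."""
    a = [0] * (n + 1)          # a[1..n] plays the role of A(); index 0 is unused
    x = 0                      # X: current column
    s = 0                      # S: number of placements tested
    while x != n:              # 40: If X=8 Then 180
        x += 1                 # 50: X=X+1
        a[x] = n               # 60: A(X)=8
        while True:
            s += 1             # 70: S=S+1
            y = x              # 80: Y=X
            conflict = False
            while True:
                y -= 1         # 90: Y=Y-1
                if y == 0:     # 100: If Y=0 Then 40 (no conflict found)
                    break
                t = a[x] - a[y]                   # 110: T=A(X)-A(Y)
                if t == 0 or x - y == abs(t):     # same row or same diagonal
                    conflict = True
                    break
            if not conflict:
                break          # back to line 40
            while True:        # 140: backtrack
                a[x] -= 1      # A(X)=A(X)-1
                if a[x] != 0:  # 150: If A(X)<>0 Then 70
                    break
                x -= 1         # 160: X=X-1
                if x == 0:     # 170: falls through to 180 when X=0
                    return s, a[1:]
    return s, a[1:]            # 180: report S
```

Calling `eight_queens_first_solution()` returns the counter that the MCODE leaves in the X register, together with the board it stopped on.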


Congrats! Impressive result. I need to learn MCODE. :-)


Thanks for this impressive implementation of the test code. ;-)

I'm not sure about the correct CPU description and the CPU clock speed for adding it to the benchmark. Is "Nut @ 0.355 MHz" right?


The Nonpareil source code has it at 375200 Hz. Someone else will have to confirm.


Egan, Xerxes, thanks for the kind words, having the BASIC template available helped a great deal.

According to the 'Detailed Description of the CPU' document (p4, see TOS, Internal Documentation), the NUT processor runs at 340-360 kHz in the HP41 (and 200-230 kHz in the HP11C and 12C, which I did not know!)



I've updated the benchmark list with the rounded down value of 11.0 seconds due to your comment about the RPN overhead for timing.
Thanks for your nice contribution.


Very nice indeed! Did you already have a look at Raymond's contribution: HP-48GX / Saturn Assembly?

He uses the nibbles of a register for the array. You will find a documented listing at the end of this thread:
Calculator Benchmark 48GX/hp48xgcc and 50g/HPGCC3 results




Actually I had not. It is very neat indeed, a true 'RDT'!

My first version apparently used the same idea, that is, storing the array in nybbles. However, that version took 42.1 seconds, compared to 11.05 seconds for this version, due to some shortcomings of the Nut instruction set.
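
As a rough picture of the nybble idea, here is a small Python sketch in which an integer stands in for one register and each array element occupies one 4-bit digit. It is neither Raymond's Saturn code nor my earlier MCODE version; `get_nibble` and `set_nibble` are names invented for the example.

```python
def get_nibble(reg, i):
    """Read the 4-bit digit at position i (0 = least significant)."""
    return (reg >> (4 * i)) & 0xF

def set_nibble(reg, i, value):
    """Return reg with the 4-bit digit at position i replaced by value."""
    if not 0 <= value <= 0xF:
        raise ValueError("value must fit in one nibble")
    shift = 4 * i
    return (reg & ~(0xF << shift)) | (value << shift)

# Store A(1)..A(8) = 8 in nibbles 1..8 of a single "register".
reg = 0
for x in range(1, 9):
    reg = set_nibble(reg, x, 8)
```

Every access needs a shift and a mask, which gives a feel for where the extra time can go when the instruction set has no direct way to address an arbitrary digit.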



