Posts: 304
Threads: 32
Joined: Nov 2005
I ran some benchmarks for the HP-41CL as described in the Calculator Benchmark article. You'll find the results in PDF format here. As expected, the MCODE benchmark shows a linear speed-up because there is no display output (the 41CL has to switch back to its original speed for some I/O operations). With keystroke programming in TURBO50 mode, the 41CL is as fast as a Commodore 64 programmed in BASIC. With MCODE in TURBO50 mode, the 8-Queens problem is solved in 1/4 sec!
Posts: 136
Threads: 7
Joined: Jun 2007
I have been curious about the difference between the speed modes since discovering the 41CL some time ago. Thank you for your effort and the nice contribution.
Posts: 304
Threads: 32
Joined: Nov 2005
I have to thank you for updating the article.
Posts: 67
Threads: 5
Joined: Apr 2011
Even though "TurboX" means "normal speed", TurboX could be misleading to the unaware!
I propose, instead of:
17:58 HP-41CL Keystroke / RPN / TurboX Mode
changing it to:
17:58 HP-41CL Keystroke / RPN / Turbo off
Posts: 136
Threads: 7
Joined: Jun 2007
I think you are right, but in the list "Turbo" usually stands for a speed-up by hardware modification,
so I changed it to "TurboX Mode x1.0" to make it clearer what TurboX means.
Posts: 239
Threads: 9
Joined: Dec 2010
Hi Xerxes,
Here's a couple of additions to your nice benchmark doc, in, I hope, ready-to-paste format:
- 4.36 ND1 (v1.4) RPL (UserRPL from HP-50g)
- 2.78 ND1 (v1.4) RPL+
- 0.00338 ND1 (v1.3.9) JavaScript
==========================================================================================
RPL+:
-------
<< 8 =r
0 =:s =:x =y
[] =a
DO
r =a[++x]
DO
++s
x =y
WHILE y 1 > REPEAT
a[x] a[--y] - =:t
IF 0 == t ABS x y - == OR THEN
0 =y
WHILE a[x] -- =:a[x] 0 ==
REPEAT
--x
END
END
END
UNTIL y 1 == END
UNTIL x r == END
s
>>
==========================================================================================
JavaScript:
[taken almost verbatim from C code]
-------------------------------------
function() { /*as is*/
    var r=8, s=0, x=0, a = [];
    do{
        a[++x]=r;
        do{
            ++s;
            var t, y=x;
            while(y>1)
                if (!(t=a[x]-a[--y]) || x-y==Math.abs(t)){
                    y=0;
                    while(!--a[x])
                        --x;
                }
        } while(y!=1);
    } while(x!=r);
    return s;
}
Edited: 21 June 2011, 7:54 a.m.
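For readers who want to see what the loop above actually computes, here is a minimal sketch that wraps the same search in a named function and also returns the board. The names `firstSolution` and `isValidPlacement` are made up for this illustration and are not part of ND1 or the benchmark article; the search loop itself is unchanged.

```javascript
// Same search as the benchmark, but also returning the board it ends on.
// a[x] holds the row of the queen in column x (1-based); s counts placements tried.
function firstSolution() {
    var r = 8, s = 0, x = 0, a = [];
    do {
        a[++x] = r;                       // start the next column at row r
        do {
            ++s;
            var t, y = x;
            while (y > 1)
                // conflict: same row (t == 0) or same diagonal (|t| == column distance)
                if (!(t = a[x] - a[--y]) || x - y == Math.abs(t)) {
                    y = 0;                // force a retry of the inner do-loop
                    while (!--a[x])       // move this queen down; backtrack when rows run out
                        --x;
                }
        } while (y != 1);
    } while (x != r);
    return { steps: s, rows: a.slice(1) };
}

// Hypothetical helper: checks that no two queens share a row or a diagonal.
function isValidPlacement(rows) {
    for (var i = 0; i < rows.length; i++)
        for (var j = i + 1; j < rows.length; j++)
            if (rows[i] === rows[j] || Math.abs(rows[i] - rows[j]) === j - i)
                return false;
    return true;
}

var result = firstSolution();
console.log(result.steps, result.rows, isValidPlacement(result.rows));
```

The benchmark only reports the step counter, so the board check is just a sanity test that a port of the loop still ends on a legal placement.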
Posts: 136
Threads: 7
Joined: Jun 2007
Hi Oliver,
Thank you for this interesting comparison, but if I'm right, the ND1 is an app and its speed depends on the device used.
Please consider that the benchmark is specifically for physical calculators and calculator-like pocket computers of the past.
I think comparing software or emulated calculators needs its own benchmark list, with tests on different hardware.
Thank you for your understanding.
Posts: 239
Threads: 9
Joined: Dec 2010
Hi Xerxes,
Yes, ND1 is an app. I understand what you're saying and did notice that almost all results were for HW calcs. I saw a C64 result in there, which encouraged me to suggest this addition anyway.
But that's ok, I understand.
(I'll reapply after I figure out how to get a JavaScript VM running on an HP-30b. Ok, that's a joke. I think.)
Cheers.
Posts: 136
Threads: 7
Joined: Jun 2007
:)
The C64 stands for the missing Panasonic HHC with the Microsoft Basic ROM, because I suspect an equivalent speed. AFAIK there was also Snap Basic and Snap Forth available for the HHC.
Posts: 67
Threads: 5
Joined: Apr 2011
Very nice contribution, thank you both!
Posts: 3,229
Threads: 42
Joined: Jul 2006
Since we're revisiting this benchmark: the WP 34S runs it in 2.3 seconds in real mode and 2.1 seconds in integer mode.
The program is the same either way:
001: LBL B
002: CLREG
003: 8
004: STO 11
005: RCL 11
006: x=? 00
007: SKIP 22
008: INC 00
009: STO ->00
010: INC 10
011: RCL 00
012: STO 09
013: DEC 09
014: RCL 09
015: x=0?
016: BACK 11
017: RCL ->00
018: RCL- ->09
019: x=0?
020: SKIP 05
021: ABS
022: RCL 00
023: RCL- 09
024: x<>? Y
025: BACK 12
026: DSZ ->00
027: BACK 17
028: DSZ 00
029: BACK 03
030: RCL 10
031: RTN
- Pauli
Posts: 136
Threads: 7
Joined: Jun 2007
The fastest keystroke programmable! Thank you for testing.
Posts: 239
Threads: 9
Joined: Dec 2010
10x faster than an HP-12C ARM in RPN and 40x (!) faster than a 50G in UserRPL? Wow!
I guess RPL's poor showing comes from this code being more about interpreting control structures than about computing.
Posts: 3,283
Threads: 104
Joined: Jul 2005
Quote:
10x faster than a HP-12C ARM in RPN
I was thinking that using SKIP and BACK would be responsible for the speed advantage of the WP 34S over other designs, but the 12C ARM is using the same hardware and direct addressing, no labels. Pauli must have done something right, I guess.
EDIT: On second thought, isn't the 12C ARM based on an emulation layer that mimics the old Voyager processor and runs the original firmware almost untouched? This would explain why it's slower than a native implementation.
Edited: 20 June 2011, 4:07 p.m.
Posts: 3,229
Threads: 42
Joined: Jul 2006
Quote: I was thinking that using SKIP and BACK would be responsible for the speed advantage of WP 34S over other designs but the 12C ARM is using the same hardware and direct addressing, no labels.
I haven't tested but the long backward jumps might be faster using a GTO/LBL pair. I suspect that it is the distance of search that is important since both BACK/SKIP and GTO load every instruction from program memory. The LBL instruction executes very rapidly since it doesn't even call a worker routine.
If I could think of a better way to handle errors we'd get a fairly nice speed-up. The current method saves the stack and volatile state before executing every instruction and restores it if an error occurs. This is quite expensive time-wise, but it allows a complete restoration with a minimum of code. The memory copy routine is optimised for space, not speed, which will also hurt a bit here.
Likewise, I've got lots of checks for illegal op-codes in the instruction dispatch & execution paths. Take these out and we'd get a small speed up. However, I'm not going to since the chance of an error causing havoc would increase too much.
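To make the trade-off described above concrete, here is a minimal sketch of save-before-execute error handling with an op-code check in the dispatch path. It is a toy interpreter written in JavaScript for illustration only: the machine layout, the `ops` table and the `step` function are all assumptions, not the WP 34S firmware.

```javascript
// Toy 4-level stack machine; every name here is hypothetical.
function makeMachine() {
    return { stack: [0, 0, 0, 0], pc: 0 };   // stack is [X, Y, Z, T]
}

var ops = {
    push: function (m, arg) { m.stack.unshift(arg); m.stack.pop(); },
    div: function (m) {
        var x = m.stack[0], y = m.stack[1];
        m.stack[0] = -1;                       // state is already modified when the error hits
        if (x === 0) throw new Error("divide by zero");
        m.stack.splice(0, 2, y / x);           // drop X and Y, push the quotient
        m.stack.push(m.stack[m.stack.length - 1]); // replicate T
    }
};

function step(m, instr) {
    // Illegal op-code check before dispatch: costs time on every instruction.
    if (!Object.prototype.hasOwnProperty.call(ops, instr.op))
        return { ok: false, error: "illegal op-code" };
    // Snapshot before every instruction: simple, but the copy is paid even on success.
    var saved = { stack: m.stack.slice(), pc: m.pc };
    try {
        ops[instr.op](m, instr.arg);
        m.pc += 1;
        return { ok: true };
    } catch (err) {
        m.stack = saved.stack;                 // complete restoration from the snapshot
        m.pc = saved.pc;
        return { ok: false, error: err.message };
    }
}

var m = makeMachine();
step(m, { op: "push", arg: 6 });
step(m, { op: "push", arg: 0 });
console.log(step(m, { op: "div" }), m.stack);  // error reported, stack unchanged
```

The instruction handlers never need their own undo logic, which is the "minimum of code" benefit; the price is one copy of the state per executed instruction.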
The 12c is emulating the old (NUT?) processor.
Quote: Pauli must have done something right, I guess.
I hope I've done more than one thing right in the firmware :-) Minimising the instruction decode/execute overhead was deemed desirable from the start.
- Pauli
Posts: 239
Threads: 9
Joined: Dec 2010
I've collected a few benchmarks by now, and compared UserRPL speed to JavaScript. With this benchmark, the speed difference is 800x, when it normally is ~20x.
The ratio from ND1 to 50g is ~20x for UserRPL, and ~30x for RPL+ vs. UserRPL, in line with usual results.
I conclude that this benchmark, so far, is an outlier / worst case for UserRPL.
An internal build runs the RPL+ code in 0.008 seconds (that is, a whopping 300x faster than the current ND1, and ~10,000x faster than the HP-50g), employing code-morphing to JavaScript. I haven't fully implemented this yet, but this nice result provides some motivation to push ahead with the work.
So, thank you for this worst-case-for-RPL benchmark... ;-)
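The ratios quoted above can be rechecked from the timings posted in this thread. The sketch below only does that arithmetic; the HP-50g time is not a measurement but an estimate derived from the "~20x" factor mentioned above, and the variable names are made up for the example.

```javascript
// Timings in seconds, as posted in this thread.
var nd1UserRpl = 4.36;        // ND1 (v1.4) RPL
var nd1RplPlus = 2.78;        // ND1 (v1.4) RPL+
var nd1JavaScript = 0.00338;  // ND1 (v1.3.9) JavaScript
var internalRplPlus = 0.008;  // internal build, RPL+ morphed to JavaScript

// Assumption: HP-50g UserRPL is ~20x slower than ND1 UserRPL (factor quoted above).
var hp50gEstimate = nd1UserRpl * 20;

var rplPlusVsJavaScript = nd1RplPlus / nd1JavaScript;   // roughly 800x
var internalVsCurrent = nd1RplPlus / internalRplPlus;   // roughly 350x
var internalVsHp50g = hp50gEstimate / internalRplPlus;  // roughly 11,000x

console.log(rplPlusVsJavaScript.toFixed(0),
            internalVsCurrent.toFixed(0),
            internalVsHp50g.toFixed(0));
```

The estimated 50g time of about 87 seconds also fits the "40x faster than a 50G" remark made about the 2.1-second WP 34S result earlier in the thread.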
Posts: 1,545
Threads: 168
Joined: Jul 2005
I must admit yet again. 2.1 seconds. Wow.
Posts: 3,229
Threads: 42
Joined: Jul 2006
It could be faster if we had the code space :-)
- Pauli