I think without major rework, it's probably about as fast as it's going to get. The issue is that, looking at the Saturn CPU architecture, it's not a good match for the underlying ARM architecture it's emulated on, meaning translation is probably quite expensive in terms of time. Firstly, the registers in the Saturn seem to be 64-bit whereas the ARM is a 32-bit CPU. Secondly, not all of the Saturn's registers will fit inside the ARM CPU's registers, resulting in a lot of copying to and from RAM. Thirdly, the Saturn is technically a clever 4-bit, BCD-based CPU whereas the ARM is much more general purpose and aimed purely at basic 32-bit integer and non-BCD floating point.
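To make the mismatch a bit more concrete, here's a rough sketch of what a single Saturn-style decimal add might cost an emulator running on a 32-bit host. The names (`sat_reg`, `bcd_add`) and the one-digit-per-byte layout are made up for illustration; I'm not claiming this is how the 50g's emulator actually does it.

```c
#include <stdint.h>

/* Hypothetical layout: one 64-bit Saturn-style register held as 16
   nibbles, one BCD digit per byte, least significant digit first. */
typedef struct { uint8_t nib[16]; } sat_reg;

/* Decimal add of `len` digits starting at digit `start`, roughly what
   an add restricted to a field of the register would need.
   Returns the carry out of the top digit (0 or 1). */
static int bcd_add(sat_reg *dst, const sat_reg *src, int start, int len)
{
    int carry = 0;
    for (int i = start; i < start + len; i++) {
        int sum = dst->nib[i] + src->nib[i] + carry;
        carry = sum > 9;          /* decimal carry, not binary */
        if (carry)
            sum -= 10;
        dst->nib[i] = (uint8_t)sum;
    }
    return carry;
}
```

What the Saturn does as one instruction turns into a loop of up to 16 iterations here, each with a compare and a branch, and the register itself lives in RAM rather than in an ARM register.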
You could probably get some mileage with dynamic translation, i.e. on-the-fly compilation of Saturn code into ARM. I doubt it does that already, as it's damn hard to do such things.
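For anyone who hasn't seen the idea before, the caching half of dynamic translation looks roughly like the sketch below. Everything in it is hypothetical: `translate` is a stand-in that hands back a canned C function, whereas a real translator would decode the Saturn code at that address and emit ARM machine code, which is the hard part.

```c
#include <stdint.h>

#define CACHE_SLOTS 256

/* Hypothetical, heavily simplified guest CPU state. */
typedef struct cpu { uint32_t pc; uint32_t acc; } cpu;
typedef void (*block_fn)(cpu *);

typedef struct { uint32_t guest_pc; block_fn fn; int valid; } cache_entry;

static cache_entry cache[CACHE_SLOTS];
static unsigned translations;   /* counts how often we pay for translation */

/* Stand-in for a translated block of guest code. */
static void demo_block(cpu *c) { c->acc += 1; c->pc += 4; }

/* Stand-in for the expensive step (decode guest code, emit host code). */
static block_fn translate(uint32_t guest_pc)
{
    (void)guest_pc;
    translations++;
    return demo_block;
}

/* Translate a guest address once, then reuse the result on later visits. */
static block_fn lookup(uint32_t guest_pc)
{
    cache_entry *e = &cache[guest_pc % CACHE_SLOTS];
    if (!e->valid || e->guest_pc != guest_pc) {
        e->fn = translate(guest_pc);
        e->guest_pc = guest_pc;
        e->valid = 1;
    }
    return e->fn;
}

static void run(cpu *c, int blocks)
{
    while (blocks-- > 0)
        lookup(c->pc)(c);
}
```

The win comes from loops: code that runs a thousand times gets translated once and then executes at something closer to native speed.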
You can already get calculators with a native 132 MHz ARM CPU, 64 MB of RAM, 128 MB of flash, a Li-ion cell, a 320x240 colour screen, programming capability and an extensive CAS:
TI N-Spire family
I own the left and right hand devices in that picture. People love them so much I was given them and never use them other than to show people why they are horrible.
They are about as unusable as you could possibly make a device. They have so many abstractions over things (like structured documents) and have literally acres of UI before you can get to what you want that they are useless.
You can't, for example, whack together a one-liner program to calculate parallel resistances and then just apply it to the stack as you are calculating. You have to create a document, then create a program, then type the whole damn thing in via a menu system which is deeper than the ocean, then navigate back to your problem, then call it manually (no menu reference!). Ick. It's like using Windows 3.1 again.
Anyway, anti-NSpire rant aside, if you put a 1GHz+ ARM with all that stuff in it, it would probably fly, but it would also eat batteries in hours rather than weeks.
I think someone needs to build a new standard "RPN/RPL calculator virtual machine" which is kinder to mainstream CPUs than the Saturn architecture. If it's written in C, it could in theory be portable to native 50g hardware, HP 30b hardware and NSpire hardware (all ARM!) with the software environment above the VM abstraction being the same.
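As a very rough idea of the shape such a VM could take, here's a minimal sketch in portable C. The opcode set, the names (`rpn_vm`, `rpn_run`, `insn`) and the use of `double` are all my own assumptions for illustration; a real calculator VM would want decimal arithmetic and proper RPL objects rather than bare doubles.

```c
/* Hypothetical opcode set for a tiny RPN stack machine. */
enum op { OP_PUSH, OP_ADD, OP_INV, OP_SWAP, OP_END };

typedef struct { enum op op; double arg; } insn;

#define STACK_MAX 32
typedef struct { double stack[STACK_MAX]; int depth; } rpn_vm;

/* Runs `prog` until OP_END. Returns 0 on success, or -1 on stack
   underflow, stack overflow or an unknown opcode. */
static int rpn_run(rpn_vm *vm, const insn *prog)
{
    for (; prog->op != OP_END; prog++) {
        double *s = vm->stack;
        int d = vm->depth;
        switch (prog->op) {
        case OP_PUSH:
            if (d >= STACK_MAX) return -1;
            s[d] = prog->arg;
            vm->depth = d + 1;
            break;
        case OP_ADD:
            if (d < 2) return -1;
            s[d - 2] += s[d - 1];
            vm->depth = d - 1;
            break;
        case OP_INV:
            if (d < 1) return -1;
            s[d - 1] = 1.0 / s[d - 1];
            break;
        case OP_SWAP: {
            if (d < 2) return -1;
            double t = s[d - 1];
            s[d - 1] = s[d - 2];
            s[d - 2] = t;
            break;
        }
        default:
            return -1;
        }
    }
    return 0;
}

/* The parallel-resistance one-liner: INV SWAP INV + INV,
   applied to whatever two values are already on the stack. */
static const insn parallel[] = {
    {OP_INV, 0}, {OP_SWAP, 0}, {OP_INV, 0}, {OP_ADD, 0}, {OP_INV, 0},
    {OP_END, 0}
};
```

With 4 and 4 on the stack, running `parallel` leaves 2 on top. Nothing in there cares what CPU it's on, which is the whole point: the same program and the same environment above the VM could sit on any of those ARM boxes.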
Anyone fancy biting that one off?
Anyway, must go and do some work now :(