HP Forums

Full Version: Program works in emulators but not on HP-15c hardware
You're currently viewing a stripped down version of our content. View the full version with proper formatting.

I have been working on a complex program for the HP-15c LE. A personal challenge of sorts. I have reached a point where the program runs correctly when I use the HP-15c PC simulator that came with the LE,and it also runs correctly in the simulator from Torsten Manz. However, it fails to give the correct answer on the HP-15c LE itself.

Unfortunately the program takes a very long time to run - many hours - and it has been difficult to identify where exactly it starts to go off the rails when running on the actual 15c hardware. For about the first 20 minutes, if I interrupt it and examine the state, it all looks OK, somewhere between 20 and 30 minutes into the computation, the state starts differing from what it should be.

The programs logic is entire based around integer math and there are no cumulative calculations where round-off error could propagate. There are however a lot of INT, FRAC, and RND operations to make the integer math work correctly.

Before I go hunting in the dark, I thought I would ask this group whether there are any known differences between the two emulators that I am using and the actual hardware that might be worth looking into as possible causes for the results I am seeing. Specifically if there are differences in how precision affects the results of the INT and FRAC instructions.

Have you checked the batteries? A fresh pair of batteries will be enough for about three hours of continuous run. Depending on their charge levels they may cause your program to fail long before that.

Hello,

hunting in the dark...

maybe it is totally not an issue of programming, maybe the cause is where else: I've no HP 15 LE, but I know that this machine is very thirsty in electrical power. You told us, that the program is running a long time, maybe the batteries aren't no more fresh enough while it's working ...

not hunting in the dark...

It would be a good idea to publish the formulars where the code depend on, the code itself and what the program have to do it.
The rest would be having time for analyzing it.

Sincerely
peacecalc

EDIT: GERSON WAS FASTER, IGNORE MY POST!

Edited: 9 May 2012, 4:11 p.m.

Quote:
GERSON WAS FASTER, IGNORE MY POST!

I think he shouldn't. Either we are both correct or we are both wrong :-)
Jeff O., who's made a detailed analysis of the capacity of the CR2032 batteries that come with the HP-15C LE, might confirm this:

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv020.cgi?read=210415

Cheers,

Gerson.

Great suggestion - thanks!

I had not thought of that but it might explain why it seemed (I did not have hard data to confirm) that on different test runs the failure point was happening at different times.

The batteries in the calculator are still the original ones from when I received it last year.

A very informative link.

Basically it means that I have a lot of optimization work left because as it currently stands I don't think my program would finish before even a fresh set of batteries ran out.

As a hobby, I've been working on and off for months just to cram it into memory. And now, another unexpected plot complication...

Marcel:

Well, I know that it is really no consolation, but in my opinion, you have succeeded in your quest!

If there is a hardware bug* that prevents you from running your simulator-verified 15C code, that is too bad, but you have accomplished the intellectual feat! Congratulations!

Bruce.

* In this case, I would call the inability to run a several-hours-long program a 'bug'.

If the batteries get too low to continue during a running program, the result will most likely be memory reset. Marcel, did your calculator experience memory loss?

Also, I'll bet some folks here would like to enter and run your program to help ferret out the issue, if you would care to post the listing.

Edited: 9 May 2012, 8:37 p.m.

I find this interesting. Do you have an external power supply to support the 15LE while your program is running? If you don't perhaps posting your program would allow someone who can put their LE on an external supply can run it and see if it then gets the right answer.

Of course, you would need to describe a test case that you're using to test your program so others can compare apples with apples.

I just had another thought. Would a DM-15cc run your program entirely on batteries alone?

Gerry


Edited: 9 May 2012, 9:18 p.m.

Bruce,

Thanks for the congratulations but unfortunately I am not quite satisfied. My original goal was not just to cram the algorithm on the 15C, but also to have it run in a reasonable amount of time and so far I have failed (quite spectacularly) on the latter.

However, now that I (think I) have managed to get the code running on the target hardware, I can focus on optimization. It was meant to be a side project to help me take my mind off work in the evenings, and as such I have no deadline.

Marcel

I don't have an external power supply for the 15c (and don't plan on putting one together). Are there people here who have one? I don't mind posting the code if someone wants to try it out.

The code is a brute force sudoku solver. Input is an unsolved sudoku and the output is a solved one. The length of time the software takes depends on the sudoku itself - as do all brute force solvers. The test cases I have been using are quite simple but any solvable sudoku will do - however some may take a very long time to solve indeed.

We are still interested in your work. It might be worthwhile to port it to other machines as well, like the 30b (untouched) or the 34S, and see how it fares. The latter has an integer mode to speed up execution time and should do better than any 15C incarnation.

For the most part, the 34S's integer mode isn't really any faster than real mode. Dealing with carry and overflow accurately consumes lots of cycles.

This is something I plan on addressing for the 43S -- two types of integers.


- Pauli

Edited: 10 May 2012, 2:47 a.m.

Quote:
The code is a brute force sudoku solver. Input is an unsolved sudoku and the output is a solved one. The length of time the software takes depends on the sudoku itself - as do all brute force solvers. The test cases I have been using are quite simple but any solvable sudoku will do - however some may take a very long time to solve indeed.

You may want to have a look at these three articles featuring and describing Sudoku programs I wrote for the HP-71B some time ago.

HP-71B Short & Sweet Sudoku Solver

HP-71B Sudoku Solver Sublime Sequel

HP-71B Sudoku Generator & Coach!

The articles are PDF documents and discuss at length the algorithms used in the featured programs, plus lots of examples and test cases that may come in handy for you to check and fine-tune your own program.

Regards from V.

Valentin.

When I undertook the project I did a little research to see if someone had tried it before and read your interesting articles.

At its root, our approaches are quite similar. However the constraints of the 15C are very very different than the 71B and it is those constraints that really dictated my approach:

Having less than 500 bytes of memory - for the program instructions and the data combined - forced me to use a very compact data representation. Unfortunately, the more compact the data representation, the larger code to access the data.

Currently, the program uses 238 bytes for all the data which I believe is about an order of magnitude less than your 71B version.

The key challenges were:

1) No bit operations on the 15C - which meant that setting and extracting bit flags required more instructions. (This is one of the key reasons for the current slowness)

2) No MOD or integer operations on the 15C - which also increased the number of instructions necessary.

3) No recursion possible; the 15c's maximum call stack is too shallow.

All of the above made it it extra difficult to squeeze the entire program into 225 or so bytes of memory available. In the end, what started out as a relatively simple design design ended up pretty ugly and inefficient, entirely because of the memory constraints.

M.

Quote:
I don't have an external power supply for the 15c (and don't plan on putting one together).

If you change your mind on this, it's simply to do. Just buy a 2 x AA battery holder with leads at Radio Shack: this is one.

and some alligator clip jumper leads, like these

Pull out the existing cells from the calculator and use the jumper wires to hook positive to the side connector in the battery compartment and negative to the bottom connector.

Two decent quality AA cells will power the calculator continuously for 150 hours.

If you need even more time, you can use 2 x D cells in
this holder.

Edited: 10 May 2012, 10:34 a.m.

Marcus,

I am just typing up the code to post.

As for other platforms: doing initial debugging of the 15C code was very difficult because every bug I found seemed to require another byte or two of memory and reworking the code to free up that byte or two caused knock-on effects. So, at some point I decided that I wanted to debug the RPN on another platform on which I did not have to worry about memory.

The only other calculator I had handy was a 33s, so the first place where I got things working was actually a 33s. I will post that code too.

The code on the 33s is a bit different because I had even fewer registers to work with, but at least I did not have to worry about code size.

It is ungodly slow. It could probably be made a lot faster on the 33s but my focus was really to validate the 15c code and not write an optimal 33s routine.

That said, the 33s code is a bit cleaner and might be a better starting base for playing around with other platforms.

M.

Quote:
When I undertook the project I did a little research to see if someone had tried it before and read your interesting articles.

Thanks for your appreciation and yes, I know that my HP-71B approach wouldn't easily translate to the HP-15C if at all.

The idea was to provide you with a number of worked-out test cases of various difficulty, some requiring recursion, some straightforwardly solvable, etc., as well as some useful ideas (bitboards, etc).

I'm glad that you were already aware of them, perhaps other people reading this thread which aren't might find the links to the articles useful.

Regards from V.

Quote:
I am just typing up the code to post.

Print-out and load/save file options would be nice features for the HP-15C Emulator. It's a good thing there is a "Check for Upgrade" option under the "Calculator" menu tab, even though it currently returns "no update is available at this time."

I have not had a chance to try out the 34s and am not familiar with its instruction set. It is a fascinating project, however, and one of these days I will try it out.

The real performance gain will be from instructions that allow bit manipulation. The 15c version simply relies on integer math because it is doing the equivalent of setting individual bits and testing them in the registers.

Any platform that provides bitwise SHIFT, OR, and AND operations will perform much better.

The other easy speedup, if bit operations are unavailable or slow, is to have a lot of memory space so that individual bit flags can be stored in integers. Right now the algorithm implemented in my 33s and 15c versions uses 324 (9*9*4) bit flags.

M.

Promise me this won't fry my calculator?

Being a software guy, I keep thinking back to the old maxim that the only thing scarier than a hardware guy installing a software patch is a software guy with a screwdriver in hand.

The last calculator operation I attempted was fixing the keyboard on my beloved 32sii and it died on the operating table.

As long as you don't hook up the wires in the wrong direction, I promise that it won't fry your calculator. I've done this (or similar) with almost every calculator I own, hundreds of them, and have yet to destroy one.

Here is the source to a sudoku solver for the 15c. On the LE it takes quite a long time to run on a typical sudoku. I have not tried it on an original 15c. While you can give it an artificially simple sudoku puzzle and see it work correctly, a reasonable puzzle taken from a newspaper, for example, can take hours to solve.

I would love if someone with a power supply on their 15c LE could try it out on a real puzzle to see if it finishes successfully. I have done virtually all my testing on non-trivial cases on the HP provided emulator.

Usage.

1) CLEAR REG

2) Store the 9 rows of the sudoku as integers in 8 through .6

eg:
480010200
STO 8
023086000
STO 9
650094030
STO .0
etc....

3) XEQ A

When it stops, the results will be registers 17 through 25. (Yes I know that that is a pain and I will change it eventually).

I will post a better documented version to Articles in the next couple of days.

Marcel


##########################################################################
# mod(x, y)
# Returns y % x. Note the RND appears necessary because of precision issues.
##########################################################################
LBL 0
/
LASTX
XY SWITCH
FRAC
*
RND
RTN

##########################################################################
# getPart(x)
# lifts x, y (loses z & t) - returns value in x - input parameter ends up in y
##########################################################################
LBL 1
ENTER
ENTER
2
6
GSB 3
RCL (i)
RTN

# imp subroutine that points the indirect register at x + reg[y]
LBL 3
+
STO I
R DOWN
RTN

##########################################################################
# setPow2()
# Preserves x, y & z, and loses t
##########################################################################
LBL 5
STO 0
1
-
2
XY SWITCH
POW
X EXCH 0
RTN

##########################################################################
# setU()
##########################################################################
LBL D
GSB 5
RCL 2
GSB B
RCL 3
GSB B
RCL 4
GSB B
RTN


# imp subroutine that does setPart(x, getPart(x) + reg[VAR_TMP1])
LBL B
GSB 1

RCL 0
F? 3
CHS
+

XY SWITCH
ENTER
2
6
GSB 3
STO (i)
R DOWN
9
+
GSB 5
RTN

##########################################################################
# putA()
##########################################################################
LBL 7
X <> 6
STO 0
RCL 2
1
7
GSB 3
RCL (i)
RCL 6
RCL 0
-
RCL 5
*
+
STO (i)
RTN

##########################################################################
# change()
##########################################################################
LBL 6
STO+ 1

RCL 1
9
/
INT
STO 2

RCL 1
9
GSB 0
STO 3

3
/
INT
RCL 2
3
/
INT
3
*
+
STO 4

8
RCL 3
-
10^X
STO 5

RCL 2
1
7
GSB 4
STO 6

RCL 2
8
GSB 4
STO 7

RTN

LBL 4
GSB 3
RCL (i)
RCL 5
/
INT
1
0
GSB 0
RTN

##########################################################################
# main()
##########################################################################
LBL A
1
CHS
STO 1

LBL 10
1
GSB 6

RCL 7
GSB 7
RCL 7
TEST 1
GSB D

8
0
RCL 1
TEST 6
GTO 10
1
CHS
STO 1

LBL E // m1
8
0
RCL 1
TEST 5
RTN // finished
1
GSB 6
RCL 7
TEST 1
GTO E
GSB 7

LBL 19 // m2
9
RCL 6
TEST 5
GTO C
1
+
GSB 7

RCL 6
GSB 5
CF 2
RCL 2
GSB 9
RCL 3
GSB 9
RCL 4
GSB 9

F? 2
GTO 19
RCL 6
GSB D
GTO E

LBL C // m3
1
CHS
GSB 6
TEST 1
GTO C
RCL 6

SF 3
GSB D
CF 3
GTO 19

LBL 9
GSB 1
RCL 0
/
INT
2
GSB 0
TEST 1
SF 2
R DOWN
R DOWN
9
+
GSB 5
RTN

Wow - #1: impressive. #2: If it takes a few hours on the 15C LE - I bet you'll be having the original 15C spend at least a half of day running the program.

Edited: 10 May 2012, 11:30 p.m.

Just to add another interesting source on the same subject - Jean-Marc Baillard´s SUDOKU solver is featured on his own web site.

Since you are using my emulator, I'd like to recommend its documentation features. You can use either the pure .15C text format or the .HTML export.

Torsten

Quote:
Right now the algorithm implemented in my 33s and 15c versions uses 324 (9*9*4) bit flags.

That would even exhaust the 112 flags in the 34S. :-O

The 34S has an integer mode with bit manipulation commands which would be helpful here. 9*4 bits fit easily in a single register so that 9 registers would be enough to store the flags if the wordsize is large enough.

If it weren't so slow, this might be task for a 16C (probably better the DM-16CC which is faster). The 16C reallocates memory depending on word size. A word size of 4 bits might be fully enough to implement the algorithm.

Do you have a pseudo code representation of your algorithm?

Half of a day will not be enough, I'm afraid.

If a "few hours" are taken to be four hours, and the 15C-LE is assumed to be 125 times faster than the 15C, then one would expect it to take three weeks! Not even the energy-frugal 15C could survive that without external power. At least the 15C low-battery shutdown would not corrupt the machine state as does that of the enhanced 15C-LE.

Marcus,

The 33s version does use the fact that you can fit 9*4 bits into a register, which is why I think it is a much cleaner version.

On the 15c, it is not that simple to access 36 bits in a register and trying to do so would slow things down and increase the code size, so I ended up using 9 more registers than I needed on the 33s. That is actually something that I am working on now - eliminating the need for those 9 extra registers. I want to use the memory freed up to put in an input and output loop, and do some speed optimizations.

The code can be *much* quicker and smaller on a machine with bit manipulation, so it might not be so slow on a 16c. Smaller code could also reduce the number of subroutine calls (which are mostly there to shrink the code size) and that would help speed it up too.

I don't have true pseudo-code, but I have a C version that I started with. There are some minor differences but it gives you a good idea of what the program does.

Unfortunately, because it is not really pseudo-code, it doesn't do a good job identifying the places where bit maniputlation would help. In other words, the integer math that replaces bit manipulation is baked in.

The 33s C code is below. I posted the 33s program earlier.

Marcel


// HP-33s version of sudoku solver. Microsoft Visual Studio 2010 dialect

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>

// C App Only - Simulated input - test cases - comment in the one you want to use
#if 0
static int start[9] = {
354000000,
0,
1700500,
800030000,
97800006,
6100090,
700000400,
8006950,
10040030,
};
#endif

#if 1
static int start[9] = {
6007000,
3005900,
40075,
40020000,
5801700,
30400060,
7062430,
0,
801000090,
};
#endif


// bit locations in the word: 36 bits used in a word.

#define BIT_X 0
#define BIT_Y 9
#define BIT_B 18
#define BIT_S 27

// The HP 15-c Registers and Flags

static bool flag[8];
static _int64 reg[26];

// definition of which Register is which variable

#define VAR_TMP1 0 // Z: temporary value used by calculations
#define VAR_I 1 // A: current index into 81 space solution array
#define VAR_X 2 // B: row of current index -
#define VAR_Y 3 // C: column of current index
#define VAR_B 4 // D: block # of current index
#define VAR_M 5 // E: power of 10 of current column in compressed form
#define VAR_CSOL 6 // F: value in solution at current index
#define VAR_CSTR 7 // G: value in starting clues at current index

#define VAR_UCOMP_START 8 // H-P: 9 elements - bit flags as to what is set
#define VAR_SOL_START 17 // Q-Y: 9 elements - input & current solution being evaluated

#define FLAG_SET 2


/************************************************************************
Not in 33s code - replaced with a RMDR operator
************************************************************************/
static _int64 mod(_int64 y, _int64 x)
{
return y % x;
}

/************************************************************************
LBL F
************************************************************************/
static _int64 getPart(int val)
{
return reg[VAR_UCOMP_START + val];
}

/************************************************************************
LBL G
************************************************************************/

static void setPart(int xind, _int64 yval)
{
reg[VAR_UCOMP_START + xind] = yval;
}

/***********************************************************************
LBL J
***********************************************************************/
static void setPow2(int val)
{
reg[VAR_TMP1] = (_int64)pow(2.0, val - 1);
}

/***********************************************************************
LBL O
***********************************************************************/
static void setU(int val)
{
setPow2(val + BIT_X);
setPart((int)reg[VAR_X], getPart((int)reg[VAR_X]) + reg[VAR_TMP1]);
setPow2(val + BIT_Y);
setPart((int)reg[VAR_Y], getPart((int)reg[VAR_Y]) + reg[VAR_TMP1]);
setPow2(val + BIT_B);

setPart((int)reg[VAR_B], getPart((int)reg[VAR_B]) + reg[VAR_TMP1]);
}

/***********************************************************************
LBL L
***********************************************************************/
static void putA(_int64 x)
{
int hoho = (int)reg[VAR_CSOL];
reg[VAR_CSOL] = x;

_int64 ov = reg[VAR_SOL_START + reg[VAR_X]];
_int64 nv = ov + reg[VAR_M] * (reg[VAR_CSOL] - hoho);
reg[VAR_SOL_START + reg[VAR_X]] = nv;

}


/***********************************************************************
LBL K
***********************************************************************/
static void change(int v)
{
reg[VAR_I] += v;

assert(reg[VAR_I] >= 0);

reg[VAR_X] = reg[VAR_I] / 9;
reg[VAR_Y] = mod(reg[VAR_I], 9);
reg[VAR_B] = reg[VAR_Y] / 3 + (reg[VAR_X] / 3) * 3;
reg[VAR_M] = (int)pow(10.0, 8.0 - reg[VAR_Y]);

reg[VAR_CSOL]= mod((reg[VAR_SOL_START + reg[VAR_X]] / reg[VAR_M]), 10);

setPow2((int)reg[VAR_Y] + BIT_S + 1);
if (mod((getPart((int)reg[VAR_X]) / reg[VAR_TMP1]), 2) > 0) {
reg[VAR_CSTR] = reg[VAR_CSOL];
}
else {
reg[VAR_CSTR] = 0;
}

}

/***********************************************************************
Not in 33s code
***********************************************************************/
static void output()
{
for(int j = 0;j < 9;j++) {
printf("%i\n",reg[VAR_SOL_START + j]);
}
printf("\n");
}


/***********************************************************************
LBL A
***********************************************************************/
int main()
{

// zero composite flags - different on 33s - just a clear
for (reg[VAR_TMP1] = 0;reg[VAR_TMP1] < 9;reg[VAR_TMP1]++){
reg[VAR_UCOMP_START + reg[VAR_TMP1]] = 0;
}

// enter data - different on 33s - user enters values
for (reg[VAR_TMP1] = 0;reg[VAR_TMP1] < 9;reg[VAR_TMP1]++){
reg[VAR_SOL_START + reg[VAR_TMP1]] = start[reg[VAR_TMP1]];
}

// reset counters to prepare for net increment to first position
reg[VAR_I] = -1;

// scan the starting data and set flags for initial values
for (int j = 0;j < 81;j++) {
change(1);
if(reg[VAR_CSOL] > 0){
setU((int)reg[VAR_CSOL]);

setPow2((int)(reg[VAR_Y] + BIT_S + 1));
setPart((int)reg[VAR_X], getPart((int)reg[VAR_X]) + reg[VAR_TMP1]);
}
}

// reset counters to prepare for solver
reg[VAR_I] = -1;

// LBL E
m1:
change(1);
if(reg[VAR_X] == 9){
output();
exit(0);
}

if(reg[VAR_CSTR] > 0) {
goto m1;
};

putA(reg[VAR_CSTR]);

// LBL X
m2:
if(reg[VAR_CSOL] == 9) {
goto m3;
}
putA(reg[VAR_CSOL]+1);

// check to see if flags are set
//isSet(reg[VAR_CSOL]);
int val = (int)reg[VAR_CSOL];
flag[FLAG_SET] = 0;

setPow2(val + BIT_X);
if (mod((getPart((int)reg[VAR_X]) / reg[VAR_TMP1]), 2) > 0)
flag[FLAG_SET] = 1;

if (flag[FLAG_SET])
goto m2;

setPow2(val + BIT_Y);
if (mod((getPart((int)reg[VAR_Y]) / reg[VAR_TMP1]), 2) > 0)
flag[FLAG_SET] = 1;

if (flag[FLAG_SET])
goto m2;

setPow2(val + BIT_B);
if (mod((getPart((int)reg[VAR_B]) / reg[VAR_TMP1]), 2) > 0)
flag[FLAG_SET] = 1;

if (flag[FLAG_SET])
goto m2;

setU((int)reg[VAR_CSOL]);
goto m1;

// LBL C
m3:
change(-1);
if(reg[VAR_CSTR] > 0)
goto m3;

val = (int)reg[VAR_CSOL];
setPow2(val + BIT_X);
setPart((int)reg[VAR_X], getPart((int)reg[VAR_X]) - reg[VAR_TMP1]);
setPow2(val + BIT_Y);
setPart((int)reg[VAR_Y], getPart((int)reg[VAR_Y]) - reg[VAR_TMP1]);
setPow2(val + BIT_B);
setPart((int)reg[VAR_B], getPart((int)reg[VAR_B]) - reg[VAR_TMP1]);


goto m2;
}

I count 199 steps. Is that correct?

Also, I assume that the POW command (at step 28 by my count) is yx, correct?


Edited: 11 May 2012, 3:38 p.m.

Yes, 199 steps. Yes, POW = yx.

Here is a link to a numbered listing:
http://dl.dropbox.com/u/12824027/suMini15cV2.pdf