[WP34s] Questions about extended precision



#2

In my Polynomial RootSolver there are a few expressions in an iteration loop of the form r = s - a*b - c*d.

I see 3 possibilities to calculate this expression:

1) without any tricks, in the given form s - a*b - c*d

2) using complex multiplication (cpx*) in the following way: first a*b + c*d (complex) and then subtract from s

3) using cpx* twice: 1*(1*s-a*b) - c*d, i.e. first calculate 1*s-a*b and then 1*result-c*d

Currently I'm using method 2), i.e. (a*b+c*d) complex and then s - (result).

Now I would like to know whether method 3) would give higher precision than 2), or whether neither of the two complex tricks is better than the standard way 1)?

Method 1) of course needs fewer (internal) calculations (no real+imaginary part!), so if there's no difference then I would switch back to 1).

And then one more (principle) question: is there any difference between the precision (i.e. used/stored digits of a number) of the stack and the registers?

I guess there's none (?), but then I don't really see the advantage of using cpx*: what is an internal precision of 34 digits worth when I only have 16-digit numbers on the stack?

Franz


Edited: 28 Oct 2011, 8:15 a.m.


#3

Franz, we have two types of number inside the machine: registers and internal decimals. The former is of type decimal64 (or a variant thereof) which packs 16 significant digits into 8 bytes of storage. The internal format is decNumber which we normally use at 39 digits but sometimes to an even higher precision. Whenever an internal result is moved to a register, it is rounded according to the current rounding mode.
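To see this register/internal distinction off-calculator, here is a small Python sketch using the decimal module as a stand-in for the two formats (an illustration only, not the actual decNumber code):

```python
from decimal import Decimal, Context

internal = Context(prec=39)  # like the decNumber working precision
register = Context(prec=16)  # like a decimal64 register

x = internal.divide(Decimal(1), Decimal(3))  # 1/3 carried to 39 digits
stored = register.plus(x)                    # the rounding that happens on store

print(x)       # 0.333... (39 threes)
print(stored)  # 0.3333333333333333 (16 threes)
```

Any digits beyond the 16th are gone once the value lands in a register, which is exactly why keeping a chain of operations inside the internal format can matter.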

All registers are alike, 00 to 99 and all the lettered registers.

The accuracy of your proposed methods depends on whether cancellation can occur or not. Cancellation is a risk whenever numbers of similar magnitude are subtracted. The most accurate way would be to do all calculations at the internal precision. Your proposal does this for at least one of the two subtractions, but it may still be less accurate than desired due to cancellation in the second step: at least one intermediate result is always stored with 'just' 16 digits. You can, of course, define two matrices of dimensions 3x1 and 1x3 and multiply these; this will be performed at the internal precision. It's only worth the effort when you know that the result will always be close to zero, so that cancellation is an issue.
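The cancellation effect can be demonstrated with the decimal module (values contrived so that s almost equals a*b + c*d; this mimics the 16-digit register rounding versus the 39-digit internal format, it is not WP34s code):

```python
from decimal import Decimal, localcontext

def r_16digit(s, a, b, c, d):
    # every intermediate rounded to 16 digits, like results stored in registers
    with localcontext() as ctx:
        ctx.prec = 16
        return s - a*b - c*d

def r_internal(s, a, b, c, d):
    # whole expression carried at 39 digits, like the internal decNumber format
    with localcontext() as ctx:
        ctx.prec = 39
        return s - a*b - c*d

# contrived so that s almost equals a*b + c*d (heavy cancellation)
s = Decimal(2)
a = c = Decimal("1.000000001")
b = d = Decimal("0.999999999")

print(r_16digit(s, a, b, c, d))   # 0E-15 (zero: the true result is lost)
print(r_internal(s, a, b, c, d))  # 2E-18
```

With 16-digit intermediates the products round to exactly 1 and the true answer of 2E-18 vanishes completely; carried at full working precision it survives.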


#4

Thanks Marcus, that was quite detailed! :-)

The problem is that I don't know the values of all those variables in these formulas, so I have no idea whether such critical cancellations may occur or not. And I'm not keen on testing these values as well, after I've already made hundreds of tests with random polynomials.

So I guess I should indeed remove these cpx* and use the usual calculation method instead, because these iterations are done quite often and cpx* needs twice as many operations.

Franz.

#5

Nice wrap up. I saw this come in just as I was heading to bed last night and decided I was too tired to answer it properly :-) Anyway, Marcus said pretty much what I was going to.

One thing that wasn't brought up is that the matrix multiply option is worth further consideration (i.e. multiply [ -1 b d ] by [ s a c ]T). This is probably the most accurate way to perform the calculation, since everything is done in internal precision. It also doesn't incur as much overhead as the complex multiplies will. It does, however, require the values in numbered registers and the matrix descriptors loaded -- store the descriptors in registers and RCL them or, better, RCLS them.

The matrix multiply might even be faster than doing the computations on the stack -- you'll be trading a multiply by -1 for the stack operations and lots of conversions from register to internal format and back. The actual arithmetic is all carried out in internal precision either way (although the long results could make this a fraction slower). At least, it seems to be worth a test or three to be sure.
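The matrix trick can be sketched the same way: one dot product carried entirely at working precision, then a final sign flip (Python decimal again standing in for the WP34s matrix routines; the contrived values are the same heavy-cancellation case as above):

```python
from decimal import Decimal, localcontext

def dot_internal(u, v, prec=39):
    # dot product with every partial result kept at internal precision
    with localcontext() as ctx:
        ctx.prec = prec
        acc = Decimal(0)
        for x, y in zip(u, v):
            acc += x * y
        return acc

# r = s - a*b - c*d, computed as -([ -1 b d ] . [ s a c ]T)
s = Decimal(2)
a = c = Decimal("1.000000001")
b = d = Decimal("0.999999999")

r = -dot_internal([Decimal(-1), b, d], [s, a, c])
print(r)  # 2E-18
```

Because no partial sum is ever rounded to 16 digits, the tiny result survives the cancellation, which is the whole point of the matrix approach.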


- Pauli


#6

Quote:
At least, it seems to be worth a test or three to be sure.

From the viewpoint of accuracy you're right of course, but using this matrix multiplication would require storing these variables in a special register order, and then all the other calculations (and there are many!) wouldn't work anymore (because they also need a special order for the complex multiplications).

And it would dramatically increase the program size (and I'm already at more than 300 steps).

The main problem of my polynomial solver is not accuracy; the problem is that the method used (Bairstow) doesn't converge fast enough (or sometimes not at all) for some higher-degree polynomials - at least not on slow hardware like the 20b/30b. (1000 iterations on the PC emulator are no problem, but on the 20b even 100 already need about 10 seconds.)
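For readers unfamiliar with the method: Bairstow divides the polynomial by a trial quadratic x^2 - u*x - v and drives the two remainder coefficients to zero with a Newton iteration on (u, v). A minimal Python sketch of that idea (float precision, finite-difference Jacobian for brevity -- this is not Franz's WP34s program, which uses the classic second synthetic division instead):

```python
def remainder(a, u, v):
    """Remainder coefficients (b[n-1], b[n]) of dividing the polynomial
    a[0]*x^n + ... + a[n] by x^2 - u*x - v (synthetic division)."""
    b1 = b2 = 0.0  # b[k-1], b[k-2]
    for coef in a:
        b = coef + u * b1 + v * b2
        b2, b1 = b1, b
    return b2, b1

def bairstow(a, u, v, tol=1e-12, h=1e-7, max_iter=100):
    """Newton iteration on (u, v); both remainder coefficients reaching 0
    means x^2 - u*x - v is a quadratic factor of the polynomial."""
    for _ in range(max_iter):
        f1, f2 = remainder(a, u, v)
        if abs(f1) < tol and abs(f2) < tol:
            break
        # finite-difference Jacobian (illustrative shortcut)
        g1, g2 = remainder(a, u + h, v)
        k1, k2 = remainder(a, u, v + h)
        j11, j12 = (g1 - f1) / h, (k1 - f1) / h
        j21, j22 = (g2 - f2) / h, (k2 - f2) / h
        det = j11 * j22 - j12 * j21
        u -= (j22 * f1 - j12 * f2) / det
        v -= (j11 * f2 - j21 * f1) / det
    return u, v

# (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6; expect factor x^2 - 3x + 2
u, v = bairstow([1.0, -6.0, 11.0, -6.0], u=2.8, v=-1.8)
print(round(u, 6), round(v, 6))  # 3.0 -2.0
```

The convergence trouble Franz describes shows up exactly here: for high-degree polynomials the Newton step on (u, v) can stall or diverge unless the initial guess is good.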

The program already works (almost) perfectly for polynomials of degree up to 10, but with increasing degree (20-30) it happens quite often that the iteration doesn't converge, no matter which methods I use for the initial guesses (and I've tried lots of them!).

So I'll keep looking for better ideas for polynomial degrees higher than 10 or 15 ...

Franz


#7

Hmmm, sorry I can't think of anything to suggest here :-(

- Pauli

#8

Marcus, one other question:

Do you have any idea how a WP34s program could find out whether it is running on a real HP-20b/30b or in the PC emulator?

The only thing I can imagine is running a short loop of some commands and measuring the time with TICKS. But which commands should I use, and how many times should the loop repeat, so that at least 1-2 TICKS elapse on a real 20b (whereas on the emulator it would of course be 0 TICKS)?

With this method I could use different default values in the program for a 20b and the emulator.

Or is there any other trick I could use for this purpose?

Franz

Edited: 28 Oct 2011, 4:42 p.m.


#9

Just measure the ticks consumed by your program and terminate the loop depending on the elapsed number of ticks instead of a fixed counter. We don't have a command to identify the emulator, and even if we had one, you'd never know how fast the underlying system is.
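The suggested pattern -- stop on elapsed time rather than a fixed iteration count -- looks like this in Python form (time.monotonic standing in for TICKS; an illustrative sketch, not WP34s keystrokes):

```python
import time

def iterate_with_budget(step, state, budget_seconds):
    """Run `step` repeatedly and stop when the time budget is spent,
    instead of after a fixed number of iterations."""
    deadline = time.monotonic() + budget_seconds
    count = 0
    while time.monotonic() < deadline:
        state = step(state)
        count += 1
    return state, count

# on fast hardware many iterations fit into the budget, on slow hardware
# only a few -- but the wall-clock time spent is roughly the same either way
state, count = iterate_with_budget(lambda x: x + 1, 0, budget_seconds=0.05)
```

This sidesteps the hardware-detection question entirely: the program adapts to whatever speed the host runs at.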

