HP Forums

Full Version: Three new Datafile articles on-line

Hi all,

    At long last I've updated my calc web site to include three of my recent Datafile articles, now available on-line in PDF format and, of course, free.

    They're as follows:

    • HP-71B: Fantastic FOUR

        8-page article, which discusses how you can use Fast Fourier Transforms (FFT) to multiply two very high-precision numbers faster than with any other algorithms currently known. Includes a very short, 10-line subprogram (can be formatted to just 8 lines) which will accept two numbers up to 1024 digits long and output their product (up to 2048 digits long). Running time is nearly linear in the size of the inputs, compared to the quadratic time of the usual 'school' algorithm (for instance, multiplying two 512-digit numbers using FFT is already 4 times faster on the HP-71B than with the usual method). Examples included. (A rough Python sketch of the same idea appears after this list.)

    • Mean Matrices

        11-page article, which introduces some new specially designed matrices I created myself ("Albillo Matrices", AM for short :-) ), which are extremely troublesome to handle for the purposes of system solving, matrix inversion, or determinant computation. Unlike Hilbert matrices, which can't be represented exactly in finite floating point models, my matrices include only small, 2-digit positive integer values, yet even for small dimensions (7x7, say) they're nearly intractable by anything short of exact, arbitrary-precision arithmetic. Examples are given and discussed in full detail, and exact Mathematica-produced results are also included for comparison purposes.

    • Long Live the HP-16C !

        5-page article, which commemorates that wonderful "Programmer's" model, the Voyager-series HP-16C. It includes a short and interesting program that highlights some unique functionalities of this model. HP-71B recursive and non-recursive versions are provided as well for the sake of comparison.
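
    The HP-71B subprogram itself is in the article; as a rough illustration of the same idea in Python/numpy (the function name fft_multiply and the block size are choices made just for this sketch), one can split the operands into base-10000 blocks, convolve the blocks with an FFT, round, and propagate the carries:

  import numpy as np

  def fft_multiply(a: int, b: int, base: int = 10000) -> int:
      """Multiply two non-negative integers by convolving their base-10000 blocks with an FFT."""
      def blocks(n):
          d = []
          while n:
              n, r = divmod(n, base)
              d.append(r)                      # little-endian blocks of up to 4 decimal digits
          return d or [0]
      da, db = blocks(a), blocks(b)
      size = 1
      while size < len(da) + len(db):          # pad to a power of two large enough for the product
          size *= 2
      fa = np.fft.rfft(np.array(da + [0] * (size - len(da)), dtype=float))
      fb = np.fft.rfft(np.array(db + [0] * (size - len(db)), dtype=float))
      conv = np.fft.irfft(fa * fb, size)       # linear convolution of the two block vectors
      coeffs = [int(round(c)) for c in conv]   # rounding is exact for operands of a few thousand digits
      out = 0
      for c in reversed(coeffs):               # evaluate sum(coeffs[i] * base**i); big integers absorb the carries
          out = out * base + c
      return out

  print(fft_multiply(2**333, 3**210) == 2**333 * 3**210)   # True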

    If you're interested in any of them, you can find and download them at the link above; hopefully they'll make nice HP-calc-related reading for this weekend. Any and all comments welcome. Enjoy !

Best regards from V.

Allow me to wax enthusiastic about Long Live the HP-16C!. To use a vulgar colloquialism, I've been hanging out for this article since you first mentioned it a couple of years ago. I found it to be quite entertaining--just the thing for a cold, rainy autumn evening. It proved helpful too since it revealed a flaw in my 16C simulator. Thank you.

Cameron

Hi, Cameron:

    Thanks for your kind feedback, much appreciated. I'm very glad you like the HP-16C article, it's small but neat! :-)

    I'm also glad that it proved useful to you in pinpointing that flaw in your emulator; there are so few HP-16C programs out there that getting enough samples to thoroughly test emulators is a major task in itself.

Best regards from V.

You are welcome Valentin.

I have just finished uploading a new release of the simulator to my HP-16C tribute site. This version provides implementations of both GSB I and GTO I. The program in Valentin's article now runs correctly (and swiftly) in the simulator (as can be seen in this picture of my fingers holding my battered iPAQ).

There is a minor typo in Valentin's listing. The mnemonic on line 56 should read "g RTN" instead of "g X<>(i)".

Cameron

PS: Valentin, I emailed you the details of the typo (using the BBS email link). It occurs to me that your spam filter probably discards emails that don't have a magic phrase in the subject line. Apologies if you missed it.


In an earlier thread Rodger Rosenbaum noted that the LSQ routine was another way to solve linear equations on the HP-49. I ran your original problem on the HP-49 using LSQ. The following table compares the results with those obtained with other methods:

     INV A * B     B/A           LSQ              RREF

    -1.076        -0.6472        0.994696139952   1.00148007239
     0.999786      1.00110019    1.00000066865    0.999999813411
     0.407        -0.7415        0.995270976887   1.00131966086
     0.144         0.6554        0.995243958251   1.00132720057
     0.999124      0.9996398     0.999996718314   1.00000091517
     0.553         1.1592        1.0047559686     0.998672819845
     0.61          3.5385        1.0099505863     0.997223232158

The LSQ results are only slightly degraded from the RREF results, and much better than the results from using INV A * B and B/A. I also used the LSQ routine to solve the modified problems which Rodger proposed, i.e., (1) with B(1) changed from 45.3 to 45.4, (2) with B(3) changed from 45.0 to 45.1, and (3) with A(3,7) changed from 6.7 to 6.8, A(7,7) changed from 6.1 to 6.0, and B(3) changed from 45.0 to 45.1. Finally, because your discussion mentions solutions with Hilberts, I ran a 7x7 sub-Hilbert with all ones as the vector and a modified 7x7 sub-Hilbert where each element of the matrix was multiplied by 360360. The following table compares the RMS relative errors for the various problems solved with the various routines:
  Problem                INV A * B    B/A          LSQ          RREF

  Original               0.906        1.327        5.274E-03    1.472E-03
  1                      1.094E-03    1.065E-03    7.979E-04    8.281E-05
  2                      1.082E-03    1.082E-03    7.863E-04    8.162E-05
  3                      1.573E-10    5.477E-13    7.560E-13    0
  Sub-Hilbert            1.343E-04    1.343E-04    1.333E-04    1.334E-04
  Modified sub-Hilbert   1.593E-08    1.129E-08    2.178E-07    6.956E-08

What this all says to me is that if I was really going to use the HP-49 to solve linear equations, I would use the LSQ routine because it is more convenient. It yields results comparable to those from the RREF routine, but accepts input in the matrix/vector format that I am used to using and delivers the results in vector form rather than in the augmented matrix form which comes from the RREF routine.
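
Since Valentin's original 7x7 system is not reproduced in this thread, readers without a 49 at hand can get a feel for the comparison with a small Python/numpy stand-in (a sketch only; the names below are not the HP-49 routines). It uses the 7x7 Hilbert matrix mentioned above with an all-ones exact solution, and np.linalg.inv, np.linalg.solve and np.linalg.lstsq play roughly the roles of INV A * B, B/A and LSQ; RREF has no direct numpy counterpart, so it is omitted. The RMS relative error is computed the same way as in the tables.

  import numpy as np

  n = 7
  H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])   # 7x7 Hilbert matrix
  x_exact = np.ones(n)
  b = H @ x_exact                      # right-hand side chosen so the exact solution is all ones

  def rms_rel_err(x):
      return np.sqrt(np.mean(((x - x_exact) / x_exact) ** 2))

  x_inv = np.linalg.inv(H) @ b                     # analogue of INV A * B
  x_slv = np.linalg.solve(H, b)                    # analogue of B/A (LU-based solve)
  x_lsq = np.linalg.lstsq(H, b, rcond=None)[0]     # analogue of LSQ (least squares)

  for name, x in (("INV A * B", x_inv), ("B/A", x_slv), ("LSQ", x_lsq)):
      print(f"{name:10s} RMS relative error = {rms_rel_err(x):.3e}")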



Quote:
In an earlier thread Rodger Rosenbaum noted that the LSQ routine was another way to solve linear equations on the HP-49. I ran your original problem on the HP-49 using LSQ. [...] What this all says to me is that if I was really going to use the HP-49 to solve linear equations, I would use the LSQ routine because it is more convenient.


Palmer, I'm surprised you're still getting the result you do for the B/A method on the original system. On my HP48G and HP49G+, I get the exact result:

[[ 1 1 1 1 1 1 1 ]]T

Have you tried making sure that both the A matrix and B vector are type 3 objects before doing B/A?

Also, here's a little program to accept B and A on the stack and do the RREF solution with the result a column matrix on the stack. It's a little different from an earlier version in another post. The other one assumed that the matrix was 7x7; this version works for any size matrix, although the B vector must have the same number of rows as the matrix:

SRREF

<< DUP SIZE OBJ-> DROP -> R C        @ R = rows, C = columns of A
   << SWAP C 1 + COL+ RREF           @ append B as column C+1 and row-reduce
      C 1 + COL- SWAP DROP           @ pull out the solution column, discard the rest
      R 1 2 ->LIST RDM               @ redimension the vector as an R x 1 column matrix
   >>
>>

I would beware of drawing conclusions about which of these methods is most accurate as a result of its performance on Valentin's highly ill-conditioned problem. The condition number of his A matrix is about 3E12, and most of the methods for solution are working down in the noise with this matrix.

Try reducing the condition number by a factor of 10 by changing the {1 2} element of the A matrix from 7.2 to 7.1 (in the original system), and the {1 1} element of B to 45.2; the change to B is to keep the exact solution as [[ 1 1 1 1 1 1 1 ]]T. Then solve the system by the various methods and compare results.

Reduce the condition some more, by a factor of about 473, by changing the {6 2} element of A from 7.2 to 7.1, and changing the {6 1} element of B from 24.9 to 24.8, then solve this system by the various methods and compare results.

I would also recommend that, instead of computing the RMS error as you've been doing, you compute the (Euclidean) distance of each solution vector from the exact result (which is [[ 1 1 1 1 1 1 1 ]]T) by putting [[ 1 1 1 1 1 1 1 ]]T on the stack (with a solution vector on the stack above it) and pressing - ABS. This is the norm of the error (the difference), a measure more standard in linear algebra calculations of this sort.

Rodger was right. When I applied ->NUM to my matrices and reran the B/A solution for Valentin's original problem, I got all ones. The following table is a corrected version of the table I presented earlier in this thread; the only change is the ones for the B/A solution.

     INV A * B     B/A    LSQ              RREF

    -1.076         1.     0.994696139952   1.00148007239
     0.999786      1.     1.00000066865    0.999999813411
     0.407         1.     0.995270976887   1.00131966086
     0.144         1.     0.995243958251   1.00132720057
     0.999124      1.     0.999996718314   1.00000091517
     0.553         1.     1.0047559686     0.998672819845
     0.61          1.     1.0099505863     0.997223232158

The LSQ results are only slightly degraded from the RREF results. Both the LSQ and RREF results suffer in comparison with the B/A result once I do the ->NUM thing. The INV A * B result is distinctly inferior.

I also used the LSQ routine to solve the modified problems which Rodger proposed, i.e., (1) with B(1) changed from 45.3 to 45.4, (2) with B(3) changed from 45.0 to 45.1, and (3) with A(3,7) changed from 6.7 to 6.8, A(7,7) changed from 6.1 to 6.0, and B(3) changed from 45.0 to 45.1. Finally, because Valentin's discussion mentions solutions with Hilberts, I ran a 7x7 sub-Hilbert with all ones as the vector and a modified 7x7 sub-Hilbert where each element of the matrix was multiplied by 360360. The following table compares the RMS relative errors for the various problems solved with the various routines:

  Problem                INV A * B    B/A          LSQ          RREF

  Original               0.906        0            5.274E-03    1.472E-03
  1                      1.094E-03    1.098E-03    7.979E-04    8.281E-05
  2                      1.082E-03    1.082E-03    7.863E-04    8.162E-05
  3                      1.573E-10    5.477E-13    7.560E-13    0
  Sub-Hilbert            1.343E-04    1.343E-04    1.333E-04    1.334E-04
  Modified sub-Hilbert   1.593E-08    1.129E-08    2.178E-07    6.956E-08

where the only changes relative to the values in the earlier table in this thread are the zero for the B/A solution to Valentin's original problem and a change from 1.065E-03 to 1.098E-03 for the B/A solution to Rodger's first modification.

Rodger suggested that I change from calculating the rms relative error to the norm. I could do that, but I don't think I will at this time. All of my previous calculations of comparative capability of a broad range of machines and software going back to the early 1980's used rms relative error as the measure of capability. To change now would put me in an apples to oranges comparison situation unless I went back and redid all the old results. I probably won't be doing any more HP-49 work unless I get a real HP-49 which at current prices is a budget-buster for me. I do think the series of threads on the HP-49 was a worthwhile learning experience for most of us.

I don't propose to blindly use Valentin's linear equation problem as the sole measure for selection of a solver. I do say that, from a practical standpoint, if one solver can solve his problem successfully, and another cannot, then in the absence of other problems which yield contrary results I would probably decide to use the solver which can solve his problem successfully. Such a solver may be an overkill for many problems; however, such a solver may provide some comfort.

Valentin writes that work with Hilberts leads to a situation in which meaningful comparisons cannot be made. I disagree. If one goes back to old threads, looks at the determinant solutions for Hilberts and for Valentin's mean matrices, and lists the various machines and software in order according to the accuracy of the result, one will find that the order using Hilberts is very similar to the order using mean matrices. The symmetry of the Hilberts and the large differences in the magnitudes of the elements of the Hilberts mirror the condition of the matrices which occur in classical least squares solutions.

Palmer said:


I don't propose to blindly use Valentin's linear equation problem as the sole measure for selection of a solver. I do say that, from a practical standpoint, if one solver can solve his problem successfully, and another cannot, then in the absence of other problems which yield contrary results I would probably decide to use the solver which can solve his problem successfully. Such a solver may be an overkill for many problems; however, such a solver may provide some comfort.

What I'm trying to say is that the results we're getting from the various methods of solution of this highly ill-conditioned problem are mostly noise and accidents, and we really can't draw conclusions about the performance of any of them as a result of testing them on Valentin's matrix. None of them is trustworthy with such a high condition number.

Consider this: the original A matrix has the number 7.2 as its {1 2} element. If you add 1/90 to this element, the matrix becomes exactly singular. Notice that 1/90 expressed as a decimal is .01111111...; this means that if you simply append the numeral 1 to the 7.2 repeatedly, the modified matrix's determinant approaches zero more and more closely as you add 1's. (The determinant depends linearly on that single element, and each appended 1 cuts the element's distance from 7.2 + 1/90 by a factor of ten, so the determinant shrinks by a factor of ten as well.) Here is a table of determinants and condition numbers for the A matrix with a succession of 1's added to the {1 2} element. These values were computed with Mathematica using very high precision; the HP49G+ in approximate mode can only get a few digits right for the condition number.

   {1 2} element   Exact determinant   Condition number

   7.2             1/1E7               3.17397549713E12
   7.21            1/1E8               3.16183337880E13
   7.211           1/1E9               3.16061908862E14
   7.2111          1/1E10              3.16049765882E15

Now, an interesting thing happens. For the case where the {1 2} element of the A matrix is 7.211, and the {1 1} element of the B matrix is 45.311, the LSQ method gets a solution of [1 1 1 1 1 1 1]T, but the B/A and RREF methods get a substantially worse result. Shall we therefore conclude that the LSQ method is much better than the others? No. It is just an accident, numerical happenstance, so to speak, that we get an exactly correct result with the LSQ method for a system with a condition number 100 times larger than the original system.

The same thing happens with the B/A method when applied to the original problem. The fact that it gets [1 1 1 1 1 1 1]T is just an accident. If we transpose the original A and change B to [19.5 58.7 36.3 35.9 47. 33.3 28.9] and solve with B/A, we no longer get [1 1 1 1 1 1 1]T. Why not? It seems like we ought to; it's the same size problem, with the same condition number. It's because the result we get when we solve a system with a condition number of 3E12 (or greater) is contaminated with a large amount of noise, and sometimes a result may be very close to the exact solution just by chance.

If we want a valid determination of the performance of the various methods, we need to solve many systems of similar condition number, so that we aren't fooled by an accidental good result. If we do this, we will find that the HP49G+, like any computing device, is bound by some rules. The rule of thumb is that a matrix computation will lose LOG10(k) digits of accuracy, where k is the condition number of the problem. The INV(A)*B method is the worst, because the inverse is returned to the stack with its elements rounded to 12 digits, and then this reduced accuracy inverse is used to multiply B. The other methods keep intermediate results to 15 digit accuracy, and only round to 12 digits when delivering the final result.
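
The rule of thumb is easy to see on a PC as well. Here is a minimal numpy sketch (doubles carry roughly 16 digits rather than the 49G+'s internal 15, so the budget is about 16 - LOG10(k)) that solves Hilbert systems of growing order and compares the digits obtained with the log of the condition number:

  import numpy as np

  for n in range(4, 11):
      H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])   # n x n Hilbert matrix
      x_exact = np.ones(n)
      b = H @ x_exact
      x = np.linalg.solve(H, b)
      k = np.linalg.cond(H)                          # 2-norm condition number
      err = np.linalg.norm(x - x_exact)              # norm of the error in the computed solution
      digits = -np.log10(err) if err > 0 else float("inf")
      print(f"n = {n:2d}   log10(cond) = {np.log10(k):4.1f}   correct digits = {digits:4.1f}")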

For the case where the {1 2} element of A becomes 7.2111, the HP49G+ thinks the determinant is exactly zero. Make the {1 1} element of B equal to 45.3111. The B/A method fails; the LSQ method returns a result which is mostly error, as does the RREF method. The internal COND command fails when applied to this modified A matrix. The SVD command fails to get the smallest singular value right (make sure flag -54 is set when using the SVD for these sorts of problems). Here, the condition number is about 3E15, and even the internal 15 digit calculations can't do the job.

But, the SVD method can still solve the system, since the smallest singular value is discarded for the solution.

My point is, when the condition number of the system is very high, on the order of 10^N, where N is the number of digits used for the calculations, the ordinary methods of solution just don't work. But the SVD method which I described in other posts still works. I even gave an example of a system whose A matrix was exactly singular, and showed that the SVD method gave a reasonable solution.
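
For concreteness, here is a minimal numpy sketch of that rank-reducing idea (the function name and tolerance are choices made for the sketch, not the HP-49 keystroke procedure from the earlier posts): singular values below a relative threshold are simply discarded before back-substituting, which is essentially a pseudoinverse solution.

  import numpy as np

  def svd_solve(A, b, rel_tol=1e-12):
      """Solve A x ~ b, discarding singular values smaller than rel_tol times the largest one."""
      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      keep = s > rel_tol * s[0]            # s[0] is the largest singular value
      s_inv = np.zeros_like(s)
      s_inv[keep] = 1.0 / s[keep]          # invert only the singular values we trust
      return Vt.T @ (s_inv * (U.T @ b))    # x = V * diag(s_inv) * U^T * b

This works even when A is exactly singular: the discarded directions simply contribute nothing to the solution.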

In an earlier post, I suggested:

Try reducing the condition number by a factor of 10 by changing the {1 2} element of the A matrix from 7.2 to 7.1 (in the original system), and the {1 1} element of B to 45.2; the change to B is to keep the exact solution as [[ 1 1 1 1 1 1 1 ]]T. Then solve the system by the various methods and compare results.

Reduce the condition some more, by a factor of about 473, by changing the {6 2} element of A from 7.2 to 7.1, and changing the {6 1} element of B from 24.9 to 24.8, then solve this system by the various methods and compare results.

If one does this, the various methods will be seen to give comparable results (except for the INV(A)*B method, as I explained above). Keep in mind that if you solve a system with a condition number of 10^k (with k<15, let's say), you should only expect to get, nominally, 15-k accurate digits (on the HP49G+; other calculators may use other than 15 digits internally) in your results. This means that you just aren't going to get a very good solution for a problem with a condition number of 3E12 on the HP49G+; if you do, it's an accident, and it doesn't mean that the method which accidentally got a good result is the best method for solving linear systems in general, or for solving systems with high condition numbers. For that, one should use the SVD method.

If you want to get the mathematically exact result (which may not have much validity if it's a series of measurements from a physical system), then you will just have to use a system with more digits, such as the HP49G+ in exact mode, or one of the high accuracy programs on the PC, such as Derive, Mathematica, Maple, etc.

Quote:
Here is a table of determinants and condition numbers for the A matrix with a succession of 1's added to the {1 2} element. These values were computed with Mathematica using very high precision; the HP49G+ in approximate mode can only get a few digits right for the condition number.

<..>

The SVD command fails to get the smallest singular value right (make sure flag -54 is set when using the SVD for these sorts of problems).

Rodger:

- the condition number is usually defined as the product of the column norms of A and INV(A); COND itself will find an estimate for that, which is all that is needed anyway. Using the ratio of the largest and smallest singular value yields a different, but comparable value (meaning it can be used as a measure for the condition, too)- but is much more computationally expensive to compute.

- flag -54 is not referenced in SVD or SVL (it is in RANK, that uses the results of SVL)

Best Regards,
Werner

Werner says:

- the condition number is usually defined as the product of the column norms of A and INV(A); COND itself will find an estimate for that, which is all that is needed anyway. Using the ratio of the largest and smallest singular value yields a different, but comparable value (meaning it can be used as a measure for the condition, too)- but is much more computationally expensive to compute.
- flag -54 is not referenced in SVD or SVL (it is in RANK, that uses the results of SVL)

I don't know about "usually", these days. Some people may "usually" define it that way, but I prefer not to. The MatrixNorm(A,2) function in Mathematica, for example, computes it as the ratio of the largest to smallest singular value (the matrix 2-norm).

As for "computationally expensive", this reason is given in numerous texts that are some years old. In Strang's book while talking about norms and condition number, he indeed says, "There is not time to solve an eigenvalue problem...". Maybe not when he wrote the book, but I don't notice much delay when I hit the enter key until I get the result in Mathematica; and the delay on the HP49G+ is quite tolerable. Computation has become inexpensive. Or, as Prof. Kahan said, "Electrical engineering has triumphed over mathematical analysis"; meaning, computers have become very fast and have lots of memory nowadays. For example, to solve a least squares problem, the method of normal equations is traditionally not recommended because it squares the condition number. But when you have mathematical software that can use hundreds of digits in the computation and still give the result in milliseconds, the squaring of the condition number is not much of a problem.

A condition number estimate is given as Norm(A)*Norm(Inverse(A)), where any norm can be used and you will get an estimate of the CN. If you use the spectral norm, you get the "best" value of the CN; use of the other norms for computing the CN can greatly overestimate the condition number. The use of the spectral norm is "best" in the sense that this condition number gives exactly the maximum "magnification factor" for the A matrix multiplying an arbitrary vector; that's the reason for my preference.

The AUR description of the COND command says: "COND uses the 1-norm and computes the condition number of the matrix without computing the inverse". I wonder just what it does. If you start with the matrix I described whose {1 2} element is 7.2111 (for which DET is calculated as zero and COND returns an infinite result) and swap columns 2 and 4, you now have a matrix whose determinant is no longer calculated as zero and for which COND no longer fails. It almost looks as if the determinant is being computed for use in COND.

The book, "Numerical Methods and Software", by Kahaner, etc.,
section 3.8, "Estimating the condition number", says "Computing A^-1 would be more expensive than solving the system of linear equations in the first place, and so this is undesireable." Waiting a few hundred milliseconds after hitting the enter key is expensive (Mathematica on a fast modern computer) or several seconds (on the HP49G+) is expensive? Give me a break!! I suppose if you are solving systems with thousands of variables it might matter (and even then it only takes a few minutes), but that's not what I'm doing.

They go on to say, "Instead, we will attempt to estimate Norm(A^-1) using far fewer calculations." The method they describe doesn't use the determinant directly, but rather the LU decomposition. I'm guessing that the HP48G probably does this and that's why COND returns an infinite result when the determinant is zero.

Interestingly, the case where the {1 2} element of the A matrix is 7.2111 is the only one where the determinant is computed as zero. If you add more 1's to the 7.2111, the determinant is once again computed as a non-zero value. It stays at about -7.9E-11 for the next 3 1's added, then becomes about -1.19E-10 for the next two, then back to -7.9E-11 for the next one. Total garbage.

As for flag -54, the AUR says on page C-5:

-54 Tiny Array Elements

Clear: Singular values computed by RANK (and other commands that compute the rank of a matrix) that are more than 1*10^-14 times smaller than the largest computed singular value are converted to zero.

In Donald LaTorre's book, "Linear Algebra Investigations with the HP48G/GX", he mentions that RREF gives different results depending on the setting of flag -54. I had not realized that before I read it there. RREF is not a command that computes the rank of a matrix (not explicitly, anyway). Do you know of any other commands that compute the rank of a matrix, implicitly or explicitly?

Anyway, I would rather look at the singular values myself to estimate rank, and I was afraid that there might be some undocumented behavior that would affect calculations involving the SVD, so I just recommend leaving flag -54 set.

I think you have looked at the actual code for a number of functions in the HP48G; have you ever examined the SVD code in detail to see if there is any undocumented behavior?

Rodger, you are right that the condition number using singular
values is the most accurate in that it measures the maximum
magnification. I was merely giving some background info.
It is not necessary to compute the (a?) condition number to 12
digits of precision to estimate how bad a particular matrix is.
As for how 'computationally expensive' that is, of course that does
not apply to 7x7 matrices. When you solve a set of linear
equations, you want an idea of how accurate your results are.
Some problems yield known well-conditioned matrices, some yield bad
ones (and should be avoided), in most cases you don't know and need
an estimate of the condition number. It would not do to compute an
SVD for all these cases, just because 'computers are fast and have
lots of memory'.

COND uses LU factorization, then proceeds to solve a system it
dynamically changes to produce the worst result possible. It does
not use the determinant's value - but if that turns out to be
exactly zero, it means there's a zero on the diagonal somewhere and
linear system solving results in divide-by-zero - hence the
COND 'Infinite Result' message. Perhaps in this case the 'fiddling'
as performed by the 15C would've been appropriate, depending on the
status of flag -54. COND's algorithm is described in
Higham, Nicholas J., "FORTRAN Codes for Estimating the One-Norm
of a Real or Complex Matrix, with Applications to Condition
Estimation", ACM Trans. Math. Soft., Vol. 14, No. 4, Dec. 1988,
pp. 381-396.

Which brings us to flag -54. As I said, RANK uses it, but SVD/SVL
do not. I was not aware RREF used it? (I have never used RREF myself).

Yes, I have looked at quite a bit of the code in detail (at the
time the 49G was developed), there's 20k of sysRPL/Machine Language
code there for the linalg routines, most of it follows LAPACK
pretty closely (like zero-shift QR etc). The 48G contained a bug in
the eigenvalue routines, for a 5-by-5 Hilbert matrix the result of
EGV/EGVL/SCHUR/SRAD was 'Undefined Result'. I corrected it, the 49G
gives the correct result.
The COND routine also had a bug, if you tried to perform COND on
[ [ (1,0)  (1,0)  (1,0)  (1,0) ]
  [ (0,1)  (1,0)  (-1,0) (1,0) ]
  [ (0,-1) (0,-1) (2,0)  (2,0) ]
  [ (1,0)  (1,0)  (2,0)  (1,0) ] ]
the 48 did a warmstart. It, too, has been corrected for the 49G
(though not by me).

Thanks for all the interesting articles,
Werner

Werner says:

Rodger, you are right that the condition number using singular
values is the most accurate in that it measures the maximum
magnification. I was merely giving some background info. It is not necessary to compute the (a?) condition number to 12 digits of precision to estimate how bad a particular matrix is.

Whether it is necessary or not depends on just how accurate one wants the "estimate" to be, doesn't it?

For example, given the matrix:

[[  5  -8  -9  -7  -4 ]
 [  5   5  -8   3   1 ]
 [  3  -2  -6   8  -1 ]
 [  1  -7   9  -5  -2 ]
 [ -9   9  -4  -7  -5 ]]

The condition number computed as the ratio of largest to smallest singular value is ~7.54, whereas COND returns ~20.27. We didn't even get 1 digit of accuracy out of COND. And this is just a matrix returned by RANM; I could construct even worse examples. If this accuracy is acceptable to a particular person, then they should use COND.
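
Anyone who wants to reproduce this comparison off the calculator can do so with numpy (a verification sketch, not a claim about how COND works internally): np.linalg.cond(A, 2) is the spectral condition number, the ratio of the largest to smallest singular value, and np.linalg.cond(A, 1) is the exact 1-norm quantity that the 49G's COND estimates.

  import numpy as np

  A = np.array([[ 5, -8, -9, -7, -4],
                [ 5,  5, -8,  3,  1],
                [ 3, -2, -6,  8, -1],
                [ 1, -7,  9, -5, -2],
                [-9,  9, -4, -7, -5]], dtype=float)

  print(np.linalg.cond(A, 2))   # ratio of largest to smallest singular value
  print(np.linalg.cond(A, 1))   # 1-norm condition number, norm1(A) * norm1(inv(A))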

On the HP49g+, one doesn't have any choice about how many digits are computed for an estimate of condition number; it's 12 digits, period.

As for how 'computationally expensive' that is, of course that does not apply to 7x7 matrices.

And it doesn't apply to much larger matrices than even fit in the HP49G+. It takes 4.7 seconds to compute a full SVD of a 1000x1000 random matrix of 16 digit floating point numbers on my laptop. And if I just want the 2 norm based condition number, that only takes .5 seconds. Of course, even if the human race invents "quantum computers", it still will be possible to find a matrix sufficiently large to overwhelm it. But matrices that just 10 years ago would have been too large to consider computing the SVD routinely are not too large now.

Matrices of the sizes that come up in typical engineering problems are dealt with in fractions of a second on a PC. Even the HP49G+ can quickly solve systems that are so big that you don't really want to type them in. The infamous Longley data set:
http://www.itl.nist.gov/div898/strd/lls/data/Longley.shtml
takes only 30 seconds to compute the full SVD on the HP49G+. Just imagine trying to do that on an HP41. The phrase "computationally expensive" just doesn't mean what it once did.

When you solve a set of linear
equations, you want an idea of how accurate your results are.
Some problems yield known well-conditioned matrices, some yield bad
ones (and should be avoided), in most cases you don't know and need
an estimate of the condition number.

These two sentences seem contradictory to me. If I "...want an idea of how accurate your results are.", how better to determine this than with the condition number? How am I going to avoid ill-conditioned matrices if I don't know whether they are ill-conditioned? And for this I do need to know the condition number. Doing it in all cases is probably a good habit to cultivate.

It would not do to compute an SVD for all these cases, just because 'computers are fast and have lots of memory'.

Why not? It's up to the individual doing the work. If computing the SVD is slowing them down too much, then they can stop doing it. But if they want an accurate condition number, the SVD provides a handy way of getting it. Back in the days when one had to submit a computer program to the university computer center and wait 4 hours for output, it might have been a misuse of computer resources, but that's not how it is today. The computer is mine and causes no impact on anybody else if I compute SVD's. I can't see any reason to suggest that it's socially unacceptable to compute a lot of SVD's, especially when I consider how much useless computing all the PC's in the world do every day. :-)

My point in all these recent posts is that the SVD is a very powerful tool that is not well known just because it has been difficult to compute in the past. But it is not so any longer; it's even a built-in function on the HP48G/49G/49G+ nowadays. And, in particular, it provides a way of dealing with very ill-conditioned systems. Computing and using a lot of SVD's is good exercise for HP49G+ users, and I'm demonstrating all the myriad things you can do with SVD's. If there is another way to do something that can be done with the SVD, I choose to show how to do it with the SVD in these several SVD related posts I've been providing, and oftentimes the SVD is the "best" way to do it.

Which brings us to flag -54. As I said, RANK uses it, but SVD/SVL do not. I was not aware RREF used it? (I have never used RREF myself).

Do you say this because you read it in the manual, or because you examined the code? That you were not aware RREF uses it leads me to think that you haven't examined the SVD code in detail. Unless you can tell me that you have examined the code, I will continue to be wary of undocumented behavior. Anyway, I can't see any reason not to leave flag -54 turned on when using the SVD. As I said, I don't want the calculator deciding for me what the rank is; I want to look at the singular values and decide for myself, so I never use RANK.

The 48G contained a bug in the eigenvalue routines, for a 5-by-5 Hilbert matrix the result of EGV/EGVL/SCHUR/SRAD was 'Undefined Result'. I corrected it, the 49G gives the correct result.

Did this bug only apply to the 5x5 Hilbert matrix, or would it give bad results for other matrices?


The condition number computed as the ratio of largest to smallest singular value is ~7.54, whereas COND returns ~20.27. We didn't even get 1 digit of accuracy out of COND.

Rodger, COND does NOT compute an estimate for the ratio of the largest to smallest singular value, but for the product of the 1-norm of A and INV(A). It gets 12 digits of accuracy in this particular case.

If this accuracy is acceptable to a particular person, then they should use COND.

But it is acceptable. The condition number calculated by COND is an upper bound for the 'real one', but all you need is the magnitude, because that's a measure of how many digits you'll stand to lose in subsequent equation solving.

But if they want an accurate condition number, the SVD provides a handy way of getting it.

But all you need is the order of magnitude. You hardly ever need it accurately?

My point in all these recent posts is that the SVD is a very powerful tool that is not well known just because it has been difficult to compute in the past. But it is not so any longer; it's even a built-in function on the HP48G/49G/49G+ nowadays. And, in particular, it provides a way of dealing with very ill-conditioned systems. Computing and using a lot of SVD's is good exercise for HP49G+ users, and I'm demonstrating all the myriad things you can do with SVD's. If there is another way to do something that can be done with the SVD, I choose to show how to do it with the SVD in these several SVD related posts I've been providing, and oftentimes the SVD is the "best" way to do it.

No argument there, and very interesting reading it was!

(I said:) I was not aware RREF used it? (I have never used RREF myself).

(you said:)
Do you say this because you read it in the manual, or because you examined the code? That you were not aware RREF uses it leads me to think that you haven't examined the SVD code in detail.

? I say this because you mentioned it in your post. I fail to see why NOT having examined/used RREF leads you to think I haven't looked at SVD. Well, I did. And at EGV/EGVL/SCHUR. But not RREF. I also fail to see why you seem to have so much difficulty believing that.

As I said, I don't want the calculator deciding for me what the rank is; I want to look at the singular values and decide for myself, so I never use RANK.

Indeed. flag -54 is always set on mine, too.

Did this bug only apply to the 5x5 Hilbert matrix, or would it give bad results for other matrices?

? Of course. A 5x5 Hilbert was just the test matrix for which it showed up, and led me to examine the code in the first place.

Werner

Rodger, COND does NOT compute an estimate for the ratio of the largest to smallest singular value, but for the product of the 1-norm of A and INV(A).

And that product is an estimate of the "best" condition number. In that sense COND is an estimate of the ratio of the largest to smallest singular value, even though COND's algorithm doesn't involve the singular values directly, because the number COND is estimating, the "best" condition number, is the ratio of the largest to smallest singular value.

Anyway, COND isn't just computing an "estimate" for the product of the 1-norm of A and INV(A); it's doing its best to compute that product exactly. But even if it could compute the product exactly, it would still only be an "estimate" of the condition number. I suppose we could say that every calculation on the HP is an "estimate", but that's not what is meant when the product of the 1-norm of A and INV(A) is called an "estimate" of the condition number.

I consider the ratio of largest to smallest singular values to be an "estimate" of the condition number only because we're using floating point arithmetic and can't compute the "exact" singular values. If we could, we would have the "exact" condition number as the ratio of the largest to smallest singular value, not just an estimate. Since the "best" condition number is that ratio, then the product of the 1-norm of A and INV(A) and all other condition numbers derived from various norms are all estimates of the "best" condition number, the one derived from the SVD.

It gets 12 digits of accuracy in this particular case.

When I speak of the accuracy of COND, I'm not talking about the accuracy of its algorithm; I'm comparing its result to what I believe to be the "best" condition number. The condition number defined by the ratio of the largest to smallest singular values as calculated by the HP49G+ also gets 12 digits of accuracy in this case.

But it is acceptable. The condition number calculated by COND is an upper bound for the 'real one', but all you need is the magnitude, because that's a measure of how many digits you'll stand to lose in subsequent equation solving.

Acceptable to you maybe, but it is not necessarily acceptable to everyone. Infinity is also an upper bound for the "real one"; shall we then use that for an estimate of the condition number? The condition number given by the ratio of the largest to smallest singular value is the least upper bound, a property not shared by the condition number computed with any other norm. For my purposes, an estimate that is 3 times larger than the least upper bound is not acceptable, especially when it's so quick and easy to get the "best" condition number.

But all you need is the order of magnitude. You hardly ever need it accurately?

Some people may only need the order of magnitude, but when I'm trying to demonstrate the various properties of the SVD, it's good to show that there is a condition number that "accurately" predicts the maximum error magnification, and not just within an order of magnitude.

? I say this because you mentioned it in your post. I fail to see why NOT having examined/used RREF leads you to think I haven't looked at SVD. Well, I did. And at EGV/EGVL/SCHUR. But not RREF. I also fail to see why you seem to have so much difficulty believing that.

Perhaps I should have said "have you examined all the code that ultimately uses the results of SVD". You said that you had "...looked at quite a bit of the code in detail." I couldn't tell from that just how much code you had looked at. I'm just striving for clarity here, Werner. I'm not having any difficulty believing you; I just want to make sure that I fully understand what you say.

? Of course. A 5x5 Hilbert was just the test matrix for which it showed up, and led me to examine the code in the first place.

Of course? What you said in an earlier post was, "The 48G contained a bug in the eigenvalue routines, for a 5-by-5 Hilbert matrix the result of EGV/EGVL/SCHUR/SRAD was 'Undefined Result'."

It was not clear to me from your having said this, that bad results would occur for other matrices; clarity is what I seek here.

and you still are right, Rodger.

Apologies for the COND rant, I realized only afterwards that what you meant was of course that the COND result only agreed to 1 digit with the real condition number, and was a poor estimate as such.

Still - for Valentin's matrices, COND's results give you ample warning about the condition.. I wonder if there's a matrix to be found where the result of COND and the ratio of the largest and smallest singular value differ by, say, two orders of magnitude?
(barring the cases where COND returns 'Infinite Result').

You are also right in saying that this is the way it was 20 years ago (which was when I majored in Numerical Analysis, on the SVD),
and that things have changed a bit since then.

Still.. (here I go again): to solve a system of equations,
determine the condition number - if the condition number is small, go ahead and solve it in the normal way, else use SVD. But if you're going to use the SVD to determine your condition number in the first place, why not use it all the time, then? Well, because it's too costly. Not on your own PC, in your own time, of course. But as a general guideline.

By the way, the error in the 48G EGV(L) code occurred while 'chasing the bulge' during triangularisation of the upper Hessenberg matrix. The 'bulge' is introduced by applying the Wilkinson double shift, which then has to be 'chased' away by Householder transforms of order three. The very last transform to be applied is only of order two, and for a 5x5 Hilbert it happens to stumble upon a degenerate case (a zero vector) that is NOT tested, and thus not recognized. Since the vector is divided by its norm further on, that is where the 'Undefined Result' comes from. So, yes, this may surely happen with other matrices, too. But a 5x5 Hilbert matrix is a simple test case.

Incidentally, the Householder transform of order three was also not checked for degeneracy; the 49G code now detects those cases - and the 'unit' cases as well (vectors that are already reduced to their first component, be it zero or not, don't need any transform, and the code returns right away). The more general Householder transforms used in the SVD/SVL subroutines (and in the EGV(L) reduction to upper Hessenberg form) do include that check, even in the 48G.
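
To illustrate the kind of check being described, here is a generic numpy sketch of a Householder reflector with the degenerate and 'unit' cases handled up front (an illustration only, not the 48G/49G code):

  import numpy as np

  def householder(x):
      """Return (v, beta) with H = I - beta * v v^T and H @ x = [+/-||x||, 0, ..., 0]."""
      x = np.asarray(x, dtype=float)
      v = x.copy()
      normx = np.linalg.norm(x)
      if normx == 0.0 or np.all(x[1:] == 0.0):
          return v, 0.0                     # zero or already-reduced vector: no transform needed
      v[0] += np.copysign(normx, x[0])      # choose the sign that avoids cancellation
      beta = 2.0 / (v @ v)
      return v, beta

  v, beta = householder(np.array([3.0, 4.0, 0.0]))
  H = np.eye(3) - beta * np.outer(v, v)
  print(H @ np.array([3.0, 4.0, 0.0]))      # ~ [-5, 0, 0]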

For the SCHUR decomposition, that also fails for the 5x5 Hilbert matrix (for the reason explained above), but it works for the same matrix multiplied by (1,0), turning it complex. That's because the complex QR-algorithm uses a different (single) shift strategy, not a double shift as in the real case, and the degeneracy does not occur. The same trick cannot be applied to EGV/EGVL, because those commands recognize that they're dealing with a real matrix and perform the real algorithm anyway.

Best Regards,

Werner Huysegoms

Werner said:

Still.. (here I go again): to solve a system of equations, determine the condition number - if the condition number is small, go ahead and solve it in the normal way, else use SVD. But if you're going to use the SVD to determine your condition number in the first place, why not use it all the time, then? Well, because it's too costly. Not on your own PC, in your own time, of course.
But as a general guideline.

And..., here I go again: as I mentioned, my laptop computed the full SVD of a 1000x1000 matrix in 4.7 seconds. It would probably take days to collect and input the data for a real problem, which would likely not be even that big, and I should worry about 4.7 extra seconds? Even if my boss is paying for my time, he can't complain about that.

If computing a lot of SVD's is dominating my number crunching, and I really don't need it, then I would stop doing it.

But I don't see anything wrong with making a habit of solving systems with the SVD; if you do, then you get the accurate condition number along with the solution. If it's taking too long, then revert to traditional methods. And sometimes you should use the SVD even if your computer could get an accurate solution without it, as in the problem I describe below.

There was a famous paper published in the Journal of the American Statistical Association in September 1967, titled "An Appraisal of Least Squares Programs for the Electronic Computer from the Point of View of the User", by James W. Longley.

Longley described a set of data (he worked for the Census Bureau, I think) for which he wanted to compute a least squares solution. He found that most of the computer programs available at the time didn't do a very good job. Many of them only got a few digits right, and some got no correct digits. Interestingly, the HP49G+ can get 11 correct digits in a few seconds using the LSQ function. Longley used a desk calculator to get an accurate many-digit solution. I imagine that took a long time. :-(

A subsequent paper appeared in the Journal of the American Statistical Association in March 1976, titled "The Acceptability of Regression Solutions: Another Look at Computational Accuracy", by Beaton, Rubin and Barone.

In this paper, the authors point out that several of the columns are highly correlated and should be dropped from the regression. This is a tactic often used in the past. In fact, for difficult-to-solve systems, an often-used technique is (was?) to perform the regression using all possible combinations of columns; talk about a time waster! But the SVD technique provides a better solution, and can go right to it in one iteration. I'll discuss it in another post.

The problem data are available at the NIST site:
http://www.itl.nist.gov/div898/strd/lls/data/Longley.shtml

The condition number of this problem is not that large (it's 23E6) for the HP49G+, which can easily obtain the full rank solution, but there are good reasons for using the SVD method to eliminate some of the useless basis vectors from the solution.

And here is another very good reason to know about and use the SVD based condition number; COND only works for square matrices. If you want to know the condition of a least squares problem, COND won't help you, although I guess you could compute the COND of TRANSPOSE(A)*A (which may lead to numerical problems) and take the square root.

By the way, as you add more trailing 1's to the {1 2} element of Valentin's matrix, causing the matrix to become more and more nearly singular, the HP49G+ begins to return inaccurate values for the SVD based condition number because the smallest singular value is incorrect (COND also becomes more inaccurate). But the rank-reducing method of solving the system continues to work because the smallest singular value is not used in the solution. So even though the condition number is huge, the SVD solution method can still give good results.

I wonder if there's a matrix to be found where the result of COND and the ratio of the largest and smallest singular value differ by, say, two orders of magnitude? (barring the cases where COND returns 'Infinite Result').

Very interesting question. I'll have to think about this.