HP Forums

Full Version: 12c: Anyone have a good way to program this? :-)
You're currently viewing a stripped down version of our content. View the full version with proper formatting.

http://www.xycoon.com/dispersion.htm

It is the "coefficient of dispersion".

Given that it needs the median, this appears problematic for the HP12c.

I would think it requires:

1) keeping all individual data values,
2) sorting them,
3) finding the median
4) then going through all the data values to compute the C o D.

That seems like a real pain. Any suggestions?

if it must be the median, you have to store (at least) half of the input values. otherwise you can't be sure to have the n/2'th largest.

Yep. That's the problem!

I'm pretty sure that the cash flow registers could be used, but to get the median, some sorting would have to occur, and on the HP12c, that's just painful.

And, the number of data points would be variable, so you can't hard code for 10 of them each time.

Just wondering if anyone saw a way to do this without this much pain/suffering.
Gene

Hi Gene, folks;


maybe I'm wrong, I have once seen it and don´t remember where, but isn´t it numerically equivalent to

1/r
(inverse of the correlation coefficient)? If so, it is easier to compute through summations and linear regression.

As I said, maybe I'm wrong.

Cheers.

Luiz (Brazil)

Why not use the ratio between the standard deviation and the mean? That should give you a good feel for the dispersion of data.

As the other posts indicated, getting the median requires the storage and sorting of data. Maybe that's why statisticians use the mean more than the median, since you can accumulate data using a sequence of single observations (storage of ALL the data is not mandatory).

Namir

Luiz: I'll check on 1/r.

Namir: I know there are plenty of measures of dispersion, but in this case, the individual wants this specific one, the formula for which requires the median.

I just wanted to make sure I wasn't overlooking something. :-)

It's not too bad if you determine the median outside of the 12c and input it separately. But storing individual values and sorting them and then evaluating each of them for a variable number of data points? Yuck!

Gene,

I suggest using Excel to enter sample data (one or more sets) and then do the following:

1) Sort the column of data
2) Determine the mean value
3) Calculate CD
4) Use Excel built-in functions to calculate the ratio of the standard deviation and the mean.
5) Compare the values of the CD with sd/mean

You can test the above with several sets and see how the two statistics compare with each other.

Gene,

I guess with a 12c you are in big trouble... You can't escape the median, and doing it on the 12c is not practical at all.

About the previous suggestions:

a) the coefficient of dispersion is not 1/r (sorry Luiz).

b) ratio between the standard deviation and the mean measures the same thing (actually, that is what is generally termed coefficient of dispersion), but it is not robust to outliers in the data. That is the advantage of using the median.

I guess you either have to compute the median outside the machine or upgrade to a 15c!

Sorry for the bad news.

Joao

Gene,
Darn, I missed this. You can use Katie's bubble sort (published here in Oct 2003) for the 12C. Only 21 lines. Adding another 20 or less will give you the median. Maybe 15 more for the average of the absolute deviations - but it's more fun, and more "robust" (when we suspect outliers in an otherwise "normal" data set - see www.rsc.org - they have a technical bulletin on this - a google on "robust mad rsc" brings it up) to actually overwrite the data with their absolute deviations then re-run the bubble sort to find the median (rather than use the average). Should be possible in 57 lines total - leaving room to store 14 data points! Almost useful, but I'm too late, but as always it was a very interesting question. Cheers, Tony

Hi, Gene:

Gene posted:

"I would think it requires:
1) keeping all individual data values, 2) sorting them, 3) finding the median 4) then going through all the data values to compute the C o D. That seems like a real pain.

Any suggestions?"

    Yes, two:

    1. Sorting provides much more information than is required to find the median, i.e.: there are ways of finding the median without doing a full sort first.

    2. Anyway, if you opt for sorting, you don't need to perform a sort on the stored values as a separate process. Matter of fact, you also need to input the values themselves when running this program, so the best strategy is to mix input and sort, i.e., to sort the values on the fly as you're inputting them: after requesting each next element, just simply insert it at its proper place at once. This way, as soon as the last element is entered, they're all already sorted out and finding the median is immediate. A simple final loop finishes the computation.

    All of this is perfectly implementable in a few steps, and as the time required for the sort is distributed among the time required to input each element, it would feel faster to the user.

Best regards from V.

Hi Valentin!

I'm certain that I'm not the only one who would really love to see what you can come up with (I already know Tony would too).

You always have a way of finding unique, amazing solutions to problems like this and it would be very helpful/instructive to me (and others I'm sure) to see what you'd do with this problem.

I have seen Tony's approach and it does about what I'd expect ( but certainly more optimized than I'd come up with)...it sorts, it uses the CF registers, etc.

But, I have no doubt a non-sorting approach would be very amazing. Please share? :-)

If we're talking about the good old basic 12C, I'm going to go out on a limb and suggest that it's going to be nearly impossible to do better than the bubble sort program that I wrote with picking the middle element after the sort completes.

Insertion sort during input as Valentin suggest is no better than bubble sort as far as overall time is concerned and it might be more annoying to the user to have to wait an increasingly long time between entries.

There are linear-time median algorithms but all require at least O(log(n)) additional memory and a good amount of programming complexity in that you'd have to simulate recursion.

Katie,

For fun, I took your 12C 21 liner below and changed it to do an
insertion sort, appended below.

I needed 5 extra lines and 1 extra register. One could say
this requires 33/21 (approx pi/2) extra resources :-)

Insertion sorting does seem more efficient, even on the 12C.
For example sorting 12 numbers takes about a minute via bubble
sort, whether the numbers are already pre-sorted in ascending
or descending order. The insertion sort is assymetric in that
it takes 12 seconds if the numbers are pre-sorted in ascending
order, and the full minute if they are pre-sorted in
descending order. It appears that the speed/resource trade-off
works in favour of the insertion, even on the 12C.

Personally I still really like your Bubble sort - for me it is
a classic 12C program, and really made me see a whole
new dimension hidden in the 12C.

----------------

Bubble Sort on HP12C

Posted by Katie on 26 Oct 2003, 11:11 p.m.

It's easy to use, just fill up registers 0 - n with your data
and call the program with n in the x register.

Keystrokes |Display | Comments
[f][P/R] | |
[f]CLEAR[PRGM] |00- |
[STO][i] |01- 44 12 | i = maximum register
1 |02- 1 |
[STO][n] |03- 44 11 | A: n = 1
[RCL][g][CFj] |04- 45,43 14 | B: y = A(n)
[RCL][g][CFj] |05- 45,43 14 | x = A(n-1)
[g][x<=y] |06- 43 34 | if (x <= y)
[x<>y] |07- 34 | swap x,y
[g][CFj] |08- 43 14 | A(n-1) = x
[RDN] |09- 33 |
[g][CFj] |10- 43 14 | A(n) = y
[RCL][i] |11- 45 12 |
[RCL][n] |12- 45 11 |
1 |13- 1 |
[+] |14- 40 | n = n+1
[g][x<=y] |15- 43 34 | if (n <= i)
[g][GTO]03 |16- 43,33 03 | goto B
2 |17- 2 |
[-] |18- 30 | i = i-1
[g][x=0] |19- 43 35 | if (i = 0)
[g][GTO]00 |20- 43,33 00 | return
[g][GTO]01 |21- 43,33 01 | else goto A
[f][P/R] | |
----------------------------

Insertion sort. 26 lines plus 1 extra register (PMT)

Keystrokes |Display | Comments
[f][P/R] | |
[f]CLEAR[PRGM] |00- |
[STO][PMT] |01- 44 14 | PMT = maximum register
[CLx] |02- 35 |
[i] |03- 12 |
[RCL][i] |04- 45 12 |
1 |05- 1 |
[+] |06- 40 |
[STO][i] |07- 44 12 | 1+i=sublist length
[STO][n] |08- 44 11 |
[RCL][g][CFj] |09- 45,43 14 | B: y = A(n)
[RCL][g][CFj] |10- 45,43 14 | x = A(n-1)
[g][x<=y] |11- 43 34 | if (x <= y)
[g][GTO]22 |12- 43,33 22 |
[x<>y] |13- 34 | swap x,y
[g][CFj] |14- 43 14 | A(n-1) = x
[RDN] |15- 33 |
[g][CFj] |16- 43 14 | A(n) = y
[RCL][g][CFj] |17- 45,43 14 | decrement n
[RCL][n] |18- 45 11 |
[g][x=0] |19- 43 35 | finished?
[g][GTO]22 |20- 43,33 22 | do next sublist
[g][GTO]09 |21- 43,33 09 | continue
[RCL][i] |22- 45 12 | test sublist length
[RCL][PMT] |23- 45 14 |
[g][x<=y] |24- 43 34 |
[g][GTO]00 |25- 43,33 00 | finished.
[g][GTO]04 |26- 43,33 04 | sort next sublist
[f][P/R] | |

Cheers,
Tony