35s vs. 33s speed for identical programs



#2

I wrote a program for my 33s that computes area moment of inertia and centroid for composite cross-sections based on n rectangular elements.

I keyed this in on my new 35s, and was surprised to find the 35s slower. I'm an engineer, but know very little about electronic hardware.

Any thoughts?

If anyone would like a program listing of it, I can post later.

ECL


#3

The 35s is slower than the 33s in some areas.

This is discussed in the free 35s review from Datafile, which you can find on hpcc.org.

It appears that there is a speed penalty for some looping and also when numbers are entered on the stack, possibly due to the new vector and complex data types (37 bytes put on the stack, compared to 12).

#4

Someone else mentioned this in an earlier thread--and it was suggested that the additional features (vectors, complex) may increase the overhead demands on the processor.

#5

Yes please post your program. Thanks


#6

Here is the code listing (as requested) for the area moment of inertia and horizontal neutral axis for a composite body that is discretized with rectangles:

I001 LBL I

I002 SF 10

I003 EQN IXX and Neutral Axis **enter the text as an eqn**

I004 CF 10

I005 CLVARS

I006 INPUT N

I007 INPUT B
(note: I have omitted line numbers from here on)

INPUT H

RCL H

2

/

RCL +T

RCL B

RCL * H

*

RCL +Q

STO Q

RCL B

RCL H

*

RCL +R

STO R

1

RCL +V

STO V

RCL Q

RCL / R

RCL - C

X^2

RCL *A

RCL + I

STO I

RCL T

RCL H

2

/

+

RCL Q

RCL / R

-

X^2

RCL *B

RCL *H

RCL H

3

Y^X

RCL *B


12

/

+

RCL + I

STO I

RCL Q

RCL / R

STO C

RCL B

RCL * H

RCL + A

STO A

RCL T

RCL +H

STO T

RCL V

RCL N

X>Y?

GTO I007

I= **This is the value of composite Ix for your cross-section**

C= **This is the location of Xbar, ie the X neutral axis**

RTN

LN=220
CK=704D

Here is an example for an I-beam:

XEQ I

ENTER

**screen will read: IXX AND NA (hit R/S to continue)**

N=3 **number of sections**

R/S

B=5 **width of bottom flange**

R/S

H=0.25 **height of bottom flange**

R/S

B=0.5 **width of web**

R/S

H=8 **height of web**

R/S

B=3 **width of upper flange**

R/S

H=0.25 **height of upper flange**

R/S

(Program halts to VIEW the area moment of inertia, units length^4)

R/S

(Program halts to VIEW the location of the x centroidal axis)

R/S

The result should be

I = 54.6660 (units = length^4)

C = 3.9063 (units = length, measured from base)

Enjoy! I realize that this may now be shortened a bit, particularly in light of the new programming capabilities gained with the 35s. In progress!


#7

Can you edit your post to put [pre] [/pre] around the code listing? This will shorten the listing and make the code more readable. You can also put in the occasional line number (say, every tenth one) to help readers check for entry errors.

Les

Edited: 8 Aug 2007, 4:05 p.m.


#8

I think this is more readable. I transcribe my programs in a spreadsheet. Makes line numbering easy.

I001  LBL I
I002 SF 10
I003 IXX AND NEUTRAL AXIS
I004 CF 10
I005 CLVARS
I006 INPUT N
I007 INPUT B
I008 INPUT H
I009 RCL H
I010 2
I011 /
I012 RCL+ T
I013 RCL B
I014 RCL * H
I015 *
I016 RCL+ Q
I017 STO Q
I018 RCL B
I019 RCL H
I020 *
I021 RCL+ R
I022 STO R
I023 1
I024 RCL+ V
I025 STO V
I026 RCL Q
I027 RCL/ R
I028 RCL- C
I029 X^2
I030 RCL* A
I031 RCL+ I
I032 STO I
I033 RCL T
I034 RCL H
I035 2
I036 /
I037 +
I038 RCL Q
I039 RCL/ R
I040 -
I041 X^2
I042 RCL* B
I043 RCL* H
I044 RCL H
I045 3
I046 Y^X
I047 RCL* B
I048 12
I049 /
I050 +
I051 RCL+ I
I052 STO I
I053 RCL Q
I054 RCL/ R
I055 STO C
I056 RCL B
I057 RCL* H
I058 RCL+ A
I059 STO A
I060 RCL T
I061 RCL+ H
I062 STO T
I063 RCL V
I064 RCL N
I065 X>Y?
I066 GTO I007
I067 VIEW I
I068 VIEW C
I069 RTN
I added the VIEW commands (lines I067 and I068).
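For anyone who wants to check a keyed-in copy without the calculator, here is a rough Python sketch of what the listing appears to do: it keeps a running centroid and shifts the accumulated moment of inertia to the new centroid each time a rectangle is added (parallel-axis theorem). The function name and variable names are made up for illustration; the lowercase names mirror the calculator registers T, Q, A, C and I as read from the listing above.

```python
def composite_section(rects):
    """Return (I, C) for rectangles stacked bottom to top.

    rects is a list of (b, h) pairs. I is the area moment of inertia about
    the horizontal centroidal axis; C is the centroid height from the base.
    """
    t = 0.0  # T: height of the stack so far (bottom of the current element)
    q = 0.0  # Q: running first moment of area about the base
    a = 0.0  # A: area of the elements already processed
    c = 0.0  # C: centroid of the elements already processed
    i = 0.0  # I: moment of inertia about the current centroid
    for b, h in rects:
        area = b * h
        y = t + h / 2                    # centroid height of this element
        q += area * y
        c_new = q / (a + area)           # Q / R in the listing
        i += a * (c_new - c) ** 2        # shift the old composite to the new axis
        i += area * (y - c_new) ** 2 + b * h ** 3 / 12  # add this element
        c = c_new
        a += area
        t += h
    return i, c


# I-beam from the worked example: bottom flange, web, upper flange.
i_beam = [(5, 0.25), (0.5, 8), (3, 0.25)]
ixx, cbar = composite_section(i_beam)
print(ixx, cbar)
```

The printed values can be compared with the I and C figures given in the I-beam example; the element order matters because each rectangle is assumed to sit directly on top of the previous one.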


Edited: 9 Aug 2007, 8:18 a.m.

#9

Even though someone has provided a 35s-optimized version, the HP-35S is about 2.5 times slower than the HP-33S in Xerxes' benchmark. When running the HP-32S/SII/33s version, the HP-35s is even slower. It appears that every time new features are added to the same hardware, there is a decrease in performance (HP-32S -> HP-32SII, HP-48G -> HP-48GX, for instance).

Quote:
If anyone would like a program listing of it, I can post later.

It would be great. Better yet you might want to submit it to the Software Library (HP-33 section), where it would be permanent and easier to find. BTW, once I wrote a similar program for the HP-49G/G+. It defaults to Portuguese but there's a command to change all messages and screens to English:

http://www.hpcalc.org/details.php?id=4446

Gerson.


#10

Someone seriously needs to do a tear-down on this puppy and see if it is underclocked, and if so, why. It might be possible to bump it.

Just a thought.


#11

I was expecting the HP-35s to be slower than the HP-33s, but not that much slower. I would say underclocking makes a bit of sense, but it seems the battery life expectancy is the same in both specs...


#12

I found that the 41/42 benchmark was a little faster than the 33S. I have a little trig program that doesn't do much math and it is almost in lock step when run side by side with the 33S. I'm guessing my little tiny amount of math programming is not complex enough to show the true difference.

I have a little stat based program I think I will see if I can convert to see the differences there if any. That'll be a pain to convert as I relied heavily on the two index pointers, I&J. I'm off for another week due to arm surgery so I need something to take my mind off Daytime TV. I wrote it first for the 41C and it is very slow there. But I could not use the built in deviation command on the 41C because my number of samples and the possible deviation of samples was too small and I had to do it longhand. It's not bad on the 35S.

I don't use mine for complicated calculations so I don't see the speed loss to the degree others may.

Edited: 7 Aug 2007, 11:57 p.m.


#13

Quote:
I found that the 41/42 benchmark was a little faster than the 33S.

I hadn't remembered it was you who wrote the 35s version, despite the recent thread about the 35s benchmark test. Sorry!

Quote:
I have a little trig program that doesn't do much math and it is almost in lock step when run side by side with the 33S.

Isn't it the other way around? The program in the link below, which contains mostly basic operations like multiply and divide, runs much slower on the HP-35s.

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv017.cgi?read=118993

Regards,

Gerson.

Edited: 8 Aug 2007, 12:33 a.m.


#14

It should be, but my particular program is two little calculations and 90% screen output, so the speed difference is lost under the display outputs and pauses. The stat program has a nice lump of number crunching done on four sets of data, so the speed difference will definitely show.

One thing about processing speed is that it loses significance if the machine spends its time waiting for operator interaction. The more interactive the application, the less the speed factors in.

Quote:
Isn't it the other way around? The program in the link below, which contains mostly basic operations like multiply and divide, runs much slower on the HP-35s.

http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv017.cgi?read=118993

Regards,

Gerson.


Edited: 8 Aug 2007, 9:34 a.m.

#15

Hi Gerson!

Quote:
It appears that every time new features are added to the same hardware, there is a decrease in performance (HP-32S -> HP-32SII, HP-48G -> HP-48GX, for instance).

I'll admit that I'm not really an HP-48G expert, but from what I've read, the reason that the HP-48GX is slower than the HP-48G, in some situations anyway, is because the GX's additional memory requires bank switching to access. As long as you keep your code within the 48G memory space (I guess that means port 1?), the two should be identical in terms of speed.

OT: 15 years ago, I experienced a similar phenomenon when I upgraded my Macintosh II by adding a 68851 PMMU chip. The upgrade meant that programs could work in a 32-bit address space, as opposed to the 24-bit address space of the original Mac II -- but the PMMU's address translation overhead meant that some memory-intensive programs were significantly slower after the upgrade than before.

If the HP-35s hardware is more or less the same as the HP-33s', then the slowdown that people have been mentioning here is probably due to a simplistic implementation of the Complex and Vector types. The HP-42S, by comparison, can deal with non-real types with little or no speed penalty, but it uses a more sophisticated memory management scheme, where objects (real, complex, or matrix) are used by reference instead of by value -- this is a bit trickier to implement, but much more efficient; it only uses as much memory as is actually needed for each type of value, and it can perform copy operations by copying just a pointer, rather than having to copy dozens of bytes for the whole value each time.
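As a back-of-the-envelope illustration of the by-value versus by-reference point, the sketch below counts the bytes written for one stack lift on a four-level stack. It is only a model: the 12 and 37 byte entry sizes are the figures quoted earlier in this thread, while the 4-byte pointer size and the idea that a lift rewrites every level are assumptions made for the example, not documented firmware behaviour.

```python
ENTRY_BYTES_33S = 12   # per stack entry, figure quoted earlier in this thread
ENTRY_BYTES_35S = 37   # per stack entry, figure quoted earlier in this thread
POINTER_BYTES = 4      # assumption: illustrative reference size only
STACK_LEVELS = 4       # X, Y, Z, T


def bytes_written_per_lift(entry_bytes, levels=STACK_LEVELS):
    """Bytes written by a naive stack lift.

    Assumes X->Y, Y->Z and Z->T are copied and a new value is then written
    to X, so `levels` entries are written in total.
    """
    return levels * entry_bytes


by_value_33s = bytes_written_per_lift(ENTRY_BYTES_33S)   # 48 bytes
by_value_35s = bytes_written_per_lift(ENTRY_BYTES_35S)   # 148 bytes
by_reference = bytes_written_per_lift(POINTER_BYTES)     # 16 bytes

print(by_value_33s, by_value_35s, by_reference)
```

Under these assumptions a by-value lift with the larger entries moves roughly three times as many bytes as before, while a by-reference stack would move the same small amount regardless of the object type.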

I suppose the HP-35s designers were more concerned with functionality than performance, which is probably reasonable... If they had been able to use a Saturn (or an ARM-based Saturn emulation), they could have used the HP-42S code, and no one would have complained about performance at all.

Did I mention I think they should bring back the 42S? ;-)

- Thomas


#16

God dag, Thomas,

thanks for the information! I was thinking about the memory organisation in the 42S just yesterday and *guessed* it must be by reference, because all those different data types could be held in simple registers and on the stack. Now I *know*, thank you.

Regards, Walter


#17

Guten Tag Walter!

I don't remember offhand where I found out about the HP-42S memory management -- I think it was either mentioned in the manual or in the Programming Examples and Techniques book. No details, mind you, just the general idea of using references and copy-on-write.


You can easily see the effect of this architecture by observing what happens when you recall a large matrix to the stack (MEM drops by only a few bytes, regardless of the size of the matrix) and then multiply it by 1 (MEM drops by slightly more than rows*columns*8 bytes (twice that for a complex matrix)). If there isn't enough memory to allocate the new matrix, this means that the multiplication will fail with an Insufficient Memory error message.
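The MEM behaviour described above can be mimicked with a toy model. This is a minimal sketch, not the HP-42S implementation: the `Memory` class, the 4-byte reference size and the 7000-byte starting figure are invented for illustration, and the per-object overhead that makes the real drop "slightly more" than rows*columns*8 is ignored.

```python
POINTER_BYTES = 4   # assumed size of a reference; illustrative only
ELEMENT_BYTES = 8   # per real element, from the rows*columns*8 estimate above


class Memory:
    """Tracks free bytes and fails when a request cannot be met."""

    def __init__(self, free_bytes):
        self.free = free_bytes

    def allocate(self, n):
        if n > self.free:
            raise MemoryError("Insufficient Memory")
        self.free -= n


def recall(mem, stack, matrix):
    """Push a reference to an existing matrix; the data is not copied."""
    mem.allocate(POINTER_BYTES)
    stack.append(matrix)


def multiply_by_one(mem, stack):
    """Replace the top of the stack with a freshly allocated result."""
    source = stack[-1]
    rows, cols = len(source), len(source[0])
    mem.allocate(rows * cols * ELEMENT_BYTES)
    stack[-1] = [[x * 1 for x in row] for row in source]


mem = Memory(free_bytes=7000)
stored = [[0.0] * 20 for _ in range(20)]  # a 20x20 real matrix in a variable
stack = []

recall(mem, stack, stored)
after_recall = mem.free        # 6996: only the reference was charged
multiply_by_one(mem, stack)
after_multiply = mem.free      # 3796: a full 20*20*8 = 3200 byte copy was made
print(after_recall, after_multiply)
```

In this model a second multiplication would still fit, and a third would raise the Insufficient Memory error, which matches the failure mode described in the post.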

- Thomas

#18

Quote:
Did I mention I think they should bring back the 42S? ;-)

I'd be pleased with Free42-in-a-box :-)

Regards,

Gerson.

#19

My friends, I take it you have never heard of Gates' Law (or maybe you forgot about it)? It means that hardware seems to run slower even though the processor, memory, etc. are faster. It is all due to programming, and it normally happens in three steps.

In the first step, the program is written in assembler. You have to be really dumb to write slow assembler code; it can be done, but it is very, very hard because assembler is very efficient.

In the next step, applications are written in some other language that has more overhead (like C on a computer). They use more memory than assembler because of reuse of processor resources, subroutines, etc. Programs are larger because more memory is available.

Step three introduces a new programming language (C++ or C#) with even less efficiency because of excessive reuse of objects and more overhead operations to manage, and it gobbles up even more memory and processor time. The programmer does not care, though, because he knows he has more memory and a faster processor to work with, so this is okay to him (or her... sorry, ladies).

The reason developers go through these steps, though, is to give us more features. Take a 386 computer running DOS 6 and a Pentium 4 running Vista "double dog" ultimate version with "flux capacitor". Does the Pentium 4 computer boot up faster than the 386 running DOS 6? No! The 386 boots up much faster, but it also has far fewer features. Now we must figure out what we want. Do we want more features, or the fastest-booting machine? The same is true with calculators. Do you want a 48GX with more memory, better graphics, and more prompts that takes longer to get to a solution, or do you want a simpler calculator that is a little harder to use but very fast at getting you an answer because it is managing less overhead?

For example, last night I loaded a program on my 15C that calculates the roots of a polynomial. I can also do this on the 48GX. With both units starting from off, it is quicker to find the roots on the 15C than on the 48GX (assuming I have already entered the program on the 15C and I am at the input screen on the 48GX). On the 15C, I enter values into the stack and hit R/S; on the 48GX, I have to manipulate the input screen, navigate a few times, and then press solve. Many more keystrokes and more overhead make the 48GX all work and no play (and slower). ;)


#20

Quote:
You have to be really dumb to write slow assembler code; it can be done, but it is very, very hard because assembler is very efficient.

In the next step, applications are written in some other language that has more overhead (like C on a computer).


Actually, it's very easy to write inefficient assembler code. Assembler is not efficient; it just makes it possible to write efficient code. In my experience, a good C compiler generates better code than an "average" assembler programmer. To write really good assembler code, you have to really understand the machine and know all the "tricks".

Been there, done that.

Stefan


#21

This is an interesting point and very useful.

#22

As someone who used to write code generators for high level languages (most notably, Ada), I can vouch for Stefan's point. Assembly is *potentially* the most efficient language, but it is very easy to write crummy assembly language code.

If you are solidly familiar with the target device, then you learn how to optimize and write good assembly code. The result can be very fast and very efficient. But it's all too easy to get complacent, or lose track of the big picture (especially in big programs) and that will result in poor code.

A well-written code generator for almost any high level language will usually outperform hand-written assembly code. Note I stress "well-written". A crummy code generator implementation is about as bad as crummy hand-coded assembly. Compilers have been getting better and better over time, though, and most modern HLL compilers will easily produce great code.

Um, now is anyone going to write a C cross-compiler for the 35s? ;-)

thanks,
bruce


#23

Problem is, most programmers coming out of university nowadays have never even touched assembler code, and many universities do not teach how to write well-thought-out code. Programmers then write sloppy high-level code.

Best programmers that I have seen have been those who are familiar with assembly, and also know high level language.

Regardless of this, I think the summary of what I was trying to say above is that HP is building more overhead into the calculator, and thus the calculator is slowing down even though the hardware is faster... thus Gates Law.

#24

Someone somewhere said (maybe even elsewhere in this very Forum):


As time passes, software becomes slower much faster than hardware becomes faster.


#25

Google pointed me to...

http://www.seas.upenn.edu/~gaj1/shiftgg.html

THE COMING SOFTWARE SHIFT
BY
GEORGE GILDER

which contains this quote:

In software, complexity has long been rising exponentially, while
power has been rising additively. In response, Niklaus Wirth, the
inventor of Pascal and other programming languages, has propounded two
new Parkinson's Laws for software: "Software expands to fill the
available memory," and "Software is getting slower more rapidly than
hardware gets faster."

Ren
dona nobis pacem


#26

Quote:
In software, complexity has long been rising exponentially, while power has been rising additively. In response, Niklaus Wirth, the inventor of Pascal and other programming languages, has propounded two new Parkinson's Laws for software: "Software expands to fill the available memory," and "Software is getting slower more rapidly than hardware gets faster."

Speaking for myself, that's not my experience at all! Software development tools, Microsoft Office, image manipulation software, all run much faster on my current 1.4 GHz P4 laptop, than their primitive 1980s versions ever did on my (then top-of-the-line) Macintosh II. Even if it wasn't for things like Eclipse and DVD ripping/transcoding software, that simply won't run at all on the PCs of yesteryear, I still wouldn't want to go back. I enjoy nostalgia as much as the next person, but progress rocks! :-)

- Thomas

#27

Gerson,

Will do. I ported it already to my 35s (and took advantage of the GTOxyz capability to eliminate the use of flags).

I'd like to try to optimize it a bit too, but...

I may have gotten my inspiration from your RPL code on hpcalc.org back in 2005 for this program!

I was frustrated by the keyboard on my 49g+, and decided to write a single-register program on my 33s to give me Ixx and centroidal info.

ECL


#28

Nice work and well fitted to this particular need! Easier to enter than { 0 0 5 .25 1 2.25 .25 2.75 8.25 1 1 8.25 4 8.5 1 } :-)
But the answers agree with your program (54.666015625 and 3.90625 on the 50g, and 54.6660156249 and 3.90625 on the 35s).

I wrote the first version of the program out of necessity the night before an examination. Well, sort of, as I misunderstood the professor's statement: we had to write a report about the programs we intended to use during the examination, not write them ourselves... The RPL code is far from optimized, but I won't get back to it because the HP-50g is fast enough. Besides, I don't have to use it anymore :-)

Regards,

Gerson.

#29

Calculator Benchmark List

It is interesting to find that the HP 50g (User RPL) is not faster than the C64 (1982), an 8-bit machine with a 6510(?) CPU at 1 MHz running interpreted BASIC.

I don't know at what speed the 50g runs (ARM processor?). But if I hadn't seen it here, I would have believed the 50g to be much faster than a C64 if somebody had asked me.

Isn't the 33s / 35s processor a 6502 clone, and thus a close relative of the C64's CPU? (OK, the architecture surrounding the CPU certainly plays a decisive role.)


#30

An extract of the QueenBench:

- 90.3  HP-50G  User RPL / 75 MHz
- ~67   HP-50G  User RPL / Fast Mode x1.3 (75->203 MHz)
- ~64   C64     Basic / 1 MHz

There are three reasons for the speed of the C64. The BASIC in the ROM is a very light interpreter and not as complex as User RPL. The ARM has to emulate the Saturn CPU instructions. And the 6502 instructions need fewer cycles than those of most other CPUs of its time, and even of later ones, which makes it fast in comparison.

Assuming that all CPUs are clocked at 1 MHz, the QueenBench in assembly language would produce approximately the following results:

- SC61860  611 msec
- LH5801   397 msec
- HD61700  284 msec
- Z80      283 msec
- SC62015  262 msec
- 80188    251 msec
- 68000    220 msec
- 6502     100 msec
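To put those timings on a common scale, a short Python sketch can express each CPU as a multiple of the 6502 figure. The numbers are copied from the list above; the dictionary and function names are just for illustration.

```python
# Assembly-language QueenBench timings at a common 1 MHz clock, in
# milliseconds, copied from the list above.
TIMES_MS = {
    "SC61860": 611,
    "LH5801": 397,
    "HD61700": 284,
    "Z80": 283,
    "SC62015": 262,
    "80188": 251,
    "68000": 220,
    "6502": 100,
}


def slowdown_vs(reference, times=TIMES_MS):
    """Return each CPU's time as a multiple of the reference CPU's time."""
    base = times[reference]
    return {cpu: ms / base for cpu, ms in times.items()}


ratios = slowdown_vs("6502")
for cpu, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{cpu:8s} {ratio:.2f}x")
```

On these figures the Z80 takes about 2.8 times as long as the 6502 per clock, and the SC61860 about 6.1 times as long.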

O.T. This was one reason why the 6502 was so popular for chess computers.

