HP-35 hardware basic design

The purpose of this paper is to present the basic hardware design of the HP-35.

The HP-35 is a bit-serial calculator organized to process 14-digit BCD words.

To introduce this concept clearly (it was not new in 1972), I will refer the reader to the text of the only patent concerning the HP-35: US Patent N° 3,781,820, granted to D. Cochran and T. Osborne (filed May 30, 1972).

Although the patent refers to the HP-35 only as a “portable electronic calculator”, 3 key inventions are well described: the stack with automatic push or pull, the ENTER key, and the CHS key.

This patent has obviously expired, but it retains great pedagogical interest.

One warning first: it is a “flip-flop and gates” description and not the real implementation, which is realized with two levels of micro-code:
- a micro-sequencer,
- 3 ROMs.

The HP-35 has a “bit-serial architecture”, that is to say that all data circulating on the 3 main lines of its bus are “serial”, so timing considerations are crucial.

The basic clock frequency of the system is given by a simple oscillator driven by an LC cell (no crystal); the clock rate is 200 kHz (800 kHz divided by 4), so the bit time is 5 μs and the digit time is 20 μs (5 μs × 4 bits).

As the serial word is a 14-digit word (10 mantissa digits, 2 exponent digits, 1 mantissa sign digit and 1 exponent sign digit), the word time is 56 bits (14 digits × 4 bits).
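The arithmetic above can be checked with a short Python sketch. It only restates the figures quoted in the text; the variable names are illustrative and not taken from any HP document.

```python
# Timing figures quoted in the text; names are illustrative only.
OSC_HZ = 800_000          # LC oscillator frequency
DIVIDER = 4               # oscillator cycles per bit time
BITS_PER_DIGIT = 4        # one BCD digit
DIGITS_PER_WORD = 14      # 10 mantissa + 2 exponent + 2 sign digits

bit_rate_hz = OSC_HZ // DIVIDER                    # 200 kHz
bit_time_us = 1_000_000 / bit_rate_hz              # 5 us
digit_time_us = bit_time_us * BITS_PER_DIGIT       # 20 us
bits_per_word = BITS_PER_DIGIT * DIGITS_PER_WORD   # 56 bits
word_time_us = bit_time_us * bits_per_word         # 280 us

print(bit_rate_hz, bit_time_us, digit_time_us, bits_per_word, word_time_us)
# 200000 5.0 20.0 56 280.0
```

The 280 μs result is the “word time” used later when comparing the HP-35 with the HP-9100A.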

Fig 1 shows the timing for one BCD digit (4 bits):

During the bit time of 5 μs, 4 clock pulses of 625 ns each can be counted.


This timing, which sets the rhythm of the machine, is created in the anode driver circuit, together with the signals that drive each of the 8 segments of one LED and the signals that synchronize the cathode driver (reset and step).

All this logic uses a TTL technology (mainly 5 flip-flops, 15 “and” gates, 3 “or” gates).

The clock signal delivered to the rest of the machine consists of a 2-phase clock (phases separated by 625 ns), consistent with the PMOS technology used in the main chips of the calculator.

 

A circuit named “clock driver” converts the TTL clock levels to the MOS technology logic requirements.

Fig 2 above shows the BCD digit timing (4 bits) and fig 3 below shows the MOS clock (a swing from -12V to +6V).

Now I return to the bit-serial paradigm described in patent n° 3,781,820.

Suppose 4 word times, W1 to W4 (each lasting one 56-bit word time), are defined to correspond to the presence of the machine word in the registers KB, M1, M2 and M3 of the calculator’s stack (X, Y, Z and T in the real implementation).

I have simplified the main diagram a lot (fig 4), but it is enough to understand 2 main things about the architecture.

In order to make Y (M1) and X (KB) available simultaneously, a one-word-time early tap (164) is provided on the stack.

(Here I’ll follow David Cochran closely):

- during W1, gate 62 is closed and AND gate 92 is opened to pass KB into ADDER 90,

- at the same time AND gate 94 is opened by AND gate 96 to pass M1 into ADDER 90, where the contents of KB and M1 are combined and put on line 160,

- during W2 and W3, gate 92 is closed but gate 94 is left open, pushing up the contents of M3 and M2 to M2 and M1 respectively,

- during W4, gate 94 is closed and gate 62 is reopened, leaving the contents of M3 unchanged so that M3 and M2 then contain the same data.
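The net effect of these four word times can be summarized by a small behavioural model in Python. This is only a sketch at register level: it uses plain integers instead of 56-bit serial words, it ignores the gates and the early tap, and the function name `binary_op_with_drop` is invented for the illustration.

```python
def binary_op_with_drop(kb, m1, m2, m3, op):
    """Register-level view of W1..W4 (not a gate-level simulation)."""
    # W1: KB and M1 both enter the adder; the result ends up in KB.
    new_kb = op(kb, m1)
    # W2 and W3: the old M2 and M3 move to M1 and M2 respectively.
    new_m1 = m2
    new_m2 = m3
    # W4: M3 recirculates unchanged, so M3 and M2 now hold the same data.
    new_m3 = m3
    return new_kb, new_m1, new_m2, new_m3


x, y, z, t = binary_op_with_drop(2, 3, 10, 20, lambda a, b: a + b)
print(x, y, z, t)  # 5 10 20 20
```

This is the familiar RPN behaviour: the result replaces X, the stack drops, and T is duplicated.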

Finally, you can now grasp a key point: in this kind of calculator, the data is dynamically and constantly recirculated through the registers.

I advise you to download the document and follow each path for digit entry, “ENTER”, “STO”, “RCL” etc.

Remember that the patent describes a “flip-flop and gates” implementation, whereas the real implementation is made with a CPU and 3 ROMs.

You can now understand why the serial registers of the Arithmetic Unit are 56 bits wide. The precise timing of the word time is given by fig. 5.
 

Two time windows are worth noting:

- between b19 and b26: the 8-bit address is active on the bus (Ia line), sent by the control logic of the system (Control and Timing circuit) to be read by the ROMs; only one of the 3 ROMs is active at a time,

- between b45 and b54: the addressed 10-bit instruction is active on the bus (Is line), sent by the selected ROM to the Arithmetic and Register circuit to be decoded and executed.
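The two windows can be expressed as a small Python sketch. It assumes the 56 bit times are numbered b0 to b55 and reuses the 5 μs bit time given earlier; the function names and the label "other" are invented for the illustration (the bus is not necessarily idle outside the two windows).

```python
BIT_TIME_US = 5
WORD_BITS = 56                      # assumed numbering: b0..b55

ADDRESS_WINDOW = range(19, 27)      # b19..b26, 8 bits on the Ia line
INSTRUCTION_WINDOW = range(45, 55)  # b45..b54, 10 bits on the Is line


def window_of(bit):
    """Tell which of the two windows a given bit time falls in."""
    if not 0 <= bit < WORD_BITS:
        raise ValueError("bit index must be in 0..55")
    if bit in ADDRESS_WINDOW:
        return "address"
    if bit in INSTRUCTION_WINDOW:
        return "instruction"
    return "other"


def duration_us(window):
    """Duration of a window, one bit time per bit."""
    return len(window) * BIT_TIME_US


print(len(ADDRESS_WINDOW), duration_us(ADDRESS_WINDOW))          # 8 40
print(len(INSTRUCTION_WINDOW), duration_us(INSTRUCTION_WINDOW))  # 10 50
print(window_of(20), window_of(30), window_of(50))  # address other instruction
```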
 

The “strobe” synchronizing the system (SYNC) is issued by the C&T during the window b45-b54. No other strobe is available, and the ROMs keep pace using their own synchronized 56-bit counters.

Two other lines form the system bus:

1) WS (Word Select) is issued from the C&T and from the ROMs to the A&R, enabling instructions to be executed on only a portion of the word (exponent, mantissa, exponent sign etc.),
2) the Carry line, whose task is to report a carry condition from the A&R to the C&T (to be used in arithmetic and by the “goto if n/c” instruction, for example).

The pair C&T and A&R forms a very specialized CPU compared to the early microprocessors available in 1972 (e.g. Intel 4004).
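To make the word-select idea concrete, here is a minimal Python sketch in which WS is modelled as 14 booleans, one per digit time, gating an operation on a 14-digit word. The digit positions assigned to each field are an assumption made for the illustration (digit 0 taken as the low exponent digit), and the field names and helper functions are hypothetical.

```python
# Assumed digit layout, for illustration only (digit 0 = low exponent digit).
FIELDS = {
    "exponent": range(0, 2),
    "exponent_sign": range(2, 3),
    "mantissa": range(3, 13),
    "mantissa_sign": range(13, 14),
    "word": range(0, 14),
}


def ws_signal(field):
    """Word-select line as 14 booleans, one per digit time."""
    selected = FIELDS[field]
    return [digit in selected for digit in range(14)]


def clear_field(word, field):
    """A 'clear' operation that acts only where WS is active."""
    return [0 if sel else d for d, sel in zip(word, ws_signal(field))]


word = [1, 2, 0, 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 9]
print(clear_field(word, "mantissa"))
# [1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]
```

The same instruction can thus act on the whole word or on a single field, depending only on when WS is active during the word time.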

Now I will open 3 other “black boxes”, which I will detail progressively as the description goes on.


The core of the system consists of:

- the Control and Timing Circuit (C&T),
- the Arithmetic and Register Circuit (A&R),

forming together the specialized CPU,

- 3 ROMs of 256 ten-bit words each, containing the firmware.

 

 

I will set aside for now (and come back to them later in detail):
- the clock generation circuits (for now it’s enough to know that the 5 MOS devices (C&T, A&R and ROMs) are fed with the right 2-phase clock signal),
- the LED display logic, anode and cathode (consider that with the 5 output lines of the A&R circuit we know how to display the 10 digits and the decimal point),
- the power supply unit and the power-on logic.


Fig 6 Simplified diagram


           
I will focus on the main parts: the C&T, the A&R and the Roms.
Figure 6 gives a first schematic diagram.

The C&T and A&R can be seen as one function : the processing unit. In fact, in the next generation (Woodstock), they were packed in one chip.
 

(fig 7: C&T chip)

The C&T is the control unit of the system; it has the following tasks:
- operating the instruction counter and saving the return address,
- keeping the status bits,
- scanning the keyboard (8 rows x 5 columns),
- handling the pointer P,
- synchronizing the system (SYNC and WS signals).
 

Note that the HP-35 follows the Wilkes model and that the C&T is microprogrammed.

A 1450-bit control ROM (58 words * 25 bits each) is used to generate the signals that control the data flow, reacting to qualifiers (PWO, timing, carry flip-flop etc.); each bit of this ROM corresponds to (or is part of) a control signal line.

This is the most secret part of the system though one can “guess” the way it works.


(fig 8: A&R chip)

The A&R is the arithmetic unit and its main tasks are:
- decoding and executing instructions on registers; the ALU is not orthogonal (an instruction set is said to be orthogonal if any instruction can use any register), and certain instructions operate only on certain registers,
- handling the adder,
- handling the 7 register stack,
- outputting the data to the display module (5 lines).

As we have seen above, 4 lines link these 5 units:
- Ia (Addresses): C&T -> ROMs,
- Is (Instructions): ROMs -> A&R (and to C&T for “return addresses”),
- SYNC, the strobe signal synchronizing the 5 units,
- WS (Word Select), which selects a part (field) of the 56-bit word.

In addition, a carry line links the A&R to the C&T.

With a basic “word time” rated at 280 μsec, the HP-35 is a fast small computer, but slower than its predecessor the HP-9100A.

The HP-35 is made of MOS LSI circuits (a brand new technology in 1972); the HP-9100 is made of discrete components (diode and bipolar transistor logic).

 (fig9: The 3 ROMs)

The core memory access time on the 9100 is around 5 μsec; on the HP-35, ROM access takes place between bit 27 and bit 43 of the word time (17 bit times, or 85 μsec), 17 times slower than the 9100.

Comparing processors is difficult: the clock makes each processor do something on each cycle, but what it does can be significantly different. A comparison only makes sense at the operation level, when the algorithms are close:

HP-35
- floating point “+” and “-” (215 word times) = 60 ms,
- floating point “*” and “/” = 100 ms,
- digit by digit ln and e^x = 200 ms,
- CORDIC trigo (tan, sin, cos) = 500 ms.

HP-9100A
- floating point “+” and “-” = 2 ms,
- floating point “*” = 12 ms,
- floating point “/” = 18 ms,
- digit by digit ln = 50 ms,
- digit by digit e^x = 110 ms,
- CORDIC trigo (tan, sin, cos) = 280 ms.

If I take the FP multiplication time as the norm, the HP-35 is only about 8 times slower than the 9100A (100 ms versus 12 ms).
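The ratios behind this comparison can be recomputed from the figures listed above with a few lines of Python. The dictionary keys are illustrative labels; the times are the ones quoted in the two lists, and the 0.280 ms word time comes from the 56-bit word at 5 μs per bit.

```python
WORD_TIME_MS = 0.280  # 56 bits x 5 us

# Operation times in ms, as quoted in the two lists above.
hp35_ms = {"add": 60, "mul": 100, "ln": 200, "trig": 500}
hp9100a_ms = {"add": 2, "mul": 12, "ln": 50, "trig": 280}

# 215 word times for an addition on the HP-35.
print(round(215 * WORD_TIME_MS, 1))  # 60.2

ratios = {op: hp35_ms[op] / hp9100a_ms[op] for op in hp35_ms}
for op, ratio in ratios.items():
    print(f"{op}: {ratio:.1f}x slower on the HP-35")
# add: 30.0x, mul: 8.3x, ln: 4.0x, trig: 1.8x
```

The gap is widest for simple additions and narrows considerably for the transcendental functions.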

To give comparisons:

- a floating point multiplication takes 6 seconds on the 1944 Harvard-IBM Mark I, and 450 μsec on the 1953 IBM model 701,
- the IBM System/360 Model 91 was introduced in 1966 and was the fastest, most powerful computer then in use. NASA operated a Model 91 in 1968 at the Goddard Space Flight Center in Greenbelt, Md; the HP 9100A was introduced at the same time (Barney Oliver initiated the HP 9100A project in late 1965). The Model 91 executed up to 5,530,000 floating point multiplications per second: 180.83 nsec each.

Another point of comparison is the CORDIC itself; developed at Convair, it successfully passed a radar fix-taking test in 1962. The Model B, specially made for airborne operations, was a 30-bit, bit-serial, binary computer with a time for multiplication or CORDIC operation of 5 ms. Jack Volder wrote recently that only 5 B-58 bombers survive (with their CORDIC, I presume) in aircraft museums.

Finally, note that Hewlett-Packard laboratories built in 1970-71 a high speed hardware 12 decimal digit floating point processor based on Cordic (8 bit parallel adder/subtracter) and controlled by micro program in ROM.

The execution time was impressive:
- tangent= 130  μsec,
- logarithm= 70 μsec

(fig10: SYNC signal)
Here is a short storyboard of the operations, starting at power on.

A power-on circuit (a one-shot made of 2 bipolar transistors and 5 resistors) forces the system into a clean and known condition.

It sets the starting 8-bit address to octal 000 in ROM 0 and gives the system’s 56-bit counter time to synchronize with the 56-bit counter in each ROM.

This is done by just holding the PWO signal at logic 1, in order to have the counters synchronized and ROM 0 selected.

The first instruction at address 0000 in ROM 0 must be a “go to” or a “call” (PWO routine), and operations can begin.
 

(fig 11: The 2 time windows, address and instructions (with SYNC))

I’ll give more information on this later, but note that addresses are relative to the selected ROM (one among 3).

The address buffer is 8 bits wide and the C&T can address linearly the range 000-377 (octal).

So in each ROM there is an “enable” flip-flop mechanism, and each ROM decodes the “select ROM x” instruction by itself: the right ROM is ON, the others OFF.
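The enable mechanism can be sketched in Python as follows. This is a behavioural illustration only: the class and function names are invented, the 10-bit encoding of “select rom n” (ROM number in the top 3 bits, 0010000 in the low 7 bits) is an assumption inferred from the single “select rom 1” opcode in the listing of this article, and returning 0 for an unpopulated address is just a placeholder.

```python
SELECT_ROM_LOW_BITS = 0b0010000  # assumed low 7 bits of "select rom n"


class Rom:
    def __init__(self, number, words):
        self.number = number
        self.words = words             # address -> 10-bit instruction
        self.enabled = (number == 0)   # ROM 0 is selected at power on

    def read(self, address):
        """Drive the Is line only when the enable flip-flop is set."""
        return self.words.get(address, 0) if self.enabled else None

    def snoop(self, instruction):
        """Each ROM decodes 'select rom n' itself and updates its flip-flop."""
        if (instruction & 0b1111111) == SELECT_ROM_LOW_BITS:
            self.enabled = (instruction >> 7) == self.number


def fetch(roms, address):
    """The same 8-bit address reaches every ROM; only the enabled one answers."""
    instruction = next(r.read(address) for r in roms if r.enabled)
    for r in roms:
        r.snoop(instruction)
    return instruction


roms = [
    Rom(0, {0o021: 0b0010010000}),  # select rom 1
    Rom(1, {0o022: 0b1010011001}),  # jsb 246
    Rom(2, {}),
]
print(format(fetch(roms, 0o021), "010b"))  # 0010010000
print([r.enabled for r in roms])           # [False, True, False]
print(format(fetch(roms, 0o022), "010b"))  # 1010011001
```

Because the address counter keeps running, the instruction that follows a “select rom” is fetched at the next address, but from the newly enabled ROM.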

For example, the code for key “1/x” starts at 0016 in ROM 0 (see listing). If we capture the signal on the Is line when the “1/x” key is pressed, here is what we will see:

address    opcode                             

0001110:  1011101110                                  0 -> a[w]
0001111:  1111100010                                  a + 1 -> a[p]
0010000:  0000101110                                  0 -> b[w]
0010001:  0010010000                                  select rom 1     
0010010:  1010011001                                  jsb 246
0010011:  1010010100                                  if s10 = 0
0010100:  1101101011                                  then go to 332 

The first 4 instructions are in ROM 0 and the last 3 are in ROM 1 (as are the 2 targets 246 and 332).

Fig 10 shows the SYNC signal, while Fig 11 gives an insight into the HP-35’s two “time windows”: first the 8-bit “address window” and then the 10-bit “instruction window”, synchronized by SYNC.
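In this listing, the top 8 bits of the “jsb” and “go to” opcodes match the octal targets 246 and 332. The Python sketch below decodes the listing on that basis; it assumes (from these two lines only) that branch instructions are an 8-bit address followed by a 2-bit type field, 01 for “jsb” and 11 for “go to”. The function name is invented for the illustration.

```python
# (address, opcode, mnemonic) copied from the listing above.
LISTING = [
    (0b0001110, 0b1011101110, "0 -> a[w]"),
    (0b0001111, 0b1111100010, "a + 1 -> a[p]"),
    (0b0010000, 0b0000101110, "0 -> b[w]"),
    (0b0010001, 0b0010010000, "select rom 1"),
    (0b0010010, 0b1010011001, "jsb 246"),
    (0b0010011, 0b1010010100, "if s10 = 0"),
    (0b0010100, 0b1101101011, "then go to 332"),
]


def branch_target(opcode):
    """Return (kind, octal target) for a branch opcode, else None."""
    # Assumed layout: 8-bit address, then 01 = jsb or 11 = go to.
    kind = {0b01: "jsb", 0b11: "go to"}.get(opcode & 0b11)
    if kind is None:
        return None
    return kind, format(opcode >> 2, "03o")


for address, opcode, mnemonic in LISTING:
    decoded = branch_target(opcode)
    if decoded:
        kind, target = decoded
        print(f"{address:03o}: {kind} {target}")
# 022: jsb 246
# 024: go to 332
```

The first address prints as octal 016 when formatted the same way, which agrees with the start of the “1/x” code quoted in the text.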

 (fig 12: address and instruction on the ISA, Woodstock)

The photo in fig. 12 was taken on an HP-25 idling in the "wait_a_key" loop (addr 0745, 0746), displaying “0.00”.

In this architecture the 2 signals Is and Ia are mixed on the ISA line.

The time windows are slightly different.

Each part will be detailed in sub-chapters.

 

 

Revision 1, 7 January 2007
© Jacques Laporte 2007.
All photos J. Laporte