#### Topic: What number representation to choose for a homebrew calculator?

One of the first things a calculator tinkerer has to do in such a project is to decide what number format to use.
This choice is crucial for both the prospective math library and the way numeric data is displayed.
There are two fundamental ways to implement numbers in a small calculator:

- Binary coded decimals (BCD) as used in original HP engines.
- Binary floating point formats as used on PCs.

The BCD variant comes in two basic flavours, packed and unpacked, and there are further design details
to consider: for example, are exponents stored in sign/magnitude or in complement format?
Binary floating-point variants usually differ in the number of bits dedicated to mantissa and exponent. Standardized
(IEEE) floating-point formats use base-2 exponents in a biased format. We will see later that other
scalings of the exponent (16 or 256) may also be interesting. In that case the number is interpreted as
(–1)^S · M · 16^(E – bias) or (–1)^S · M · 256^(E – bias) instead of (–1)^S · M · 2^(E – bias).

Let’s discuss the advantages and drawbacks of the different representations. The classic HPs use a 56-bit
BCD format laid out as follows:

``  S ’ M M M M M M M M M M ’ X E E``

Each place represents a packed BCD nibble of 4 bits. Exponents are stored in 10’s complement, while
sign/magnitude is used for the mantissa (see also Jacques’ page, HP35 -> operating -> output format).
Note that HP calculators use a dedicated decimal CPU which provides a large and specialized set of
decimal operations (addition, subtraction, shifting of arbitrary parts of a decimal register). For example, shifting
the mantissa one digit to the right takes a single 10-bit machine instruction.

Microcontrollers suited for building calculators usually work with binary numbers, where elementary
instructions process one byte at a time. Operations on registers have to be implemented manually, either
in C or in assembly language, taking much more program space than in the HP originals.

Decimal formats

+    Original HP algorithms from the math section may be used as they are, without any adaptation, provided
that the basic operations (addition, shift, etc.) have been implemented.

+    Since decimal numbers are stored the way users want to read them, only little formatting has to be done
(rounding to match the current display format is fairly easy). There is no need for the complex radix
conversions known from binary environments.

–    Available microcontrollers are optimized for binary operations. Decimal arithmetic and shifting are somewhat
harder to implement and run slower, especially if one wants to use packed BCD.

–    Data density (and with it accuracy) is worse than in binary formats, especially when using unpacked BCD to
reduce complexity. As an example, let’s consider a number with 12 digits of accuracy and an exponent range of ±999:
in packed BCD it would take 17 nibbles (12 mantissa digits, 2 signs, 3 exponent digits) or 68 bits. Unpacked formats are
out of the discussion, taking as much as 136 bits. With binary formats one would need 40 bits for the mantissa,
1 bit for the sign and 10 to 13 bits for the exponent including its sign: about 52 bits altogether (13 nibbles).

Binary formats

+    Better data density, and a better fit for popular microcontrollers (see above).

–    Radix conversion is necessary after number entry (dec -> bin) and prior to display output (bin -> dec). This
operation is quite slow (10 to 40 ms in the first VTQ model, so I don’t care ...). But it is also very complex,
requiring as much as 3000 – 4000 bytes of program memory (in my first model VTQ–9). I assume this will
get better once logarithms have been implemented.

–    Simple decimal numbers like 0.3 cannot be stored exactly (there is always a rounding error,
since the base-10 system doesn’t map cleanly onto the base-2 system).

+    Better scalability of the exponent. As mentioned above, the exponent may be stored and interpreted in various
ways, making it a 2’s, 4’s, 8’s or in general 2^j’s exponent. Let’s look at this from the point of view
of normalization: with a 2’s exponent, incrementing the exponent by 1 has to be compensated by a bit
shift right of the mantissa. Use a 16’s exponent and the bit shift becomes a nibble shift; with a 256’s
exponent it becomes a byte shift. Conclusion: control granularity by choosing different exponent bases
(see separate post «Binary Formats & Arithmetics»).

My Choice

Despite the hard times I expected trying to implement floating-point radix conversions, I went for a fully
binary format in my first model VTQ–9. My registers were 64 bits wide, containing 48 bits of mantissa, 13 bits of
exponent, 1 sign bit and 2 bits in reserve:

``SrrEEEEE’EEEEEEEE’MMMMMMMM’MMMMMMMM’MMMMMMMM’ ... ’MMMMMMMM``

48 bits of mantissa give about 14.4 decimal digits (log10(2^48)). I «officially» stated an accuracy of 12 digits
while using the remaining 2.4 digits as a guard reserve. 13 bits of exponent allow a dynamic range
of 2^±4095 or about 10^±1232, enough for my goal of a range of 10^±999.
In all my existing and projected models I use a Microchip PIC 18F6520 controller (an 8-bit engine with 16k words of
program memory and about 3k bytes of data memory).
Technically, a register in the first model is an array of 16-bit integers:

``````  typedef unsigned int  uint16;  // primitive system type (16 bits here)

uint16 reg[4];              // one 64-bit "register" (4 x 16 bits) on the stack

// example of a function prototype:
void add_2 (uint16* a, uint16* b);``````

Arithmetic implementations in the first model are discussed in a different thread.

For my upcoming second model VTQ–13 I started wondering again whether I should stick with my binary format or
change to BCD. Expecting to save significant program space on radix conversions once logarithms were implemented,
I again kept the binary system because of its better data usage and faster operation.

With basic arithmetic and normalization in mind, I did however change the exponent’s granularity,
replacing the old 13-bit binary exponent by a 10-bit exponent to the base 256, thus defining a number in the
256’s system.
During my work on the math library for the VTQ–13 I found neither the 2’s system nor the 256’s system
to be optimal for all situations (to date I have implemented addition/subtraction, multiplication, division,
logarithms, exponentials and square roots, while the tangent is on the way). It has turned out that in most cases
exponents to the base 16 are the best choice. If I had to refactor the math library, I would use 16’s
exponents throughout to store floating-point numbers. In the next section I compare numbers to the bases 2, 10, 16 and 256
with respect to arithmetic and transcendental functions.

Number-systems compared (2, 10, 16, 256)

Let’s have a closer look at these 4 systems by analyzing their behaviour in digit-by-digit processes as they are
used by HP’s original algorithms. Two questions are of interest:

- How many steps are necessary to perform pseudo-operations (namely divisions)?
- How many constants have to be stored for Meggitt processes, as they are used to compute trigonometric functions, logarithms etc.?

First, we focus on division. Here we need a set of subtractions to perform a pseudo-division, but there are no
constants to store. We assume an accuracy of 14 decimal digits and compare the number of subtractions for each system:

In a decimal system, we would need 7 · 14 = 98 subtractions in an average case with all «5» digits in the
result: per digit we perform 5 successful subtractions, a sixth one with underflow, and a restoring addition, which is
assumed to be as expensive as a subtraction (in fact, my latest implementations of addition and subtraction
both run in 6.4 µs).

In binary we need to evaluate 48 binary digits to be equivalent to 14 decimals: for each step we have to
perform one comparison (which is, in fact, a non-destructive subtraction) and, for an average of 50% of all
digits, a real subtraction. So we have 1.5 subtractions per bit, thus 72 subtractions.

In a hex-system we need (in analogy to the decimal system) 9 · 12 = 108 subtractions (average case
with all «8» digits).

In a 256’s system with all 0x80 digits, finally we need 6 · 130 = 780 subtractions!

If we want to compute a logarithm, for instance, we need to store constants (see Jacques’
articles on this subject: HP–35 -> logarithms). If we assume the number of constants is proportional to the
number of digits, we need in the worst case 6 constants for the 256’s system, 12 for the 16’s system, 14
for the decimal system and 48 for pure binary computing (in practice the actual numbers are somewhat lower,
but the proportions match reality):

``````  System    Subtractions    Constants
-------------------------------------
     2            72              48
    10            98              14
    16           108              12
   256           780               6``````

Both extreme systems deliver poor performance in one of the two disciplines, while the decimal and the
16’s system provide quite-close-to-the-optimum behaviour in all cases.

Can we profit from the advantages of the 2’s and the 256’s system without having to suffer from their
drawbacks? The answer is: «Sometimes, yes.» It is an interesting observation that the mantissae of
all three binary-oriented systems (2, 16 and 256) are equivalent! Let’s have a look at the number 39.25
(all representations written in hex for better readability):

256’s system: +27.40’00’00’00’00 e000
16’s system: +2.74’00’00’00’00’0 e001
2’s system: +1.3A’00’00’00’00’0 e005

All three representations share the same bit pattern (for the 2’s system, shift one bit to the left to see it).
The only things that differ are the radix point and the exponent.

In my implementations I use this observation to travel freely between these 3 systems whenever I can draw some
profit from it. The only thing you have to take care of is to adjust the exponent the right way. So I divide mantissae
in the 2’s system, compute logarithms and tangents in the 16’s system and perform additions in the 256’s
system. For details see my upcoming posts on math implementations.

#### Re: What number representation to choose for a homebrew calculator?

Very good work,
Here are a few thoughts about the number representation.

The BCD decision is older than the HP35. The choice between BCD and binary representation goes back to the early days of the HP 9100 project. Binary calculations are generally faster than BCD calculations. However, accuracy was a prime concern for the HP 9100 designers and they wanted to avoid conversion errors from BCD to binary and back.

Finally, they decided to use BCD representation in the HP 9100 calculator and next in the HP-35 and all HP Calculators.

The basic word is 56 bits long (14 digits: 10 mantissa, 2 exponent, 2 signs): 14 · 4 = 56.

This is the ‘word time’ of the machine: the data constantly re-circulates in the 56-bit registers.

The chunk of data is a ‘digit’ (4 bits) and the A&R chip was designed accordingly, operating natively in BCD. Note an important point: NO data is fetched and loaded from memory as in a standard microcontroller like the 18F6520, and there is no parallel bus. The flow of data is serial, in the 56-bit race track (shift register).

My first point is to underline the fundamental difference between the two contexts: a modern 8-bit microcontroller versus the 40-year-old HP technology.

I personally would not add difficulty to the choice of number representation. I would *probably* implement a BCD automaton.

That leads me to my second point.

My second point is that, at the microcode level, there is a lot of material that could guide the algorithm implementation. We now know in depth how the implementation was made by HP, 40 years ago.

Implementing the Meggitt algorithms for an 8 bit binary processor is a real new project (interesting but really difficult).

Think of the ‘accuracy’ problems with the sine and cosine routines for small angles. Cochran was there 40 years ago, but for a decimal scheme.

Binary implementation = new game and no help!

Implementing a BCD virtual machine on the PIC basis will give you the opportunity to execute the ancient HP code like a ‘p-code’, avoiding many man-months of testing and debugging effort.

I hope all this will help,

Jacques

#### Re: What number representation to choose for a homebrew calculator?

In my first model VTQ-9 (at the moment the only one ;-) I implemented conversions from binary to BCD and vice versa from scratch. I obviously had to do this to transfer input to a (binary) register, or a register to output. Since there were no implementations of logarithms and exponentials for the VTQ-9, the implementations of these conversions were very expensive in terms of program space (the same was true for development time, but that is another story): counting all auxiliaries, one sixth of the entire program memory was used for conversions alone.

So the question "binary or BCD" is up again.

Currently there is already a well-grown binary library containing arithmetic, logarithm, exponential, square root and a bunch of auxiliaries. So a switch to BCD doesn't come easy (in that case I would have to implement an emulator for a BCD processor).

I decided to do the following: I will implement both conversions, bin -> BCD and BCD -> bin, now, but this time using logarithms and exponentials. If accuracy and simplicity are acceptable, I will keep my binary train. Otherwise I will have to join the "BCD community".

#### Re: What number representation to choose for a homebrew calculator?

[2012-02-02] Conversions between BCD- and binary formats.

To finally decide whether to continue with binary formats or to switch to BCD, I spent a good deal of time implementing
base conversions for floating-point numbers. The current version uses the logarithm and exponential functions
implemented earlier (October/November 2011) to simplify the computations. Obviously, the results of the conversions are
affected by the quality of the logarithm and exponential algorithms (these two functions will be discussed in a separate thread).
The conclusion first: it works, but only a few tests have been performed so far. As the precision of logarithms depends on
the arguments and (!) on the choice of number representation (base 2, 16 or 256?), the quality of the results may vary.

An example:

log16(10) = 0.83048202327...
- If this number is represented in the 256's system, you will get 212.603398072 x 256^-1. The significant part of the mantissa is 48 bits long.
log16(17) = 1.02186571031...
- This number is represented in the 256's system as 1.02186571031 x 256^0. Obviously 7 bits of significance are wasted, leaving only 41 bits.

As we will see later, the fractional part of a logarithm is important to compute the converted mantissa. So if there
is a drop of precision (which occurs for numbers >= 16, < 1/16, very big numbers > 16^256 and very small numbers < 16^-256),
the result will be of lower quality. There is a cure for this, but it will be discussed later. In the following algorithms, bin is a binary
floating-point number, dec a floating-point number in BCD.

Algorithm 1 : conversion binary to decimal

1.1  Compute the decadic logarithm of bin: y = log10(bin)
1.2  Separate the integer and fractional parts of y: e = ip(y);  m = fp(y)
1.3  Compute the binary representation of the decimal mantissa: m' = 10^(fp(y)).
1.4  Extract the BCD digits from m' by repeatedly taking the digit left of the point and then multiplying m' by 10 (12 times for my BCD numbers).
1.5  Convert the decimal exponent, given by ip(y), to BCD form (by repeated division by 10).
1.6  Put it all together, taking care of the signs of the exponent and of the whole number: you get dec.

One thing has to be considered in step 1.3: if bin is less than 1, y will become negative, and so will ip(y) and fp(y). For ip(y) this
is fine, since it MUST be negative for values less than 1. But a negative fp(y) will disturb our BCD generation in step 1.4, because a
negative fp(y) leads to m' < 1 when computing 10^(fp(y)). This m' will of course get normalized, and then carries a mantissa that is too large
by one factor of the binary float's base (in my case 256).
To avoid this, do the following if fp(y) is negative: (1) add 1 to fp(y); (2) subtract 1 from ip(y). The second step compensates the first one.

Code listing example for this algorithm:

``````void bin2dec(uint08* bin, uint08* dec){
    // by H. Veitschegger - 2012-05-01
    // (array indices reflect an assumed register layout: byte 0 = sign,
    //  byte 1 = 256-exponent, bytes 2..7 = mantissa, radix point after byte 2;
    //  dec: nibble 15 = sign, nibbles 12..14 = exponent, 11..0 = mantissa digits)
    uint08 a[8], b[8], aux[8];
    uint08 pos = 11;
    sint16 exp;

    clearRegister(dec);
    if (testMantissa(bin) == 0)          // case: bin == 0.0
        return;
    copy(bin, a);
    if (getSign(a) == -1){               // drag sign to dec & remove it from bin (so bin >= 0)
        setNibble(dec, 15, 0xF);
        setSign(a, 1);
    }
    log16(a);                            // a = log16(bin)
    mul_2(lg16, a);                      // a = y = log10(bin)
    copy(a, b);
    integerPart(a);                      // a = ip(y), the decimal exponent
    fractionalPart(b);                   // b = fp(y)
    if (getSign(b) == -1){               // make fp(y) positive & decrement ip(y)
        add_2(one, b);
        copy(one, aux);
        setSign(aux, -1);
        add_2(aux, a);
    }
    mul_2(ilg16, b);                     // b = fp(y) * log16(10)
    exp16(b);                            // b = 10^fp(y), the decimal mantissa (1 <= b < 10)

    clearRegister(aux);                  // extract decimal mantissa
    aux[2] = 0x0A;                       // aux = constant 10
    while (pos != 0xFF){
        setNibble(dec, pos--, b[2]);     // take the digit left of the point ...
        b[2] = 0;                        // ... clear it ...
        mulMantissa(aux, b);             // ... and multiply b by 10
    }

    exp = a[2];                          // exponent into dec
    if (a[1] != 0)                       // |ip(y)| >= 256: take a second byte
        exp = (exp << 8) + a[3];
    if (getSign(a) == -1)
        exp = -exp;
    exp += 1000;                         // offset form: -999..+999 -> 1..1999
    dec[6] = (uint08)exp;                // low byte into nibbles 12, 13
    dec[7] = (uint08)(exp >> 8) | dec[7];// high bits into 14 (sign sits in 15)
}``````

Algorithm 2 : conversion decimal to binary

2.1  Extract the sign and the exponent of dec and store them in s and e respectively.
2.2  dec's mantissa m' is processed next: it must be converted into binary form m without changing its value. See details below.
2.3  Assemble the result by computing 10^e and multiplying it by m (automatic normalization included, since the full VTQfloat arithmetic is used).
2.4  Insert the signs.

Details on step 2.2: in this process it is necessary to perform BCD arithmetic, but nowhere else. Notice that 1 <= m' < 10. The first step is easy:
just take the digit left of the decimal point, put it into bin and clear it in dec. Now dec is less than 1.
To obtain more binary digits, we multiply m' by some power of 2 and then jump back to the first step, repeating this until we have enough bits.
If we choose 2^1, we get one bit per loop. Since I have no implementation for access to one single bit, I may choose 2^4 or 2^8. In my
implementation I multiplied repeatedly by 16 in order to get one nibble per loop.
How did I perform the multiplication? I didn't want to write a general function for BCD multiplication, so I went a simpler, yet probably slower, way in
terms of running speed: it is quite easy to write code for doubling a BCD number, which I call 4 times per loop to get x16. It is much the
same code found in the HP emulators (by Eric Smith/David G. Hicks, as presented on Jacques's site) to mimic the microcode operations for
doubling values.

Heritage information

How did I convert binary to decimal and vice versa in the days of my first calculator project (VTQ-9), without logarithms & exponentials?
Here is a brief explanation of the old bin2dec() used to convert binary to decimal:

1. Treatment of special cases such as zero and negative exponents.

2. Computing the decimal exponent.
To get the decimal exponent in binary form, I divided bin successively by powers of 10, starting at 10^512, then 10^256, 10^128, ...
If the quotient of a step was greater than 1, I stored a 1-bit; otherwise I went back to the number before the division and stored a 0-bit.
With each loop I gained one more bit of the decimal exponent.

3. Computing the decimal mantissa.
After step 2 we have a residual m' with 1 <= m' < 10. Now we can use step 1.4 from algorithm 1 to build the BCD mantissa.

4. Assembling all parts including the signs.

Although this algorithm appears to be as simple as the new ones, it took a lot of auxiliary functions and 10 long constants to achieve
the desired results: much more memory was used, and the running speed was worse.

Outlook

1. Exponents.
It is in general a good decision NOT to use sign/magnitude representation for exponents in BCD numbers, as I did in my first firmware
version (VTQ-9). HP uses 10's complement, so negative exponents like -1, -2 are stored not as 901, 902 but as 999, 998. A different
approach is used in the binary IEEE formats: exponents are stored with an offset of half of their whole range (example IEEE 32-bit: the exponent
has 8 bits, the offset is +127). In my project I use the offset strategy for both binary and BCD. Since exponents are stored in complement
form while processing, there is a code overhead whenever exponents are extracted from or stored into floats of either type. The idea of using
complement form throughout would be worth considering.

2. Precision.
There are 2 ways to improve precision in converting numbers:

(1) Since logarithms are crucial for the quality of the conversion, especially their fractional part, it may be a good idea to modify the existing
function log16() so it can be told to return only the fractional part, in full precision (this is not hard, because in my implementation
bits get lost when the integer part ip is added and normalization is done: with ip in 1..255 we lose 8 bits, with ip in 256..256^2-1
we lose 16 bits).

(2) I am using binaries with base 256. 16 would be much better, because the ripple of the precision changes would be 3 bits instead of 7.
Looking back at the math part of the project, I have to admit that almost no function profits from the choice of 256 as base. If I
refactor the math library in the future, I will surely switch to base 16 (a lot of functions work in base 16 already today!).

3. Profile of the new functions.
The new versions of bin2dec & dec2bin have slimmed down compared to the first implementations
(ca. 1700 bytes instead of 5000). The execution speed didn't improve much though: the expensive divisions of the early versions are
replaced by expensive transcendentals ;-). While the old versions ran in 10 and 47 ms respectively, the new ones both take 13 to 17 ms.
But then, the one place where speed doesn't really matter in a calculator is input and output.

#### Re: What number representation to choose for a homebrew calculator?

Hi,

Back to the comparison between different number bases, I was surprised not to find any mention of base 100.

This base nevertheless seems to be often used in compact BCD libraries as a compromise between the ability of a byte to hold two decimal digits and the ease of implementation of the arithmetic routines.

Base 100 can be viewed as a compressed BCD format: it puts two digits (d1 and d2) into one byte as the value d1*10+d2, instead of packing the two digits into the two nibbles of the byte (d1*16+d2) to save memory space.

In this case there is no need for an unpacking procedure to separate the two digits for computation: the computations are performed directly in base 100. In the basic addition and subtraction routines, the number of atomic add and sub operations is divided by two (no need to extract d1 and d2).

That could be a good argument for keeping a BCD orientation in the arithmetic routines.
When printing a number from a base-100 representation, particular attention must be paid to the position of the decimal point.

What do you think ?

François.