# HP Forums

Full Version: Nibble reverse (HP-48,49,50g)
You're currently viewing a stripped down version of our content. View the full version with proper formatting.

FWIW,

```NR:
%%HP: T(3)A(D)F(,);
\<< B\->R DUP 4 MOD DUP
2 MOD 15 * SWAP 1 >
6 * + OVER + 2 /
SWAP 7 > 3 * - R\->B
\>>
```

This reverses the bits of a nibble.
For example:

```# 1101b NR   -->  # 1011b
# 0100b NR   -->  # 10b
# 0110b NR   -->  # 110b
Size    ChkSum
-----+-------+--------
48GX |  83   | # 1613h
49G  |  79   | # 5816h
50g  |  79   | # 5816h
```

Gerson.

```\<<
# 5d DUP2 AND SL
ROT SR ROT AND OR
# 3d DUP2 AND SL SL
ROT SR SR ROT AND OR
\>>
```

```       Size    ChkSum
-----+-------+--------
48GX |  81   | # 3128h
```

This is a direct translation of:

```def reverse(x):
x = ((x >> 1) & 0x5) | ((x & 0x5) << 1)
x = ((x >> 2) & 0x3) | ((x & 0x3) << 2)
return x
for x in range(0x10):
print "%x: %x" % (x, reverse(x))
```

Which prints:

```0: 0
1: 8
2: 4
3: c
4: 2
5: a
6: 6
7: e
8: 1
9: 9
a: 5
b: d
c: 3
d: b
e: 7
f: f
```

Kind regards

Thomas

Just noticed that I don't need to mask (x >> 2) with 0x3. Thus the function reverse can be simplified to:

```def reverse(x):
x = ((x >> 1) & 0x5) | ((x & 0x5) << 1)
x = (x >> 2) | ((x & 0x3) << 2)
return x
```

This makes the program a little shorter:

```\<<
# 5d DUP2 AND SL
ROT SR ROT AND OR
# 3d OVER AND SL SL
SWAP SR SR OR
\>>
```

```       Size    ChkSum
-----+-------+--------
48GX |  76   | # A8C2h
```

Cheers

Thomas

Same size on the 50g (76 bytes), and about five times faster than mine since it uses only binary operations. The difference should be greater on the 48GX.

Cheers,

Gerson.

Somewhat shorter on the 34S:

```        WSIZE 04
MIRROR
```

I should have had the bit mirror command take an argument rather than relying on word size :-(

- Pauli

Quote:
• MIRROR: Why not "m -> r"?

That's when I became first aware of this command of the WP-34S.
Still wondering why this command is considered useful.

Kind regard

Thomas

Coding theory uses this operation at times. I believe FFTs can make use of it too.

- Pauli

Unfortunately I need this on the 50g, so I'll use Thomas Klemm's fast code. I intend to write a few binary numbers as a graphic object using the GROB ommand. For instance, 35027 and 65535. The output would be

```*   *   ** *  **     (1000100011010011b or 88D3h)
****************     (1111111111111111b or FFFFh)
```

However GROB 16 2 88D3 returns the nibbles mirrored:
```Graphic 16 x 2
*   ** ****
****************
```
So, the nibbles have to be reversed first before using the GROB command (GROB 16 2 11BC). Another problem is the hex sequence apparently can't be programmed (my program generates the string of hex digits, but I don't now how to make GROB use it -- I copy and paste it to the command line which is rather cumbersome).

Gerson.

Edited to fix a typo

Edited: 26 July 2013, 11:07 p.m. after one or more responses were posted

If you're willing to accept a larger program in return for more speed, you can try something like this:

```<<
{ 0. 8. 4. 12. 2. 10. 6. 14. 1. 9. 5. 13. 3. 11. 7. 15. }
SWAP 1. + B->R GET R->B
>>
```

Or smaller (64 bytes) and faster:

```<<
"AIEMCKGOBJFNDLHP"
SWAP 1. + B->R DUP SUB NUM 65. - R->B
>>
```

Hello I have not written any SAturn ASM for a while, and I have not tested the code, so it might not be 100% accurate...

but this should reverse your nibble relatively fast :-)

CODE
A=DAT1.A AD1EX D1+10 C=DAT1.1 % Real first nibble of the # on sack level 1

P=C.0 LC 84c2a6e195d3b7f0 P=0 % load appropriate nibble in C.0
DAT1=C.1 % write it on the hex
@

I was trying to get rid of "65. -" in your 2nd example. But it's not possible to enter CHR 0 directly:

```Error:
Can't Edit Null Char.
```
What's the reason for this? As far as I understand strings aren't \0 terminated in RPL.

So I transformed the program to a string using \->STR, duplicated it and cut each part off where I intended to insert the 0 character. Then I created a string using CHR 0 and added all three in the correct order. With this string I was able to compile the desired program with OBJ\->.

Is there a simpler way to do that on the HP-48GX?

Cheers

Thomas

> and I have not tested the code

here is the corrected version that compiles fine on my ROM 2.15 machine:

CODE

A=DAT1.A AD1EX D1+10 C=DAT1.1 % Real first nibble of the # on sack level 1

P=C.0 LC 84C2A6E195D3B7F0 P=0 % load appropriate nibble in C.0

DAT1=C.1 % write it on the hex

PC=(A)

ENDCODE

@

> But it's not possible to enter CHR 0 directly:

> What's the reason for this? As far as I understand strings aren't \0 terminated in RPL.

IIRC, the NULL character is used as endmarker for the editline.

HTH,

Andreas

http://www.software49g.gmxhome.de

Edited: 26 July 2013, 8:36 a.m.

That's what I was trying to get:

These lines are the first few tens of primes, in binary format.

```GROB 16 128
0004000C000A000E000D000B0088008C008E008B008F004A0049004D004F00CA
00CD00CB002C002E0029002F00AC00A90068006A006E006D006B00E800EF001C
0019001D009A009E009B005C005E005B00DC00DA00DF0038003A003E00BC00BF
007C007A0079007F00F800FD0808080E080B080F088A0889088D084A08CC08CE
085C085A085F08D808DE08DD08380839083B083F08BC08BF087E087D08FC08FE
08FB0409040D048B044C044B04CC04C904CD0428042D04A804AE04A904AF046A
0469046D04EE0418041C041E041B049C049A0458045A045D04DC04DB043A043F
```
This will return the graphic object for the first 128 primes.

I think it's just because someone decided that 0 doesn't belong in UserRPL strings. It's somewhat easier in SysRPL.

It's the little things like this that keep the 50g from being a perfect calculator. Have you ever tried Sigma-LIST on a one-element list?

I am totally illiterate in Saturn Assembly Language, but it looks like Cyrille's aproach, like Kioshi Akima's, is a look-up table. That's what I had first thought of, but decided to find a conversion formula instead. It turns out that the table approach is much better in this case.
I have at least to learn how to compile the ASM code. Anyway, I don't thing its faster speed will be of much help in my inefficient program (it takes about 15 seconds to return the 512-byte hex string above -- I am using Thomas Klemm's program to reverse the nibbles).

Best regards,

Gerson.

Quote:
IIRC, the NULL character is used as endmarker for the editline.

This makes sense as I was able to create the program with a string that contains the NULL character but now I'm not able anymore to edit the program.

Kind regards

Thomas

Thanks for the explanation.

Cheers

Thomas

Hi Gerson,

copy everything between

Code ... @

as a string to your 50g.

Type 256 ATTACH [ENTER] on your 50g.

Type ASM [ENTER] on your 50g.

Level 1 should now show a Code-Object.

Warning: There is _no_ error checking in the SASM Object!

HTH,

Andreas

http://www.software49g.gmxhome.de

```%%HP: T(3)A(R)F(,);
\<< 1 + B\->R "084C2A6E195D3B7F" SWAP DUP SUB "#" SWAP + OBJ\->
\>>
```

59.5 bytes on both the HP-48GX and 50g. Hex mode assumed, of course.

Cheers,

Gerson.

Thanks, Andreas!
Leaving now. Will try it later.

Gerson.

it works in any base:

```%%HP: T(3)A(D)F(,);
\<< # 0h SWAP 1. 3.
START DUP # 1h AND ROT + SL SWAP SR
NEXT +
\>>
Size    ChkSum
-----+-------+--------
48GX |  68.5 | # 4059h
50g  |  68.5 | # 087Dh
```
-Edwin-

Edited: 26 July 2013, 3:48 p.m.

Thanks for your version. As they say, there's more than a way to skin a cat :-)

Here are the timings and sizes for what we've got now:

```
Time in seconds to reverse all 16 distinct nibbles
50g                                 48GX

TK  0.3743 0.3735 0.3708   76    |  76    0.6111 0.6121 0.6155   TK
KA  0.6176 0.6188 0.6190   60    |  64    0.5347 0.5303 0.5369   KA
ED  0.6414 0.6377 0.6353   68.5  |  68.5  1.0044 1.0037 1.0033   ED
GWB 1.8229 1.8317 1.8253   79    |  83    0.6615 0.6620 0.6578   GWB
0.8558 0.8545 0.8540   59.5  |  59.5  1.3468 1.3521 1.3453
```

Cyrille's "relatively fast" ASM code is hors-concours :-) (I'm still gettin "asm Error: Invalid File").

Gerson.

The SysRPL translation of my second program (without the 65 -) comes out at 38.5 bytes and quite a bit faster than the UserRPL version:

```!RPL !NO CODE
::
"\00\08\04\0C\02\0A\06\0E\01\09\05\0D\03\0B\07\0F"
SWAP HXS># #1+ SUB\$1# #>HXS
;
@
```

Here is another solution adapted from this algorithm:

```\<<
# 82h * # 294h AND
# 111h * SRB # Fh
AND B\->R
\>>
```

Tested on a 48SX, but should work on the newer ones too.

Edit: Here is a short explanation of the algorithm:
The first multiplication generates two copies of the bit pattern, left shifted by 1 and 7 bits, respectively. Then, the AND operation is used to select the bits which are at the correct (reversed) positions. The second multiplication combines the bit patterns again, with the reversed nibble in the 3rd position.

```                            abcd
*                    1000 0010 (0x82)
------------------------------
0abc d00a bcd0
&               0010 1001 0100 (0x294)
------------------------------
00b0 d00a 0c00
*               0001 0001 0001 (0x111)
------------------------------
00b0 d00a 0c00
00b0 d00a 0c00
00b0 d00a 0c00
------------------------------
00b0 d0ba dcba dc0a 0c00
>> 8
------------------------------
00b0 d0ba dcba
&                         1111 (0xF)
------------------------------
dcba
```

Edited: 27 July 2013, 12:48 p.m. after one or more responses were posted

49-50 only :

`\<<  BIN \->STR TAIL SREV TAIL "#" SWAP + STR\-> \>>`

Probably not the fastest but 45 bytes only

Development library must me attach (256 ATTACH)

Edited: 27 July 2013, 10:28 a.m.

Quote:

[/pre]
So, the nibbles have to be reversed first before using the GROB command (GROB 16 2 11BC). Another problem is the hex sequence apparently can't be programmed (my program generates the string of hex digits, but I don't now how to make GROB use it -- I copy and paste it to the command line which is rather cumbersome).

Gerson.

Whith your hexa string on the stack try :

\<< "GROB w h " SWAP + OBJ\-> \>>

replacing w and h by the value of width and high of the picture

ex :

```\<<
"GROB 16 128 " SWAP + OBJ\->
\>>
```

Edited: 27 July 2013, 11:44 a.m.

This is the fastest User RPL program so far. Thanks for your contribution!

Your B\->R instruction has been omitted in the byte count in the table.

```-----------------------------------+----------------------------------
50g                  |                48GX
-----------------------------------+----------------------------------
auth        time (s)         size  |  size        time (s)        auth
|
TR   0.2625 0.2684 0.2671   74.5  |  74.5  0.4238 0.4335 0.4332   TR
TK   0.3743 0.3735 0.3708   76    |  64    0.5347 0.5303 0.5369   KA
KA   0.6176 0.6188 0.6190   60    |  76    0.6111 0.6121 0.6155   TK
ED   0.6414 0.6377 0.6353   68.5  |  83    0.6615 0.6620 0.6578   GWB
(*)  0.8558 0.8545 0.8540   59.5  |  68.5  1.0044 1.0037 1.0033   ED
GWB  1.8229 1.8317 1.8253   79    |  59.5  1.3468 1.3521 1.3453   (*)
-----------------------------------+----------------------------------
(*) this is the one based on Kioshi Akima's idea, slightly
shorter but slower (even much slower on the 48GX).
```

These are the routines for test and timing:

```HP 50g:
\<< 15 0
FOR i i R\->B NR -1
STEP
\>>
HP-48GX:
\<< TIME 15 0
FOR i i R\->B NR
-1
STEP TIME 18 PICK
HMS- 10000 *
\>>
NR is the program under test.
```

Corresponding running times for SysRPL and ASM versions, if available, are welcome.

Gerson.

Ah, I knew there had to be a way! Thank you very much!

Gerson.

I would place BIN in the main program and add OBJ\-> in the end. That's still 45 bytes. This does reverse the bits of a binary number, but not the nibble. For instance:

```# 1b     -->   # 1 b
# 1100   -->   # 11b
# 1000   -->   # 1 b
# 100111 -->   # 111001
```
This might be useful for some applications, however.

Cheers,

Gerson.

That's a nice solution. Thanks for the helpful explanation of the algorithm!

Cheers

Thomas

Using Cyrille's/Andreas' ASM code, I created a code object on the 50g and called it from the small app you identified in your previous post. I wrapped all of that in between two TICKS calls to obtain the timings below for 10 iterations (all values are in seconds):

```0.0890
0.0892
0.0894
0.0895
0.0897
0.0894
0.0900
0.0892
0.0894
0.0895
```

The code object "NR" is 32 bytes in size (CRC #339Ah).

So as long as you don't care about error-checking, this would appear to be a good approach. :-)

The result is a surprise to me as I was assuming that using a lookup-table is faster than translatimg it using elementary operations. This is another proof that you always have to measure.

Cheers

Thomas

Thank you!

Quote:
So as long as you don't care about error-checking, this would appear to be a good approach. :-)

Perhaps some three or four-second saving in the running time of the TSTBPRM program below. It takes about 45 seconds using Thomas Klemm's NR (not 15 as stated elsewhere in this thread) and 42 seconds using Thomas Ritschel's.

Gerson.

---------------------------------

```%%HP: T(3)A(R)F(,);
DIR
BPRM
\<< HEX 16 STWS DUP \->STR "GROB 16 " SWAP + " " + "" ROT 1 1 ROT
START NEXTPRIME DUP R\->B NOT H2STR MNINV SWAP UNROT + SWAP
NEXT DROP + OBJ\->
\>>
H2STR
\<< \->STR "h" "" SREPL DROP TAIL TAIL
\>>
NR
\<< # 82h * # 294h AND # 111h * SRB # Fh AND
\>>
MNINV
\<< " " SWAP + 1 4
START TAIL DUP HEAD "#" SWAP + OBJ\-> # Fh XOR NR H2STR SWAP
NEXT DROP + + +
\>>
TSTBPRM
\<< TIME 128 BPRM TIME ROT HMS- 10000 *
\>>
END
```

Edited: 27 July 2013, 7:55 p.m.

This algorithm would be very suitable for implementation in sysRPL using 20 bit system integers. Quite possibly another speed gain to be had but it still wouldn't compete with assembler.

- Pauli

This application was a good mental exercise for me to attempt a SysRPL/ASM version to see how it would turn out (though it felt a bit like cheating at times :-) ). I used the code you posted above as a model and created new versions of BPRM and TSTBPRM.

Instead of building one large string and converting it with OBJ->, I created a new SysRPL routine (with embedded ASM utilizing CdB's method) that directly converts an integer into a single row GROB representing its value. Each row is then appended to bottom of the "summary" grob in a loop.

Not surprisingly, the speed improvement was quite noticeable. Whereas the version you posted above took about 41 seconds on my 50g, the SysRPL/ASM version gave the same output in just over 4 seconds.

What did surprise me, though, was how much smaller it ended up being. I'm more used to ASM code objects being relatively large for the functionality they provide (a result of many more instructions being needed for small incremental operations). But in this case, the problem lends itself quite well to the bitwise manipulations that are a strength of ASM code. As a result, the UserRPL version weighed in at 444.5 bytes, whereas the SysRPL/ASM version only needed 265.5 bytes.

I'd be happy to post the code I used if anyone's interested.

Please do post the SysRPL code David, it would be educational.

Following threads like this is very useful for me. Thanks to all that posted dif ideas, and to Gerson for the original post and compiling and posting the stats.

--bob prosperi

Edited: 28 July 2013, 4:59 p.m.

Quote:
Each row is then appended to bottom of the "summary" grob in a loop

This appears to be more memory-effective. I'll try to use your method next time (in User RPL).

Quote:
I used the code you posted above as a model and created new versions of BPRM and TSTBPRM
...
Whereas the version you posted above took about 41 seconds on my 50g, the SysRPL/ASM version gave the same output in just over 4 seconds.

Perhaps if you had started from scratch, you'd have come out with an even faster version. That was the first time I ever used GROB in a program, so I might have not used the best approach for the task.

Quote:
I'd be happy to post the code I used if anyone's interested.

That would be nice. Even though I know neither SyrRPL nor ASM, I could compile it and see how it fares on the 49G and 50g.

Best regards,

Gerson.

Quote:
Perhaps if you had started from scratch, you'd have come out with an even faster version. That was the first time I ever used GROB in a program, so I might have not used the best approach for the task.

If I were doing this from scratch, I'd probably pre-allocate the space for the final grob first, then move the pixel data to the appropriate place in the final file while looping. It's a little messier to do that, but I believe there would be a noticeable savings in execution time (especially for larger grobs). Doing it that way would mean there's much less movement of data each time a row is added. As it stands now, there's at least the potential for a new copy of the main grob to be made with each loop iteration. So as it stands now, each row addition slows down as the grob gets larger.

I realize now that I made another change, which is to output a 20-bit row instead of 16. It's a minor change, but didn't want that to confuse anyone.

The source code follows, exactly as it was compiled by Debug4x. If you want to try this with the built-in assembler (ala menu 256), you'll have to pull it apart into separate pieces. I've attempted to use enough comments to make it readable even if you're not familiar with SysRPL/ASM:

```RPL
INCLUDE DirMacro.s
ASSEMBLE
Dir <Num2Grob>
RPL
::
CK1NOLASTWD    ( must have 1 parameter )
CK&DISPATCH1   ( integer or real will be converted to real )
real  ( any other type in stack level 1 will generate error )
::
%ABSCOERCE              ( don't want negative, and convert to system binary )
NULLHXS BINT16 EXPAND   ( create an object of the appropriate size in stack level 1 )
CODEM
SAVE  % save RPL status registers
% obtain the address of hxs, move into D1
D0=(5) =DSKTOP
A=DAT0 A
D0=A
A=DAT0 A
D1=A
% change the ob type from HXS to GROB
LC(5) =DOGROB
DAT1=C A
% set row and column size (ob size is already set)
D1=D1+10
C=0 W
LC 1400001
DAT1=C 10
% put data to be reversed in C.A
D0=D0+5
C=DAT0 A
D0=C
D0=D0+5
C=DAT0 A
% reverse 5 nibbles of C.A into D.A
LA 04
B=A B
{
P=C 0
LA 84C2A6E195D3B7F0
P=0
DSL A
D=A P
CSR A
B=B-1 B
UPNC
}
C=D A
% write the reformatted bits into the GROB
D1=D1+10
DAT1=C A
ENDCODE
( drop the number, leave the GROB )
SWAPDROP
;
;
ASSEMBLE
Dir <BPRM>
RPL
DEFINE   GetRowCount             2GETLAM
DEFINE   GetCurPrime             1GETLAM
DEFINE   PutCurPrime             1PUTLAM
::
( check for 1 parameter )
CK1NOLASTWD
CK&DISPATCH1   ( zint will be converted to real )
real  ( anything other than zint or real will generate error )
::
%ABSCOERCE              ( no negative num, convert to system binary )
Z1_                     ( place zint 1 on stack as the initial prime )
ZEROZEROTWO DOBIND      ( bind local vars [unnamed] )
( main loop )
GetRowCount #1+_ONE_DO (DO)   ( do for each row )
( note that the first prime converted will actually be 2 instead of 1 )
( this is to make the output match that of Gerson's posting )
GetCurPrime FLASHPTR Prime+ DUP PutCurPrime  ( get the next prime )
ID Num2Grob                ( convert the current prime to a GROB )
FLASHPTR GROBADDext        ( add the new GROB to the one on the stack )
LOOP
ABND  ( "unbind" the local vars )
;
;
ASSEMBLE
Dir <TSTBPRM>
RPL
::
CK0NOLASTWD    ( no args expected )
SysTime        ( beginning timestamp )
% 128 ID BPRM  ( make grob for 128 primes )
SysTime        ( ending timestamp )
( leave GROB on stack, find difference in timestamps and convert to seconds )
ROT bit- HXS>% % 8192 %/
;
```

Quote:
FWIW,

```NR:
%%HP: T(3)A(D)F(,);
\<< B\->R DUP 4 MOD DUP
2 MOD 15 * SWAP 1 >
6 * + OVER + 2 /
SWAP 7 > 3 * - R\->B
\>>
```

This reverses the bits of a nibble.
For example:

```# 1101b NR   -->  # 1011b
# 0100b NR   -->  # 10b
# 0110b NR   -->  # 110b
Size    ChkSum
-----+-------+--------
48GX |  83   | # 1613h
49G  |  79   | # 5816h
50g  |  79   | # 5816h
```

Gerson.

For a single nibble, why not just do:

```\<< # 1111b XOR />>>
```

For longer words, just use 2^n-1 to get an appropriately large integer to convert into binary form, and again use XOR

Quote:
For a single nibble, why not just do:

```\<< # 1111b XOR />>>
```

Han,

that inverts the nibble, not reverse it. For example:

```# 1101b # 1111b XOR   -->  # 10b, not # 1011b as we want.
```

Regards,

Gerson.

Really impressive, about ten times as fast!
Results of three consecutive runs of TSTBPRM on my HP 50g:

```4.07006835938
4.29943847656
4.08447265625
```
As a comparison, the UserRPL version returns
```42.118775
42.271485
42.263916
```
Quote:
```      GetRowCount #1+_ONE_DO (DO)   ( do for each row )
( note that the first prime converted will actually be 2 instead of 1 )
```

There was a discussion here about the primality (or non-primality) of 1 about three years ago:

Quote:
The source code follows, exactly as it was compiled by Debug4x. If you want to try this with the built-in assembler (ala menu 256), you'll have to pull it apart into separate pieces.

Indeed easier using Debug4. Thanks for the tip!

Gerson.

Quote:
Thanks to all that posted dif ideas,

I second that! That's what I like in this forum: For every occasional small contribution, I've always been given many more in return, and much more valuable ones!