[Project Log] Python on the 6502/C64, 8080, 6800, 6809 and AVR

Thinking about the relative strengths of the 6502 always returns focus to the zero page.

Within the zero page, the 6502 indexing modes do behave like the 6800 index with displacement mode, with dramatic results. Taking liberties that the list need not be preserved after traversal, it is handled in parts which do fit in the zero page.

                          00004 ; 328611 ($503A3) cycles
                          00005 ; 256134 ($3E886) cycles after inlining subroutines
                          00006
 0000 08                  00007 Free     fcb    Heap
 0001 4B                  00008 J        fcb    75        ; Inner loop runs 75 times
 0002 0000 0000           00009 Total    fdb    0,0
 0006 0BB7                00010 I        fdb    2999
                          00011
 0008                     00012 Heap:
                          00013
 0200                     00014          org    $200
                          00015
                          00016 ;
                          00017 ; Allocate a node
                          00018 ;
 0200                     00019 Alloc:
 0200 A5 00           [3] 00020          lda    Free      ; Allocate a node
 0202 AA              [2] 00021          tax              ; Return it in X
                          00022
 0203 18              [2] 00023          clc              ; Point to free memory
 0204 69 03           [2] 00024          adc    #3
 0206 85 00           [3] 00025          sta    Free
                          00026
 0208 60              [6] 00027          rts
                          00028
                          00029 ;
                          00030 ; Prepend a node to the list
                          00031 ;
 0209                     00032 Prepend:
 0209 20 0200         [6] 00033          jsr    Alloc     ; Allocate a new node
                          00034
 020C 94 00           [4] 00035          sty    0,X       ; Point next of new node to Root
                          00036
 020E 8A              [2] 00037          txa              ; This is now the first node
 020F A8              [2] 00038          tay
                          00039
 0210 A5 06           [3] 00040          lda    I         ; Store value in node
 0212 95 01           [4] 00041          sta    1,X
 0214 A5 07           [3] 00042          lda    I+1
 0216 95 02           [4] 00043          sta    2,X
                          00044
 0218 60              [6] 00045          rts
                          00046
                          00047 ;
                          00048 ; Traverse the list and sum the values
                          00049 ;
 0219                     00050 NoCarry:
 0219 B9 0000       [4/5] 00051          lda    0,Y       ; Point to the next node
                          00052
 021C F0 1A (0238)  [2/3] 00053          beq    Summed    ; Until end of list
                          00054
 021E A8              [2] 00055          tay
                          00056
 021F                     00057 Sum:
 021F 18              [2] 00058          clc              ; Add value of node to sum
 0220 A5 02           [3] 00059          lda    Total
 0222 79 0001       [4/5] 00060          adc    1,Y
 0225 85 02           [3] 00061          sta    Total
 0227 A5 03           [3] 00062          lda    Total+1
 0229 79 0002       [4/5] 00063          adc    2,Y
 022C 85 03           [3] 00064          sta    Total+1
 022E 90 E9 (0219)  [2/3] 00065          bcc    NoCarry
                          00066
 0230 E6 04           [5] 00067          inc    Total+2
                          00068
 0232 D0 E5 (0219)  [2/3] 00069          bne    NoCarry
                          00070
 0234 E6 05           [5] 00071          inc    Total+3
                          00072
 0236 D0 E1 (0219)  [2/3] 00073          bne    NoCarry   ; Always branches
                          00074
 0238                     00075 Summed:
 0238 60              [6] 00076          rts
                          00077
                          00078 ;
                          00079 ;
                          00080 ;
 0239                     00081 RepeatOuterLoop:
 0239 20 021F         [6] 00082          jsr    Sum       ; Traverse the list and add them up
                          00083
 023C A0 00           [2] 00084          ldy    #0        ; Clear list
                          00085
 023E A9 08           [2] 00086          lda    #Heap     ; Reset the heap
 0240 85 00           [3] 00087          sta    Free
                          00088
 0242 A9 4B           [2] 00089          lda    #75       ; Run inner loop 75 times
 0244 85 01           [3] 00090          sta    J
                          00091
 0246 D0 06 (024E)  [2/3] 00092          bne    InnerLoop ; Always branches
                          00093
 0248                     00094 NoBorrow:
 0248 C6 06           [5] 00095          dec    I
                          00096
 024A C6 01           [5] 00097          dec    J
 024C F0 EB (0239)  [2/3] 00098          beq    RepeatOuterLoop
                          00099
 024E                     00100 Main:
 024E                     00101 InnerLoop:
 024E 20 0209         [6] 00102          jsr    Prepend
                          00103
 0251 A5 06           [3] 00104          lda    I         ; Low byte of count zero?
 0253 D0 F3 (0248)  [2/3] 00105          bne    NoBorrow  ; No, no borrow
                          00106
 0255 C6 07           [5] 00107          dec    I+1       ; Borrow from upper byte
 0257 10 EF (0248)  [2/3] 00108          bpl    NoBorrow  ; Done if it goes negative
                          00109
 0259 20 021F         [6] 00110          jsr    Sum       ; Traverse the list and add them up
                          00111
 025C 00              [7] 00112          brk
                          00113
 024E                     00114          end    Main

What is good for the goose is good for the gander and the gosling. The door is now open for the others to exploit their zero pages in the same manner.

Unfortunately, the 6800 cannot load the index register from a single byte. Its minor gain does not save it from last place.

                          00004 * 366717 ($5987D) cycles
                          00005 * 288067 ($46543) cycles after inlining subroutines
                          00006
 0000 000D                00007 Free     fdb    Heap
 0002 0000                00008 Root     fdb    0
 0004 0000                00009 New      fdb    0
 0006 0000 0000           00010 Total    fdb    0,0
 000A 3C                  00011 J        fcb    60        ; Inner loop runs 60 times
 000B 0BB7                00012 I        fdb    2999
                          00013
 000D                     00014 Heap     rmb    4*60
                          00015
                          00016 *
                          00017 * Allocate a node
                          00018 *
 00FD                     00019 Alloc
 00FD D6 01           [3] 00020          ldab   Free+1    ; Allocate a node
 00FF D7 05           [4] 00021          stab   New+1     ; Return it in New
                          00022
 0101 CB 04           [2] 00023          addb   #4        ; Point to free memory
 0103 D7 01           [4] 00024          stab   Free+1
                          00025
 0105 39              [5] 00026          rts
                          00027
                          00028 *
                          00029 * Prepend a node to the list
                          00030 *
 0106                     00031 Prepend
 0106 8D F5 (00FD)    [8] 00032          bsr    Alloc     ; Allocate a new node
                          00033
 0108 DE 04           [4] 00034          ldx    New       ; Point to the new node
                          00035
 010A D6 03           [3] 00036          ldab   Root+1    ; Point next of new node to Root
 010C E7 01           [6] 00037          stab   1,X
 010E 6F 00           [7] 00038          clr    ,X
                          00039
 0110 A7 03           [6] 00040          staa   3,X       ; Store value in node
 0112 D6 0B           [3] 00041          ldab   I
 0114 E7 02           [6] 00042          stab   2,X
                          00043
 0116 DF 02           [5] 00044          stx    Root      ; This is now the first node
                          00045
 0118 39              [5] 00046          rts
                          00047
                          00048 *
                          00049 * Traverse the list and sum the values
                          00050 *
 0119                     00051 Sum
 0119 DE 02           [4] 00052          ldx    Root
 011B 96 08           [3] 00053          ldaa   Total+2
 011D D6 09           [3] 00054          ldab   Total+3
                          00055
 011F                     00056 SumLoop
 011F EB 03           [5] 00057          addb   3,X       ; Add value of node to sum
 0121 A9 02           [5] 00058          adca   2,X
 0123 24 08 (012D)    [4] 00059          bcc    NoCarry
                          00060
 0125 7C 0007         [6] 00061          inc    Total+1
                          00062
 0128 26 03 (012D)    [4] 00063          bne    NoCarry
                          00064
 012A 7C 0006         [6] 00065          inc    Total
                          00066
 012D                     00067 NoCarry
 012D EE 00           [6] 00068          ldx    ,X        ; Point to the next node
 012F 26 EE (011F)    [4] 00069          bne    SumLoop   ; Until end of list
                          00070
 0131 97 08           [4] 00071          staa   Total+2
 0133 D7 09           [4] 00072          stab   Total+3
                          00073
 0135 39              [5] 00074          rts
                          00075
                          00076 *
                          00077 *
                          00078 *
 0136                     00079 RepeatOuterLoop
 0136 97 0C           [4] 00080          staa   I+1       ; Save low byte of count
                          00081
 0138 8D DF (0119)    [8] 00082          bsr    Sum       ; Traverse the list and add them up
                          00083
 013A DF 02           [5] 00084          stx    Root      ; Clear list
                          00085
 013C 86 0D           [2] 00086          ldaa   #Heap     ; Reset the heap
 013E 97 01           [4] 00087          staa   Free+1
                          00088
 0140 86 3C           [2] 00089          ldaa   #60       ; Run inner loop 60 times
 0142 97 0A           [4] 00090          staa   J
                          00091
 0144                     00092 Main
 0144                     00093 InnerLoop
 0144 96 0C           [3] 00094          ldaa   I+1       ; Get low byte of node value
                          00095
 0146 20 06 (014E)    [4] 00096          bra    IntoLoop
                          00097
 0148                     00098 NoBorrow
 0148 4A              [2] 00099          deca
                          00100
 0149 7A 000A         [6] 00101          dec    J
 014C 27 E8 (0136)    [4] 00102          beq    RepeatOuterLoop
                          00103
 014E                     00104 IntoLoop
 014E 8D B6 (0106)    [8] 00105          bsr    Prepend
                          00106
 0150 4D              [2] 00107          tsta             ; Low byte of count zero?
 0151 26 F5 (0148)    [4] 00108          bne    NoBorrow  ; No, no borrow
                          00109
 0153 7A 000B         [6] 00110          dec    I         ; Borrow from upper byte
 0156 2A F0 (0148)    [4] 00111          bpl    NoBorrow  ; Done if it goes negative
                          00112
 0158 8D BF (0119)    [8] 00113          bsr    Sum       ; Traverse the list and add them up
                          00114
 015A 3F             [12] 00115          swi
                          00116
 0144                     00117          end    Main

Even though the 8080 architecture does not directly advantage the first 256 bytes of its memory space, single-byte addresses do present a huge opportunity in light of the clumsy addressing capabilities. Some tricky coding keeps the 8080 solidly in second place.

                          00004 ; 757274 (0B8E1Ah) cycles
                          00005 ; 594194 (091112h) cycles after inlining subroutines
                          00006
 0000 0000 0000           00007 Total   dw      0,0
 0004 0BB7                00008 I       dw      2999
                          00009
 0006                     00010 Heap    ds      3*75
                          00011
                          00012 ;
                          00013 ; Allocate a node
                          00014 ;
 00E7                     00015 Alloc:
 00E7 6B              [5] 00016         mov     L,E             ; Allocate a node to return in HL
                          00017
 00E8 1C              [5] 00018         inr     E               ; Point to free memory
 00E9 1C              [5] 00019         inr     E
 00EA 1C              [5] 00020         inr     E
                          00021
 00EB C9             [10] 00022         ret
                          00023
                          00024 ;
                          00025 ; Prepend a node to the list
                          00026 ;
 00EC                     00027 Prepend:
 00EC CD 00E7        [17] 00028         call    Alloc           ; Allocate a new node
                          00029
 00EF 72              [7] 00030         mov     M,D             ; Point next of new node to Root
                          00031
 00F0 55              [5] 00032         mov     D,L             ; This is the new first node
                          00033
 00F1 23              [5] 00034         inx     H               ; Store value in node
 00F2 71              [7] 00035         mov     M,C
 00F3 23              [5] 00036         inx     H
 00F4 3A 0005        [13] 00037         lda     I+1
 00F7 77              [7] 00038         mov     M,A
                          00039
 00F8 C9             [10] 00040         ret
                          00041
                          00042 ;
                          00043 ; Traverse the list and sum the values
                          00044 ;
 00F9                     00045 Sum:
 00F9 6A              [5] 00046         mov     L,D             ; Point HL to first node
 00FA EB              [4] 00047         xchg
 00FB 2A 0000        [16] 00048         lhld    Total           ; Load cumulative total
 00FE EB              [4] 00049         xchg
                          00050
 00FF                     00051 SumLoop:
 00FF 23              [5] 00052         inx     H               ; Add value of node to sum
 0100 4E              [7] 00053         mov     C,M
 0101 23              [5] 00054         inx     H
 0102 46              [7] 00055         mov     B,M
 0103 EB              [4] 00056         xchg
 0104 09             [10] 00057         dad     B
 0105 EB              [4] 00058         xchg
 0106 D2 010E        [10] 00059         jnc     NoCarry
                          00060
 0109 7D              [5] 00061         mov     A,L             ; Carry to high high word
 010A 2E 02           [7] 00062         mvi     L,Total+2
 010C 34             [10] 00063         inr     M
 010D 6F              [5] 00064         mov     L,A
                          00065
 010E                     00066 NoCarry:
 010E 2B              [5] 00067         dcx     H               ; Point to the next node
 010F 2B              [5] 00068         dcx     H
 0110 6E              [7] 00069         mov     L,M
                          00070
 0111 7D              [5] 00071         mov     A,L
 0112 B5              [4] 00072         ora     L               ; Until end of list
 0113 C2 00FF        [10] 00073         jnz     SumLoop
                          00074
 0116 EB              [4] 00075         xchg
 0117 22 0000        [16] 00076         shld    Total
 011A EB              [4] 00077         xchg                    ; Leave 0 in HL
                          00078
 011B C9             [10] 00079         ret
                          00080
                          00081 ;
                          00082 ;
                          00083 ;
 011C                     00084 Main:
 011C 26 00           [7] 00085         mvi     H,0
                          00086
 011E 3A 0004        [13] 00087         lda     I
 0121 4F              [5] 00088         mov     C,A
                          00089
 0122 C3 012A        [10] 00090         jmp     IntoOuterLoop
                          00091
 0125                     00092 RepeatOuterLoop:
 0125 C5             [11] 00093         push    B
                          00094
 0126 CD 00F9        [17] 00095         call    Sum             ; Traverse the list and add them up
                          00096
 0129 C1             [10] 00097         pop     B
                          00098
 012A                     00099 IntoOuterLoop:
 012A 06 4B           [7] 00100         mvi     B,75
                          00101
 012C 11 0006        [10] 00102         lxi     D,Heap          ; Reset the heap and clear list
                          00103
 012F C3 0137        [10] 00104         jmp     IntoLoop
                          00105
 0132                     00106 NoBorrow:
 0132 0D              [5] 00107         dcr     C
                          00108
 0133 05              [5] 00109         dcr     B
 0134 CA 0125        [10] 00110         jz      RepeatOuterLoop
                          00111
 0137                     00112 IntoLoop:
 0137 CD 00EC        [17] 00113         call    Prepend
                          00114
 013A 79              [5] 00115         mov     A,C             ; Low byte of count zero?
 013B B1              [4] 00116         ora     C
 013C C2 0132        [10] 00117         jnz     NoBorrow        ; No, no borrow
                          00118
 013F 2E 05           [7] 00119         mvi     L,I+1           ; Borrow from upper byte
 0141 35             [10] 00120         dcr     M
 0142 F2 0132        [10] 00121         jp      NoBorrow        ; Done if it goes negative
                          00122
 0145 CD 00F9        [17] 00123         call    Sum             ; Traverse the list and add them up
                          00124
 0148 76              [7] 00125         hlt
                          00126
 011C                     00127         end     Main
1 Like

A change in register usage narrows the gap but does not alter the standings.

                          00007 * 360323 ($57F83) cycles
                          00008 * 281673 ($44C49) cycles after inlining subroutines
                          00009
 0000 000D                00010 Free     fdb    Heap
 0002 0000                00011 Root     fdb    0
 0004 0000                00012 New      fdb    0
 0006 0000 0000           00013 Total    fdb    0,0
 000A 3C                  00014 J        fcb    60        ; Inner loop runs 60 times
 000B 0BB7                00015 I        fdb    2999
                          00016
 000D                     00017 Heap     rmb    4*60
                          00018
                          00019 *
                          00020 * Allocate a node
                          00021 *
 00FD                     00022 Alloc
 00FD 97 05           [4] 00023          staa   New+1     ; Return it in New
                          00024
 00FF 8B 04           [2] 00025          adda   #4        ; Point to free memory
                          00026
 0101 39              [5] 00027          rts
                          00028
                          00029 *
                          00030 * Prepend a node to the list
                          00031 *
 0102                     00032 Prepend
 0102 8D F9 (00FD)    [8] 00033          bsr    Alloc     ; Allocate a new node
                          00034
 0104 DE 04           [4] 00035          ldx    New       ; Point to the new node
                          00036
 0106 E7 03           [6] 00037          stab   3,X       ; Store value in node
 0108 D6 0B           [3] 00038          ldab   I
 010A E7 02           [6] 00039          stab   2,X
                          00040
 010C D6 03           [3] 00041          ldab   Root+1    ; Point next of new node to Root
 010E E7 01           [6] 00042          stab   1,X
 0110 6F 00           [7] 00043          clr    ,X
                          00044
 0112 DF 02           [5] 00045          stx    Root      ; This is now the first node
                          00046
 0114 39              [5] 00047          rts
                          00048
                          00049 *
                          00050 * Traverse the list and sum the values
                          00051 *
 0115                     00052 Sum
 0115 DE 02           [4] 00053          ldx    Root
 0117 96 08           [3] 00054          ldaa   Total+2
 0119 D6 09           [3] 00055          ldab   Total+3
                          00056
 011B                     00057 SumLoop
 011B EB 03           [5] 00058          addb   3,X       ; Add value of node to sum
 011D A9 02           [5] 00059          adca   2,X
 011F 24 08 (0129)    [4] 00060          bcc    NoCarry
                          00061
 0121 7C 0007         [6] 00062          inc    Total+1
                          00063
 0124 26 03 (0129)    [4] 00064          bne    NoCarry
                          00065
 0126 7C 0006         [6] 00066          inc    Total
                          00067
 0129                     00068 NoCarry
 0129 EE 00           [6] 00069          ldx    ,X        ; Point to the next node
 012B 26 EE (011B)    [4] 00070          bne    SumLoop   ; Until end of list
                          00071
 012D 97 08           [4] 00072          staa   Total+2
 012F D7 09           [4] 00073          stab   Total+3
                          00074
 0131 39              [5] 00075          rts
                          00076
                          00077 *
                          00078 *
                          00079 *
 0132                     00080 RepeatOuterLoop
 0132 8D E1 (0115)    [8] 00081          bsr    Sum       ; Traverse the list and add them up
                          00082
 0134 DF 02           [5] 00083          stx    Root      ; Clear list
                          00084
 0136 86 3C           [2] 00085          ldaa   #60       ; Run inner loop 60 times
 0138 97 0A           [4] 00086          staa   J
                          00087
 013A                     00088 Main
 013A 86 0D           [2] 00089          ldaa   #Heap     ; Reset the heap
                          00090
 013C D6 0C           [3] 00091          ldab   I+1       ; Load low byte of count
                          00092
 013E 20 08 (0148)    [4] 00093          bra    IntoLoop
                          00094
 0140                     00095 NoBorrow
 0140 5A              [2] 00096          decb
 0141 D7 0C           [4] 00097          stab   I+1
                          00098
 0143 7A 000A         [6] 00099          dec    J
 0146 27 EA (0132)    [4] 00100          beq    RepeatOuterLoop
                          00101
 0148                     00102 IntoLoop
 0148 8D B8 (0102)    [8] 00103          bsr    Prepend
                          00104
 014A D6 0C           [3] 00105          ldab   I+1       ; Low byte of count zero?
 014C 26 F2 (0140)    [4] 00106          bne    NoBorrow  ; No, no borrow
                          00107
 014E 7A 000B         [6] 00108          dec    I         ; Borrow from upper byte
 0151 2A ED (0140)    [4] 00109          bpl    NoBorrow  ; Done if it goes negative
                          00110
 0153 8D C0 (0115)    [8] 00111          bsr    Sum       ; Traverse the list and add them up
                          00112
 0155 3F             [12] 00113          swi
                          00114
 013A                     00115          end    Main
1 Like

I have been working on the heap manager for the ARM (Raspberry Pi) lately.

Many of the instructions are three operand:

add <destination>,<source 1>,<source 2>

Sometimes it is useful to leave the source operands unaltered.

The assembler allows shortening that to:

add <destination>,<source>

when one of the source operands is also the destination.

Of course assembly language programming involves the use of a debugger. On the Raspberry Pi, that would be GDB.

GDB commands can be very wordy.

Fortunately, many can be abbreviated nicely:

r for run
c for continue
i r for info register
si for step a single machine instruction
ni for step over a subroutine call

GDB is also nice in that it provides command history and just hitting <Enter> repeats the last command.

1 Like

It is December and time again for the annual Vintage Computing Christmas Challenge:

I have entered in the past, but did not last year because I found the challenge, shall we say, uninspiring.

As usual, I will begin by coding up a solution in 6800 assembly language. Then go to the 6809 to see if that can improve it. Finally create 6502 and 8080 versions for comparison.

I do not like that the judging criteria are:

The shortest size of the following provided sizes is taken for the final “ordering/ranking”:

  • Source code
  • File size
  • Executed Code (net[to] code size; “real” code)

So I will be submitting both commented and uncommented source code…