This is sort of apples and oranges, but here is a block copy subroutine for the 6502:
0344 A0 00 [2] 00498 MovBlk ldy #0 ; Start of first page
00499
0346 A6 07 [3] 00500 ldx Int0+1 ; At least one full page remaining?
0348 F0 07 (0351) [2/3] 00501 beq MovBlk0 ; Branch if no
00502
034A A2 00 [2] 00503 ldx #0 ; Set up to move an entire page
034C C6 07 [5] 00504 dec Int0+1 ; One fewer whole page remaining
00505
034E 4C 0357 [3] 00506 jmp MovBlk1
00507
0351 A6 06 [3] 00508 MovBlk0 ldx Int0 ; Any remaining on a partial page?
0353 F0 11 (0366) [2/3] 00509 beq MovBlk2 ; No
00510
0355 84 06 [3] 00511 sty Int0 ; This will finish the partial page
00512
0357 B1 10 [5/6] 00513 MovBlk1 lda (Ptr0),Y ; Move a byte
0359 91 12 [6] 00514 sta (Ptr1),Y
035B C8 [2] 00515 iny
035C CA [2] 00516 dex ; More to move on this page?
035D D0 F8 (0357) [2/3] 00517 bne MovBlk1 ; Yes
00518
035F E6 11 [5] 00519 inc Ptr0+1 ; Address next source page
0361 E6 13 [5] 00520 inc Ptr1+1 ; Address next destination page
00521
0363 4C 0344 [3] 00522 jmp MovBlk ; Check for another page
00523
0366 60 [6] 00524 MovBlk2 rts
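For anyone who does not read 6502 assembly, the strategy in that listing (whole 256-byte pages first, then whatever is left on a partial page) can be sketched in Python. This is only a model, and it assumes that both the source and the destination advance by one page after each pass. The names `mov_blk`, `mem`, `src`, `dst` and `count` are made up for illustration; they stand in for Ptr0, Ptr1 and Int0.

```python
def mov_blk(mem, src, dst, count):
    """Model of a page-wise block copy: whole 256-byte pages, then the rest."""
    pages, remainder = divmod(count, 256)  # Int0+1 and Int0 in the listing
    while True:
        if pages:
            n = 256          # X = 0 wraps around, giving 256 iterations
            pages -= 1
        elif remainder:
            n = remainder    # final, partial page
            remainder = 0
        else:
            return           # nothing left to move
        for y in range(n):   # Y indexes within the current page
            mem[dst + y] = mem[src + y]
        src += 256           # assumption: both pointers step to the next page
        dst += 256
```

The point of the split is that the inner loop only ever needs the 8-bit X and Y registers, with the 16-bit bookkeeping done once per page.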
While this is the one for the 6800:
0171 B6 016F [4] 00213 Copy_ ldaa CopyC_ ; Anything to copy?
0174 BA 0170 [4] 00214 oraa CopyC_+1
0177 27 24 (019D) [4] 00215 beq Copy2_ ; No
0179 7D 0170 [6] 00216 tst CopyC_+1 ; Partial page to move?
017C 27 03 (0181) [4] 00217 beq Copy1_ ; No
017E 7C 016F [6] 00218 inc CopyC_ ; Yes, it needs a pass of its own
0181 FE 016B [5] 00219 Copy1_ ldx CopyS_ ; Fetch a byte from the source
0184 A6 00 [5] 00220 ldaa ,X
0186 08 [4] 00221 inx
0187 FF 016B [6] 00222 stx CopyS_
018A FE 016D [5] 00223 ldx CopyD_ ; Store it at the destination
018D A7 00 [6] 00224 staa ,X
018F 08 [4] 00225 inx
0190 FF 016D [6] 00226 stx CopyD_
0193 7A 0170 [6] 00227 dec CopyC_+1 ; More to move in this pass?
0196 26 E9 (0181) [4] 00228 bne Copy1_ ; Yes
0198 7A 016F [6] 00229 dec CopyC_ ; Another pass?
019B 26 E4 (0181) [4] 00230 bne Copy1_ ; Yes
019D 39 [5] 00231 Copy2_ rts
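The 6800 code does not keep its variables (Copy?_) in the direct page. If it did, each load or store of a variable (ldaa, oraa, ldx, stx) would take one fewer cycle; dec and tst would not change, because the 6800 offers only indexed and extended addressing for them.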
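To put a number on the direct page remark, here is a small Python tally of the 6800 inner loop (Copy1_ through the first bne). The extended-addressing figures are the ones in the listing. The direct-page figures are my own assumption of what the same loop would cost with the variables moved to the direct page: ldx and stx each save a cycle, while dec stays the same because the 6800 has no direct-page form of it.

```python
# (instruction, cycles with extended addressing, cycles with direct addressing)
# Instructions that do not touch a Copy?_ variable cost the same either way.
LOOP_6800 = [
    ("ldx CopyS_", 5, 4),
    ("ldaa ,X", 5, 5),
    ("inx", 4, 4),
    ("stx CopyS_", 6, 5),
    ("ldx CopyD_", 5, 4),
    ("staa ,X", 6, 6),
    ("inx", 4, 4),
    ("stx CopyD_", 6, 5),
    ("dec CopyC_+1", 6, 6),  # no direct-page dec on the 6800
    ("bne Copy1_", 4, 4),
]

extended = sum(ext for _, ext, _ in LOOP_6800)
direct = sum(dp for _, _, dp in LOOP_6800)
print(f"extended: {extended} cycles per byte, direct page: {direct} cycles per byte")
```

So the saving is real but modest; most of the cost comes from shuffling the single index register between the two pointers.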
Edit: …and for the 8080:
0000 00001 BlkMov:
0000 2A 0019 [16] 00002 lhld Src
0003 EB [4] 00003 xchg
0004 2A 001B [16] 00004 lhld Count
0007 44 [5] 00005 mov B,H
0008 4D [5] 00006 mov C,L
0009 2A 0017 [16] 00007 lhld Dest
00008
000C 00009 Loop:
000C 1A [7] 00010 ldax D
000D 77 [7] 00011 mov M,A
000E 23 [5] 00012 inx H
000F 13 [5] 00013 inx D
0010 0B [5] 00014 dcx B
0011 78 [5] 00015 mov A,B ; dcx sets no flags, so test BC by hand
0012 B1 [4] 00016 ora C
0013 C2 000C [10] 00017 jnz Loop
00018
0016 C9 [10] 00019 ret
Further edit: and for the AVR, the controller on the Arduino:
000000 9610 [2] 00005 Copy: adiw R26,0 ; Anything to copy? (count is in X)
000001 F041=00000A [1/2] 00006 breq Copy2 ; No
000002 E161 [1] 00007 ldi R22,high(SRAM_START+SRAM_SIZE)
000003 30E0 [1] 00008 cpi R30,low(SRAM_START+SRAM_SIZE) ; Source (Z) beyond the end of SRAM?
000004 07F6 [1] 00009 cpc R31,R22
000005 F428=00000B [1/2] 00010 brcc Copy3 ; Yes, read it from program memory
000006 9161 [2] 00011 Copy1: ld R22,Z+ ; Move a byte from data memory
000007 9369 [2] 00012 st Y+,R22
000008 9711 [2] 00013 sbiw R26,1 ; More to move?
000009 F7E1=000006 [1/2] 00014 brne Copy1 ; Yes
00000A 9508 [2] 00015 Copy2: ret
00000B 9165 [3] 00016 Copy3: lpm R22,Z+ ; Move a byte from program memory
00000C 9369 [2] 00017 st Y+,R22
00000D 9711 [2] 00018 sbiw R26,1 ; More to move?
00000E F7E1=00000B [1/2] 00019 brne Copy3 ; Yes
00000F 9508 [2] 00020 ret
Edit once again: and the champion of the 8-bitters, the 6809:
1 0000 FE 0015 Copy ldu Src ; 6 cycles
2 0003 10BE 0017 ldy Dest ; 7 cycles
3 0007 BE 0019 ldx Count ; 6 cycles
4 000A 27 08 beq Done ; 3 cycles
5
6 000C A6 C0 Loop lda ,U+ ; 4+2 cycles
7 000E A7 A0 sta ,Y+ ; 4+2 cycles
8 0010 30 1F leax -1,X ; 4+1 cycles
9 0012 26 F8 bne Loop ; 3 cycles
10
11 0014 39 Done rts ; 5 cycles
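As a rough summary, this Python snippet adds up the cycles spent per byte in the inner loop of each routine. Treat it as a sketch under a few assumptions of mine: branches are counted as taken, the 6502 load is counted without a page-crossing penalty, the AVR figure is for the SRAM loop, 6809 short branches are counted at 3 cycles, and the 8080 loop includes a `mov A,B` / `ora C` pair, since `dcx` leaves the flags alone and the count has to be tested by hand. Setup code and per-page overhead are ignored, and cycles say nothing about clock speed, so this really is apples and oranges.

```python
# Cycles for each instruction in the per-byte inner loop, in listing order.
LOOP_CYCLES = {
    "6502": [5, 6, 2, 2, 3],                 # lda (),Y / sta (),Y / iny / dex / bne
    "6800": [5, 5, 4, 6, 5, 6, 4, 6, 6, 4],  # ldx .. stx twice, dec, bne
    "8080": [7, 7, 5, 5, 5, 5, 4, 10],       # includes mov A,B / ora C
    "AVR": [2, 2, 2, 2],                     # ld / st / sbiw / brne
    "6809": [6, 6, 5, 3],                    # lda ,U+ / sta ,Y+ / leax / bne
}

per_byte = {cpu: sum(cycles) for cpu, cycles in LOOP_CYCLES.items()}

# Print the fewest cycles per byte first.
for cpu, total in sorted(per_byte.items(), key=lambda item: item[1]):
    print(f"{cpu:5s} {total:3d} cycles per byte")
```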