[Project Log] Python on the 6502/C64, 8080, 6800, 6809 and AVR


#202

Project scope changed from “toy Python compiler” to “Fierce Python compiler for all the cool kid micros?”

Or maybe the simpler: Another one of my projects that is almost 2 years old?


#203

Yeah, you targetted the wrong language for your project.

FORTH rules all.


#204

Do you mean compile Python code into FORTH or compile FORTH programs?

I have written FORTH interpreters for several processors and had thought about implementing a metacompiler.


#205

Fifth-generation programming language paradigm looks interesting.


#206

It is still a toy because it cannot handle but the simplest programs.

  • No lists, tuples or dictionaries
  • No indexing or slicing
  • No iterators
  • No floating point
  • Cannot define functions
  • No exception handling
  • Cannot import libraries

#207

Implementing functions will require restructuring the compiler to make two passes while parsing a function or creating a parse tree and using that to generate code. That will have to be done at a later date.

In the meantime, structured exception handling can be implemented with one simplification, exceptions will be numbers instead of classes. This means that custom exceptions cannot be defined, but exceptions from the runtime library can be intercepted.


#208

This is the first of possibly several posts on the design and implementation of an exception handling subsystem for Python.

For starters, please read the Python error handling documentation:

https://docs.python.org/3/tutorial/errors.html

My pseudocode is:

try:
	push context
	point handler vector to before first except clause
	execute code suite
	once break is encountered:
		pop context
		call finally clause
		do the break
	once continue is encountered:
		pop context
		call finally clause
		do the continue
	once return is encountered:
		pop context
		call finally clause
		do the return
	once raise is encountered:
		pop context
		call finally clause
		do the raise
	pop context
	goto else clause

before first except:
	pop context
	fall through to first except clause

except <exception>:
	if <exception>:
		execute code suite
		once return is encountered:
			call finally clause
			do the return
		once raise is encountered:
			call finally clause
			do the raise
		goto finally clause
	fall through to next except clause

except <no exception specified>:
	execute code suite
	once return is encountered:
		call finally clause
		do the return
	once raise is encountered:
		call finally clause
		do the raise
	call finally clause
	goto end_try

after last except clause:
	call finally clause
	do a raise

else:
	execute code suite
	once return is encountered:
		call finally clause
		do the return
	once raise is encountered:
		call finally clause
		do the raise
	call finally clause
	goto end_try

finally:
	execute code suite
	return

end_try:

Did I miss anything?


#209

I just found and fixed a rather obscure bug in my 6809 cross assembler (and also the single-line assembler in the emulator.)

Some instructions of the 6809 processor have two bytes of opcode instead of one. The indexed mode instructions use a postbyte to indicate the specific addressing mode along with up to two additional bytes for an optional offset or address.

The problem is with the program counter relative versions of these instructions. The stored offset is the number of bytes from the first byte of the following instruction to the target address.

I have a routine to parse the operand and generate the postbyte and any additional bytes. For most addressing modes, it does not matter whether the base instruction is one or two bytes. But that is an important detail for the program counter relative form.

 0000 10A3 8D 0000 (0005)    [12] 00001	         cmpd   C34A,PCR
 								  00002
 0005							  00003	C34A     rmb    1

#210

To my surprise, this bug generated a lively discussion on the FLEX User Group e-mail list.

As my assembler does not allow forcing the size, I came up with source to try all of the possibilities:

 org 0

Before

 cmpa Before,PCR
 cmpa After,PCR
 cmpa Last,PCR

 cmpd Last,PCR
 cmpd Before,PCR
 cmpd After,PCR

After rmb 1

 org $200

Later
 cmpa Before,PCR
 cmpa Later,PCR
 cmpa Last,PCR

 cmpd Before,PCR
 cmpd Later,PCR
 cmpd Last,PCR

Last

 end

The TSC FLEX assembler does this:

    1   0000                       org    0
    2                      
    3   0000               Before
    4                      
    5   0000 A1   8C FD            cmpa   Before,PCR
    6   0003 A1   8D 0012          cmpa   After,PCR
    7   0007 A1   8D 020E          cmpa   Last,PCR
    8                      
    9   000B 10A3 8D 0209          cmpd   Last,PCR
   10   0010 10A3 8C EC            cmpd   Before,PCR
   11   0014 10A3 8D 0000          cmpd   After,PCR
   12                      
   13   0019               After   rmb    1
   14                      
   15   0200                       org    $200
   16                      
   17   0200               Later
   18   0200 A1   8D FDFC          cmpa   Before,PCR
   19   0204 A1   8C F9            cmpa   Later,PCR
   20   0207 A1   8D 000E          cmpa   Last,PCR
   21                      
   22   020B 10A3 8D FDF0          cmpd   Before,PCR
   23   0210 10A3 8C EC            cmpd   Later,PCR
   24   0214 10A3 8D 0000          cmpd   Last,PCR
   25                      
   26   0219               Last
   27                      

And my assembler does this:

 0000							  00001	         org    0
 								  00002
 0000							  00003	Before
 								  00004
 0000 A1 8C FD (0000)	      [5] 00005	         cmpa   Before,PCR
 0003 A1 8D 0012 (0019)	      [9] 00006	         cmpa   After,PCR
 0007 A1 8D 020E (0219)	      [9] 00007	         cmpa   Last,PCR
 								  00008
 000B 10A3 8D 0209 (0219)    [12] 00009	         cmpd   Last,PCR
 0010 10A3 8C EC (0000)	      [8] 00010	         cmpd   Before,PCR
 0014 10A3 8D 0000 (0019)    [12] 00011	         cmpd   After,PCR
 								  00012
 0019							  00013	After    rmb    1
 								  00014
 0200							  00015	         org    $200
 								  00016
 0200							  00017	Later
 0200 A1 8D FDFC (0000)	      [9] 00018	         cmpa   Before,PCR
 0204 A1 8C F9 (0200)	      [5] 00019	         cmpa   Later,PCR
 0207 A1 8D 000E (0219)	      [9] 00020	         cmpa   Last,PCR
 								  00021
 020B 10A3 8D FDF0 (0000)    [12] 00022	         cmpd   Before,PCR
 0210 10A3 8C EC (0200)	      [8] 00023	         cmpd   Later,PCR
 0214 10A3 8D 0000 (0219)    [12] 00024	         cmpd   Last,PCR
 								  00025
 0219							  00026	Last
 								  00027
 								  00028	         end

TSC and I agree.


#211

After a couple of intense days of hacking, this code compiles and runs:

Choice = ' '
while Choice != '':
	try:
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice != '':
			print('I do not understand.')
	except:
		print('Caught ZeroDivisionError')

There is still a long way to go, but that was a major step.


#212

A peek under the hood…

This is a skeleton program:

try:
	# main code suite
	pass
except:
	# except code suite
	pass
else:
	# else code suite
	pass
finally:
	# finally code suite
	pass

This is the raw assembly listing:

 						  00177	; 00001	try:
 023B 20 0694	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		# main code suite
 						  00184	; 00003		pass
 						  00185	; 00004	except:
 0246 20 06AE	      [6] 00186		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00187		jmp	L00001
 024C					  00188	L00000
 024C 20 06AE	      [6] 00189		jsr	PopExceptionContext
 						  00190	; 00005		# except code suite
 						  00191	; 00006		pass
 						  00192	; 00007	else:
 024F 20 025B	      [6] 00193		jsr	L00002
 0252 4C 025C	      [3] 00194		jmp	L00003
 0255					  00195	L00001
 						  00196	; 00008		# else code suite
 						  00197	; 00009		pass
 						  00198	; 00010	finally:
 0255 20 025B	      [6] 00199		jsr	L00002
 0258 4C 025C	      [3] 00200		jmp	L00003
 025B					  00201	L00002
 						  00202	; 00011		# finally code suite
 						  00203	; 00012		pass
 025B 60		      [6] 00204		rts
 025C					  00205	L00003

This is the listing rearranged slightly to put the lines of Python source as comments near the generated snippets:

 						  00177	; 00001	try:
 023B 20 0694	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		# main code suite
 						  00184	; 00003		pass
 0246 20 06AE	      [6] 00186		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00187		jmp	L00001
 						  00185	; 00004	except:
 024C					  00188	L00000
 024C 20 06AE	      [6] 00189		jsr	PopExceptionContext
 						  00190	; 00005		# except code suite
 						  00191	; 00006		pass
 024F 20 025B	      [6] 00193		jsr	L00002
 0252 4C 025C	      [3] 00194		jmp	L00003
 						  00192	; 00007	else:
 0255					  00195	L00001
 						  00196	; 00008		# else code suite
 						  00197	; 00009		pass
 0255 20 025B	      [6] 00199		jsr	L00002
 0258 4C 025C	      [3] 00200		jmp	L00003
 						  00198	; 00010	finally:
 025B					  00201	L00002
 						  00202	; 00011		# finally code suite
 						  00203	; 00012		pass
 025B 60		      [6] 00204		rts
 025C					  00205	L00003

The compiler is a single-pass recursive descent parser doing syntax-directed code generation. Things like an empty else or finally part ought to be optimized out, but that is difficult to do without multiple passes, building a parse tree and using that to generate code or implementing a post-compile code optimizer.


#213

Last time, the code generated for try/except/else/finally was presented. Here it is again, but with some minor edits of the Python source code to reduce the number of lines and also with Python code comments moved in each assembly language source file for better correlation to the relevant machine code.

try:
	pass	# main code suite
except:
	pass	# except code suite
else:
	pass	# else code suite
finally:
	pass	# finally code suite
 						  00177	; 00001	try:
 023B 20 0691	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06AB	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06AB	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 20 025B	      [6] 00190		jsr	L00002
 0252 4C 025C	      [3] 00191		jmp	L00003
 						  00192	; 00005	else:
 0255					  00193	L00001
 						  00194	; 00006		pass	# else code suite
 0255 20 025B	      [6] 00195		jsr	L00002
 0258 4C 025C	      [3] 00196		jmp	L00003
 						  00197	; 00007	finally:
 025B					  00198	L00002
 						  00199	; 00008		pass	# finally code suite
 025B 60		      [6] 00200		rts
 025C					  00201	L00003

Moving the call of the finally code to the bottom allows saving an instance of the instruction for each except clause

 						  00177	; 00001	try:
 023B 20 068E	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06A8	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0252	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06A8	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0256	      [3] 00190		jmp	L00003
 						  00191	; 00005	else:
 0252					  00192	L00001
 						  00193	; 00006		pass	# else code suite
 0252 4C 0256	      [3] 00194		jmp	L00003
 						  00195	; 00007	finally:
 0255					  00196	L00002
 						  00197	; 00008		pass	# finally code suite
 0255 60		      [6] 00198		rts
 0256					  00199	L00003
 0256 20 0255	      [6] 00200		jsr	L00002

More importantly, this seemingly trivial transformation creates significant opportunities for a dumb single-pass compiler with but one lexical token of lookahead to eliminate unneeded elements.

To begin with, consider the case in which the else clause is not present,

 						  00177	; 00001	try:
 023B 20 06A9	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06C3	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0253	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06C3	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0253	      [3] 00190		jmp	L00001
 						  00191	; 00005	finally:
 0252					  00192	L00002
 						  00193	; 00006		pass	# finally code suite
 0252 60		      [6] 00194		rts
 0253					  00195	L00001
 0253 20 0252	      [6] 00196		jsr	L00002

And the case in which the finally clause is not present.

 						  00177	; 00001	try:
 023B 20 06A5	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06BF	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0252	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06BF	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0252	      [3] 00190		jmp	L00003
 						  00191	; 00005	else:
 0252					  00192	L00001
 						  00193	; 00006		pass	# else code suite
 0252					  00194	L00003

Note that an empty finally subroutine will be generated if the compiler detects in the try clause a return or a raise statement or a break or continue statement which transfers the flow of control out of the try clause.

Now consider the case in which else and finally are absent. This is one of the two minimal forms of structured exception handling allowed in Python and is perhaps the most common use.

 						  00177	; 00001	try:
 023B 20 06A2	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06BC	      [6] 00184		jsr	PopExceptionContext
 0249 4C 024F	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06BC	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F					  00190	L00001

The other minimal form has only the try and finally clauses.

 						  00177	; 00001	try:
 023B 20 06B2	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06CC	      [6] 00184		jsr	PopExceptionContext
 0249 4C 025A	      [3] 00185		jmp	L00001
 024C					  00186	L00000
 024C 20 06CC	      [6] 00187		jsr	PopExceptionContext
 024F 20 0259	      [6] 00188		jsr	L00002
 0252 A6 52		      [3] 00189		ldx	Exception
 0254 A5 53		      [3] 00190		lda	Exception+1
 0256 4C 06AB	      [3] 00191		jmp	Raise
 						  00192	; 00003	finally:
 0259					  00193	L00002
 						  00194	; 00004		pass	# finally code suite
 0259 60		      [6] 00195		rts
 025A					  00196	L00001
 025A 20 0259	      [6] 00197		jsr	L00002

In this case, any exceptions are reraised after the finally clause has been executed.

Not too shabby for a dumb single-pass compiler…


#214

The test code has been slightly restructured to test the compilation of break and continue out of the try clause.

Choice = ' '
while Choice != '':
	try:
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice == '':
			break
		print('I do not understand.')
		continue
	except: # ZeroDivisionError:
		print('Caught ZeroDivisionError')
	else:
		print('In else')
	finally:
		print('Finally!')

A break statement now compiles to this if it is within the try clause.

 			  00374	; 00014				break
 03C5 20 08B2	      [6] 00375		jsr	PopExceptionContext
 03C8 20 043D	      [6] 00376		jsr	L00006
 03CB 4C 0461	      [3] 00377		jmp	L00003

However, the break and continue statements in a loop fully contained within a try clause like this one do not invoke the finally clause. That was somewhat tricky to implement.

try:
	Choice = ' '
	while Choice != '':
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice == '':
			break
		print('I do not understand.')
		continue
except: # ZeroDivisionError:
	print('Caught ZeroDivisionError')
else:
	print('In else')
finally:
	print('Finally!')

All in all it’s just another brick in the wall.


#215

From the When it Rains, it Pours Department,

try:
	pass	# main code suite
except:
	pass	# except code suite
else:
	pass	# else code suite
finally:
	pass	# finally code suite

The documentation does not say that break and continue are not allowed in the except, else and finally clauses.

Compiling these statements within an except or else clause need not (and should not) pop the exception stack. But they should still invoke the finally clause.

Compiling these statements within the finally clause should do neither.

Edit: One final detail. Because it was compiled as a subroutine, any flow of control out of the finally code suite other than returning at the end must remove the return address from the stack. Discovered that one the hard way…


#216

After the extensive work on exception handling, I indulged in a couple of rabbit holes.

The first is writing a “small” emulator to enable running programs written for the SWTPC 6800 on a system based on the 6809. This is possible because a “full boat” 6809 system contains 56K of RAM while the maximum is 40K for the 6800.

That gives me a whopping 16K of memory for the emulator. I map the I/O devices from $80xx on the 6800 to $E0xx on the 6809, and thunk calls to the FLEX operating system from $ADxx to $CDxx.

At this point, I can run the assembler and the BASIC and Extended BASIC interpreters on simple programs. Performance is about 1/10 of native speeds as expected.

The second is a bit more ambitious and has direct application to much of what I have been doing in this thread, writing compilers.

In the early days, I was obsessed with a programming language called PL/M. It was modeled after PL/I, a programming language from the IBM mainframe, but was intended to be used to program microprocessors. It was somewhat legendary because Gary Kildall reportedly used it to write his CP/M operating system.

I never got to program in PL/M. As far as I know, it was never available to actually run on CP/M, but it had to be used either on an expensive development system from Intel or some mini and mainframe computers.

PL/M is a fairly simple language. It has only two data types: byte and address. Address was the term for two-byte quantities. It was actually more of a high-level assembler than C, its minicomputer competitor. Arithmetic is always treated as unsigned. Adding or subtracting byte values result in a byte value. Mixing a byte and an address promotes the byte by zero extension; storing an address into a byte lops off the upper byte.

A great summary is here if you want to see more: https://www.autometer.de/unix4fun/z80pack/doc_cpm_plm.html

So with the technology I already have, I put together a compiler for a subset of PL/M. I targeted the 6800 processor because it is something I know well.

P$Test: do;

    declare (A0, A1, A2) address, (B0, B1, B2) byte;

    B0 = 1;
    B1 = 2 + B0;
    B1 = B1 + 1;

    B2 = B0 + B1;

    A0 = B0;

    B2 = 3 + A0;

    A0 = B0 + 1;

    A0 = B0 + 256;

    A0 = B0 + 257;

    B2 = A0;

    A1 = A0 + B1;

    A2 = B2 + A1;

    A0 = 1;
    A1 = 2 + A1;

    A2 = A0 + A1;

    A2 = A0 + 0;

    A2 = A0 + 1;

    A2 = A0 + 256;

    A2 = A0 + 257;

    A2 = 256 + B0;

    A2 = 257 + B0;

    A2 = 1 + 2;

    goto 0AD03h;

end P$Test;

compiles to

 					  00001	* P$Test: do;
 					  00002
 					  00003	         lib    ptest.dat
.					  00004
.0000				  00005	_PTEST.A0 rmb   2
.					  00006
.0002				  00007	_PTEST.A1 rmb   2
.					  00008
.0004				  00009	_PTEST.A2 rmb   2
.					  00010
.0006				  00011	_PTEST.B0 rmb   1
.					  00012
.0007				  00013	_PTEST.B1 rmb   1
.					  00014
.0008				  00015	_PTEST.B2 rmb   1
 					  00016
 0009				  00017	_PTEST
 					  00018	* 
 					  00019	*     declare (A0, A1, A2) address, (B0, B1, B2) byte;
 					  00020	* 
 					  00021	*     B0 = 1;
 0009 C6 01	      [2] 00022	         ldab   #1
 000B D7 06	      [4] 00023	         stab   _PTEST.B0
 					  00024	*     B1 = 2 + B0;
 000D D6 06	      [3] 00025	         ldab   _PTEST.B0
 000F CB 02	      [2] 00026	         addb   #2
 0011 D7 07	      [4] 00027	         stab   _PTEST.B1
 					  00028	*     B1 = B1 + 1;
 0013 D6 07	      [3] 00029	         ldab   _PTEST.B1
 0015 5C	      [2] 00030	         incb
 0016 D7 07	      [4] 00031	         stab   _PTEST.B1
 					  00032	* 
 					  00033	*     B2 = B0 + B1;
 0018 D6 06	      [3] 00034	         ldab   _PTEST.B0
 001A DB 07	      [3] 00035	         addb   _PTEST.B1
 001C D7 08	      [4] 00036	         stab   _PTEST.B2
 					  00037	* 
 					  00038	*     A0 = B0;
 001E D6 06	      [3] 00039	         ldab   _PTEST.B0
 0020 4F	      [2] 00040	         clra
 0021 97 00	      [4] 00041	         staa   _PTEST.A0
 0023 D7 01	      [4] 00042	         stab   _PTEST.A0+1
 					  00043	* 
 					  00044	*     B2 = 3 + A0;
 0025 D6 01	      [3] 00045	         ldab   _PTEST.A0+1
 0027 96 00	      [3] 00046	         ldaa   _PTEST.A0
 0029 CB 03	      [2] 00047	         addb   #3
 002B 89 00	      [2] 00048	         adca   #0
 002D D7 08	      [4] 00049	         stab   _PTEST.B2
 					  00050	* 
 					  00051	*     A0 = B0 + 1;
 002F D6 06	      [3] 00052	         ldab   _PTEST.B0
 0031 5C	      [2] 00053	         incb
 0032 4F	      [2] 00054	         clra
 0033 97 00	      [4] 00055	         staa   _PTEST.A0
 0035 D7 01	      [4] 00056	         stab   _PTEST.A0+1
 					  00057	* 
 					  00058	*     A0 = B0 + 256;
 0037 D6 06	      [3] 00059	         ldab   _PTEST.B0
 0039 86 01	      [2] 00060	         ldaa   #1
 003B 97 00	      [4] 00061	         staa   _PTEST.A0
 003D D7 01	      [4] 00062	         stab   _PTEST.A0+1
 					  00063	* 
 					  00064	*     A0 = B0 + 257;
 003F D6 06	      [3] 00065	         ldab   _PTEST.B0
 0041 4F	      [2] 00066	         clra
 0042 CB 01	      [2] 00067	         addb   #1
 0044 89 01	      [2] 00068	         adca   #1
 0046 97 00	      [4] 00069	         staa   _PTEST.A0
 0048 D7 01	      [4] 00070	         stab   _PTEST.A0+1
 					  00071	* 
 					  00072	*     B2 = A0;
 004A D6 01	      [3] 00073	         ldab   _PTEST.A0+1
 004C D7 08	      [4] 00074	         stab   _PTEST.B2
 					  00075	* 
 					  00076	*     A1 = A0 + B1;
 004E D6 01	      [3] 00077	         ldab   _PTEST.A0+1
 0050 96 00	      [3] 00078	         ldaa   _PTEST.A0
 0052 DB 07	      [3] 00079	         addb   _PTEST.B1
 0054 89 00	      [2] 00080	         adca   #0
 0056 97 02	      [4] 00081	         staa   _PTEST.A1
 0058 D7 03	      [4] 00082	         stab   _PTEST.A1+1
 					  00083	* 
 					  00084	*     A2 = B2 + A1;
 005A D6 08	      [3] 00085	         ldab   _PTEST.B2
 005C 4F	      [2] 00086	         clra
 005D DB 03	      [3] 00087	         addb   _PTEST.A1+1
 005F 99 02	      [3] 00088	         adca   _PTEST.A1
 0061 97 04	      [4] 00089	         staa   _PTEST.A2
 0063 D7 05	      [4] 00090	         stab   _PTEST.A2+1
 					  00091	* 
 					  00092	*     A0 = 1;
 0065 CE 0001     [3] 00093	         ldx    #1
 0068 DF 00	      [5] 00094	         stx    _PTEST.A0
 					  00095	*     A1 = 2 + A1;
 006A D6 03	      [3] 00096	         ldab   _PTEST.A1+1
 006C 96 02	      [3] 00097	         ldaa   _PTEST.A1
 006E CB 02	      [2] 00098	         addb   #2
 0070 89 00	      [2] 00099	         adca   #0
 0072 97 02	      [4] 00100	         staa   _PTEST.A1
 0074 D7 03	      [4] 00101	         stab   _PTEST.A1+1
 					  00102	* 
 					  00103	*     A2 = A0 + A1;
 0076 D6 01	      [3] 00104	         ldab   _PTEST.A0+1
 0078 96 00	      [3] 00105	         ldaa   _PTEST.A0
 007A DB 03	      [3] 00106	         addb   _PTEST.A1+1
 007C 99 02	      [3] 00107	         adca   _PTEST.A1
 007E 97 04	      [4] 00108	         staa   _PTEST.A2
 0080 D7 05	      [4] 00109	         stab   _PTEST.A2+1
 					  00110	* 
 					  00111	*     A2 = A0 + 0;
 0082 DE 00	      [4] 00112	         ldx    _PTEST.A0
 0084 DF 04	      [5] 00113	         stx    _PTEST.A2
 					  00114	* 
 					  00115	*     A2 = A0 + 1;
 0086 D6 01	      [3] 00116	         ldab   _PTEST.A0+1
 0088 96 00	      [3] 00117	         ldaa   _PTEST.A0
 008A CB 01	      [2] 00118	         addb   #1
 008C 89 00	      [2] 00119	         adca   #0
 008E 97 04	      [4] 00120	         staa   _PTEST.A2
 0090 D7 05	      [4] 00121	         stab   _PTEST.A2+1
 					  00122	* 
 					  00123	*     A2 = A0 + 256;
 0092 D6 01	      [3] 00124	         ldab   _PTEST.A0+1
 0094 96 00	      [3] 00125	         ldaa   _PTEST.A0
 0096 4C	      [2] 00126	         inca
 0097 97 04	      [4] 00127	         staa   _PTEST.A2
 0099 D7 05	      [4] 00128	         stab   _PTEST.A2+1
 					  00129	* 
 					  00130	*     A2 = A0 + 257;
 009B D6 01	      [3] 00131	         ldab   _PTEST.A0+1
 009D 96 00	      [3] 00132	         ldaa   _PTEST.A0
 009F CB 01	      [2] 00133	         addb   #1
 00A1 89 01	      [2] 00134	         adca   #1
 00A3 97 04	      [4] 00135	         staa   _PTEST.A2
 00A5 D7 05	      [4] 00136	         stab   _PTEST.A2+1
 					  00137	* 
 					  00138	*     A2 = 256 + B0;
 00A7 D6 06	      [3] 00139	         ldab   _PTEST.B0
 00A9 86 01	      [2] 00140	         ldaa   #1
 00AB 97 04	      [4] 00141	         staa   _PTEST.A2
 00AD D7 05	      [4] 00142	         stab   _PTEST.A2+1
 					  00143	* 
 					  00144	*     A2 = 257 + B0;
 00AF D6 06	      [3] 00145	         ldab   _PTEST.B0
 00B1 4F	      [2] 00146	         clra
 00B2 CB 01	      [2] 00147	         addb   #1
 00B4 89 01	      [2] 00148	         adca   #1
 00B6 97 04	      [4] 00149	         staa   _PTEST.A2
 00B8 D7 05	      [4] 00150	         stab   _PTEST.A2+1
 					  00151	* 
 					  00152	*     A2 = 1 + 2;
 00BA CE 0003     [3] 00153	         ldx    #3
 00BD DF 04	      [5] 00154	         stx    _PTEST.A2
 					  00155	* 
 					  00156	*     goto 0AD03h;
 					  00157
 00BF 7E AD03     [3] 00158	         jmp    $AD03
 					  00159	* 
 					  00160	* end P$Test;

But wait, there’s more.

PL/M was known for being a fairly good optimizing compiler for its time. My simplistic approach of generating code as I parse will not do.

Something I had always intended to do was to build a tree while parsing to represent the program and use that to generate object code. Separating parsing and code generation results in two moderately complicated pieces instead of a single monstrously complex beast. More importantly, it opens the door to insert an optimizer in between.

The PL/M compiler is simple enough to be a good test platform for building this.

What I have right now is a single-line compiler to convert an assignment statement into a tree, then generate assembly language from that.

 					  00012	**A0 = (B1 + 1) + A1;
 					  00013
 					  00014	*   *  0 := v A0 -> 1
 					  00015	*   *  1 L n 4
 					  00016
 					  00017	*      *  4 L n 2 -> 5
 					  00018
 					  00019	*         *  2 L v B1 -> 3
 					  00020	*         *  3 + c 1
 					  00021
 					  00022	*      *  5 + v A1
 					  00023
 					  00024
 					  00025	*  2 L v B1 -> 3
 					  00026	*  3 + c 1
 0010 D6 0D	      [3] 00027	         ldab   B1
 0012 5C	      [2] 00028	         incb
 					  00029	*  4 L n 2 -> 5
 					  00030	*  5 + v A1
 0013 4F	      [2] 00031	         clra
 0014 DB 07	      [3] 00032	         addb   A1+1
 0016 99 06	      [3] 00033	         adca   A1
 					  00034	*  1 L n 4
 					  00035	*  0 := v A0 -> 1
 0018 D7 05	      [4] 00036	         stab   A0+1
 001A 97 04	      [4] 00037	         staa   A0

The generated comment lines show the original line of source code, the parse tree and the applicable nodes as code is generated.

Now the fun begins: playing with optimization of the tree.

How applicable this will be to compiling Python remains to be seen, but this is something I had been wanting to do for quite a long time.


#217

From the previous post,

That was a test for the reader and nobody took the bait.

You may have noticed that the variable A0 was not allocated as “_A0” but as “_PTEST.A0”

For a simplistic language, PL/M has a sophisticated scoping mechanism. Procedures and functions may be defined. They may be nested, similar to what is allowed in Pascal. Furthermore, like in C, variables may be declared within a procedure or some nested DO blocks.

This is a valid PL/M program demonstrating nested scope:

P$Test: do;

    declare A address;  /* first A */

    /* point 1 */

    Proc: procedure;

        declare A address;  /* second A */

        /* point 2 */

        Block: do;

            declare A address;  /* third A */

            /* point 3 */

        end Block;

        /* point 4 */

    end Proc;

    /* point 5 */

    Another: do;

        declare A address;  /* fourth A */

        /* point 6 */

    end Another;

    /* point 7 */

end P$Test;

At points 1, 5 and 7, a reference to A gets the first A.

At points 2 and 4, a reference to A gets the second A.

At point 3, a reference to A gets the third A.

At point 6, a reference to A gets the fourth A.

The first A is allocated as _PTEST.A

The second as _PTEST.PROC.A

The third as _PTEST.PROC.BLOCK.A

And the fourth as _PTEST.ANOTHER.A

The compiler keeps track of block nesting to generate the name prefixes.

Scoping in Python is somewhat similiar to PL/M.

In this prior post,
[Project Log] Python on the 6502/C64, 8080, 6800, 6809 and AVR

an unexpected behavior of variable scoping in Python was discussed.

Building a parse tree is one way to uncover such forward “declarations” without having to make more than one pass of the source code.

Work continues on compiling Python, but do not be surprised to see a PL/M compiler for the 6502 appear as these techniques are developed further.


#218

One way to subtract is to add the negative of a number.

PL/M has two special forms of add and subtract which takes into account the processor’s carry flag. That’s right, an add with carry or subtract with borrow in a “high-level” language.

Unfortunately, this is foiling the ability to save some code in the compiler by sharing the generation of addition and subtraction instructions. At least when generating code for the 6800.

ADC with the carry flag set adds an additional one; SBC subtracts an additional one. If a number is negated and added instead of subtracting, any borrow is not correctly handled, carrying (adding another 1) instead of borrowing (subtracting another 1.) If only the 6800 had an instruction to flip the carry flag like the 8080/Z80…

The 6502 may have gotten SBC right for this trick because it subtracts the inverse of the carry flag. The compiler can complement instead of negating the number for adding. That may be because the 6502 saved some transistors by complementing and adding instead of subtracting.

Let’s work through an example.

Suppose we want to subtract 1 from 0.

The low byte of 0 is %00000000; the high byte of 0 is %00000000.
The low byte of 1 is %00000001; the high byte of 1 is %00000000.
The low byte of -1 is %11111111; the high byte of -1 is %11111111.

Complementing is flipping all of the bits of a number (one’s complement)
Negating is flipping the bits then adding 1 (two’s complement)

The low byte of the one’s complement of 1 is %11111110; the high byte is %11111111.

If we subtract on the 6502,

 0000 38	      [2] 00001	 sec		; No borrow
 					  00002
 0001 AD 0022     [4] 00003	 lda Lo		; A contains %00000000
 0004 E9 01	      [2] 00004	 sbc #1		; A contains %11111111
 0006 8D 0022     [4] 00005	 sta Lo		; "carry" flag is clear
 					  00006
 0009 AD 0023     [4] 00007	 lda Hi		; A contains %00000000
 000C E9 00	      [2] 00008	 sbc #0		; A contains %11111111
 000E 8D 0023     [4] 00009	 sta Hi		; "carry" flag is clear

Note that the “carry” flag is clear signifying a borrow.

Now if we subtract by adding,

 0011 38	      [2] 00011	 sec		; No borrow
 					  00012
 0012 AD 0022     [4] 00013	 lda Lo		; A contains %00000000
 0015 69 FE	      [2] 00014	 adc #%11111110	; A contains %11111111
 0017 8D 0022     [4] 00015	 sta Lo		; "carry" flag is clear
 					  00016
 001A AD 0023     [4] 00017	 lda Hi		; A contains %00000000
 001D 69 FF	      [2] 00018	 adc #%11111111	; A contains %11111111
 001F 8D 0023     [4] 00019	 sta Hi		; "carry" flag is clear

The results are identical.

If the initial sec is omitted, both code snippits accept a borrow from previous calculation.

However on the 6800,

 0000 0C	      [2] 00001	         clc			; No borrow
 					  00002
 0001 B6 0022	  [4] 00003	         ldaa   Lo		; A contains %00000000
 0004 82 01	      [2] 00004	         sbca   #1		; A contains %11111111
 0006 B7 0022     [5] 00005	         staa   Lo		; "carry" flag is set
 					  00006
 0009 B6 0023     [4] 00007	         ldaa   Hi		; A contains %00000000
 000C 82 00	      [2] 00008	         sbca   #0		; A contains %11111111
 000E B7 0023     [5] 00009	         staa   Hi		; "carry" flag is set

And by adding,

 0011 0C	      [2] 00011	         clc			; No borrow
 					  00012
 0012 B6 0022     [4] 00013	         ldaa   Lo		; A contains %00000000
 0015 89 FE	      [2] 00014	         adca   #%11111110	; A contains %11111110
 0017 B7 0022     [5] 00015	         staa   Lo		; "carry" flag is clear
 					  00016
 001A B6 0023     [4] 00017	         ldaa   Hi		; A contains %00000000
 001D 89 FF	      [2] 00018	         adca   #%11111111	; A contains %11111111
 001F B7 0023     [5] 00019	         staa   Hi		; "carry" flag is clear

If only there was an instruction to complement the carry flag, we can adopt the 6502 borrow convention and make the code work.


#219

Still trying to climb out of this rabbit hole…

Web searching turned up some very interesting PL/M stuff.

  • A PL/M compiler for 6800 FLEX
  • A PL/M compiler for the 1802
  • A PL/M-80 compiler for MS-DOS
  • PL/M source code for the BDOS, CCP and the LOAD transient command from an early version of CP/M

The last discovery is particularly interesting. None of these use the PLUS and MINUS operators. These are the add with carry and subtract with borrow operations described previously.

Probably because even the PL/M-80 manual cautions about their use:

Trying to support MINUS definitely results in some fairly inefficient code on the 6800:

 					  00016	**A0 = (A0 + A1) MINUS (A2 + A3);
 					  00017
 					  00018	*   *  0 := v A0 -> 1
 					  00019	*   *  1 L n 4
 					  00020
 					  00021	*      *  4 L n 2 -> 5
 					  00022
 					  00023	*         *  2 L v A0 -> 3
 					  00024	*         *  3 + v A1
 					  00025
 					  00026	*      *  5 -b n 6
 					  00027
 					  00028	*         *  6 L v A2 -> 7
 					  00029	*         *  7 + v A3
 					  00030
 					  00031
 					  00032
 					  00033	*  2 L v A0 -> 3
 					  00034	*  3 + v A1
 0018 D6 05	      [3] 00035	         ldab   A0+1
 001A 96 04	      [3] 00036	         ldaa   A0
 001C DB 07	      [3] 00037	         addb   A1+1
 001E 99 06	      [3] 00038	         adca   A1
 					  00039	*  4 L n 2 -> 5
 					  00040	*  5 -b n 6
 0020 97 10	      [4] 00041	         staa   Tp00_
 0022 D7 11	      [4] 00042	         stab   Tp00_+1
 					  00043	*  6 L v A2 -> 7
 					  00044	*  7 + v A3
 0024 D6 09	      [3] 00045	         ldab   A2+1
 0026 96 08	      [3] 00046	         ldaa   A2
 0028 DB 0B	      [3] 00047	         addb   A3+1
 002A 99 0A	      [3] 00048	         adca   A3
 002C 97 12	      [4] 00049	         staa   Tp01_
 002E D7 13	      [4] 00050	         stab   Tp01_+1
 0030 96 10	      [3] 00051	         ldaa   Tp00_
 0032 D6 11	      [3] 00052	         ldab   Tp00_+1
 0034 D2 13	      [3] 00053	         sbcb   Tp01_+1
 0036 92 12	      [3] 00054	         sbca   Tp01_
 					  00055	*  1 L n 4
 					  00056	*  0 := v A0 -> 1
 0038 D7 05	      [4] 00057	         stab   A0+1
 003A 97 04	      [4] 00058	         staa   A0

Not being able to subtract by adding the complement necessitates using a second set of temporary variables on a machine with limited registers.


#220

Code generation for addition and subtraction is complete. It has been refactored and is in good shape.

Working on code generation for and, or and xor is a joy in comparison because they

  • are commutative
  • do not carry or borrow from the low byte to the high one

#221

Part of the fun in code optimization is finding and exploiting opportunities in the way boolean logic works:

 					  00016	**A0 = A1 and 0FF00h;
 					  00017
 					  00018	*   *  0 := v A0 -> 1
 					  00019	*   *  1 L r 2
 					  00020
 					  00021	*      *  2 L v A1 -> 3
 					  00022	*      *  3 a c -256
 					  00023
 					  00024
 					  00025	*  2 L v A1 -> 3
 					  00026	*  3 a c -256
 0018 5F	      [2] 00027	         clrb
 0019 96 06	      [3] 00028	         ldaa   A1
 					  00029	*  1 L r 2
 					  00030	*  0 := v A0 -> 1
 001B D7 05	      [4] 00031	         stab   A0+1
 001D 97 04	      [4] 00032	         staa   A0

Edit: more fun with Boolean logic…

 					  00016	**A0 = A1 or 0FFh;
 					  00017
 					  00018	*   *  0 := v A0 -> 1
 					  00019	*   *  1 L r 2
 					  00020
 					  00021	*      *  2 L v A1 -> 3
 					  00022	*      *  3 o c 255
 					  00023
 					  00024
 					  00025	*  2 L v A1 -> 3
 					  00026	*  3 o c 255
 0018 C6 FF	      [2] 00027	         ldab   #$FF
 001A 96 06	      [3] 00028	         ldaa   A1
 					  00029	*  1 L r 2
 					  00030	*  0 := v A0 -> 1
 001C D7 05	      [4] 00031	         stab   A0+1
 001E 97 04	      [4] 00032	         staa   A0

and

 					  00016	**A0 = A1 + A2 xor 0FFh;
 					  00017
 					  00018	*   *  0 := v A0 -> 1
 					  00019	*   *  1 L r 4
 					  00020
 					  00021	*      *  4 L r 2 -> 5
 					  00022
 					  00023	*         *  2 L v A1 -> 3
 					  00024	*         *  3 + v A2
 					  00025
 					  00026	*      *  5 x c 255
 					  00027
 					  00028
 					  00029	*  2 L v A1 -> 3
 					  00030	*  3 + v A2
 0018 D6 07	      [3] 00031	         ldab   A1+1
 001A 96 06	      [3] 00032	         ldaa   A1
 001C DB 09	      [3] 00033	         addb   A2+1
 001E 99 08	      [3] 00034	         adca   A2
 					  00035	*  4 L r 2 -> 5
 					  00036	*  5 x c 255
 0020 53	      [2] 00037	         comb
 					  00038	*  1 L r 4
 					  00039	*  0 := v A0 -> 1
 0021 D7 05	      [4] 00040	         stab   A0+1
 0023 97 04	      [4] 00041	         staa   A0