[Project Log] Python on the 6502/C64, 8080, 6800, 6809 and AVR

Heap managers are done for the 6800 and 6809.

Still working on the one for the AVR.

The Z80 assembler is finally working. It still needs work to detect and flag invalid input. The Z80 simulator needs work to disassemble the “new” Z80 instructions.

A serious issue has come up with the AVR which makes it impossible to use the official assembler and requires some ugly hacks in mine.

A string is currently defined to look like this in memory:

+---------------+---------------+---------------+----------------------------
|               |                               |
|      $81      |       length of string        |  characters of the string
|               |                               |
+---------------+---------------+---------------+----------------------------

The problem is when I want to put constant strings into flash memory. The official AVR assembler insists on putting things on an even address. That makes sense for machine instructions, but is quite limiting for compact data structures.


There turns out to be a solution to the problem - put everything on a single line.

Instead of trying to define the object type and the string contents separately:

 			  00017	.macro	msg
 			  00018
 			  00019	.db	strlen(@0)&$FF	; the length
 			  00020	.db	strlen(@0)>>8
 			  00021	.db	@0
 			  00022
 			  00023	.endm

 :  
 :  
 :
 
 000486			  00295	S_True:
 000486 8100		  00296		.db	TYPE_STRING
 			  00297		msg	"True"
+
+000487 0400			.db	strlen("True")&$FF	; the length
+000488 0000			.db	strlen("True")>>8
+000489 54727565		.db	"True"

Create a macro to lay out a string object:

 			  00011	.macro	string
 			  00012
 			  00013	.db	TYPE_STRING,strlen(@0)&$FF,strlen(@0)>>8,@0
 			  00014
 			  00015	.endm

 :  
 :  
 :

 000482			  00292	S_None:
 			  00293		string	"None"
+
+000482 8104004E6F6E		.db	TYPE_STRING,strlen("None")&$FF,strlen("None")>>8,"None"
+000485 6500
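
The padding behaviour can be modelled in a few lines of Python. This is only a sketch: the `db` helper is a hypothetical stand-in for one `.db` directive, and it assumes (as the listings above suggest) that the assembler pads each directive separately to an even number of bytes with a zero byte.

```python
TYPE_STRING = 0x81  # type tag as it appears in the listings above


def db(*items):
    """Model one .db directive: concatenate the items, then pad to even length.

    Hypothetical helper: ints are single bytes, str/bytes are emitted as-is.
    """
    data = bytearray()
    for item in items:
        if isinstance(item, int):
            data.append(item & 0xFF)
        elif isinstance(item, str):
            data += item.encode("ascii")
        else:
            data += bytes(item)
    if len(data) % 2:
        data.append(0)  # assumed pad byte, matching the listings
    return bytes(data)


def string_separate(text):
    """Four separate directives: every odd-sized piece gets its own pad byte."""
    n = len(text)
    return db(TYPE_STRING) + db(n & 0xFF) + db(n >> 8) + db(text)


def string_single(text):
    """One directive: at most one pad byte, and only after the characters."""
    n = len(text)
    return db(TYPE_STRING, n & 0xFF, n >> 8, text)


print(string_separate("True").hex())  # 81000400000054727565
print(string_single("None").hex())    # 8104004e6f6e6500
```

With a single directive the three header bytes stay packed; the only remaining cost is one trailing pad byte when the whole object has an odd length.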

Currently giving the 6502 version some love in preparation for the next public demo release for the retrocomputing meeting on November 10.

Rewrote the parsing of if/else to handle if/elif/else.

Quite a bit has been added since the last release in July. The current feature list is:

  • Dynamic typing
  • Types: integer, string, True, False, function (partial)
  • Automatic memory management (using reference counting)
  • Variable precision integers
  • Integer operators +, -, *, //, %, &, ^, |
  • Variable length strings
  • String operators +, *
  • Relational operators ==, !=, <, <=, >, >=
  • if else elif
  • while else break continue
  • print function, including sep and end keyword arguments
  • input function
  • hardware access functions: peek, poke
  • randint function
  • int function
  • hex, oct and bin functions

Now working on right shift. Due to variable precision integers, this is not trivial.
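
To show where the difficulty comes from, here is a minimal Python sketch of an arithmetic right shift on a multi-byte integer. It assumes the digits are stored as little-endian two's complement bytes, which is only a guess at a plausible representation and not necessarily what the runtime uses; `shift_right`, `to_digits` and `from_digits` are hypothetical names.

```python
def shift_right(digits, count):
    """Arithmetic right shift of a non-empty little-endian two's complement value."""
    sign_fill = 0xFF if digits[-1] & 0x80 else 0x00
    whole_bytes, bits = divmod(count, 8)

    # Shifting by a multiple of 8 just drops low-order bytes.
    kept = list(digits[whole_bytes:]) or [sign_fill]

    # The remaining 1..7 bits have to be carried across byte boundaries.
    if bits:
        shifted = []
        for i, low in enumerate(kept):
            high = kept[i + 1] if i + 1 < len(kept) else sign_fill
            shifted.append(((low >> bits) | (high << (8 - bits))) & 0xFF)
        kept = shifted

    # Trim sign bytes that no longer carry information.
    while (len(kept) > 1 and kept[-1] == sign_fill
           and (kept[-2] & 0x80) == (sign_fill & 0x80)):
        kept.pop()
    return bytes(kept)


def to_digits(value):
    """Helper for the demo: encode a Python int in the assumed format."""
    return value.to_bytes(value.bit_length() // 8 + 1, "little", signed=True)


def from_digits(digits):
    """Helper for the demo: decode the assumed format back to a Python int."""
    return int.from_bytes(digits, "little", signed=True)


print(from_digits(shift_right(to_digits(0x12345), 12)))  # 18
print(from_digits(shift_right(to_digits(-255), 4)))      # -16
```

The result can change length, the sign has to be extended into the vacated bits, and negative values must round toward minus infinity to match Python's `>>`.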


A couple of other things worth a mention:

I have been reading this book:

https://www.brucesmith.info/raspberry-pi-assembly-language-raspbian/

I have remembered previous discussions about WebAssembly and think it deserves another look:


I regret that I have to raise the white flag for making another release by the retrocomputing meeting this weekend.

The work on right shift is taking longer than expected. There is no way I can get left shift working by Saturday. There is not even enough time for me to do the regression testing for another release.

Not much on this project has been easy.

Consider the following code:

""" locals.py
examples of local variables"""

A = 'This is the text message from main'

def test1():
    print('In test1.')
    print(A)
    print('Leaving test1.')

def test2():
    print('In test2.')
    A = "This is test2's text message."
    print(A)
    print('Leaving test2.')

def test3():
    print('In test3.')
    print(A)
    A = "This is test3's text message."
    print('Leaving test3.')

test1()
print('Back in main')
print(A)
test2()
print('Back in main')
print(A)
test3()

Running it yields:

In test1.
This is the text message from main
Leaving test1.
Back in main
This is the text message from main
In test2.
This is test2's text message.
Leaving test2.
Back in main
This is the text message from main
In test3.
Traceback (most recent call last):
  File "C:/Users/Bill/AppData/Local/Programs/Python/Python35-32/locals.py", line 29, in <module>
    test3()
  File "C:/Users/Bill/AppData/Local/Programs/Python/Python35-32/locals.py", line 19, in test3
    print(A)
UnboundLocalError: local variable 'A' referenced before assignment

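The error shows that assigning to a variable anywhere in test3 makes that name local throughout the function, so the global variable of the same name is inaccessible even before the assignment executes. The compiler cannot tell whether a name is local or global until it has seen the whole function body, which means that functions cannot be compiled in a single pass.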
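
The two-pass idea can be illustrated on the host with Python's own `ast` module. This is a sketch, not the compiler's design: `assigned_names` and `classify_loads` are hypothetical helpers, and they ignore `global` declarations and nested functions.

```python
import ast

SOURCE = """
A = 'This is the text message from main'

def test3():
    print(A)
    A = "This is test3's text message."
"""


def assigned_names(func):
    """First pass: collect every name the function binds (arguments, assignments)."""
    names = {arg.arg for arg in func.args.args}
    for node in ast.walk(func):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
    return names


def classify_loads(func):
    """Second pass: mark each name that is read as local or global.

    "global" here means "not local", so builtins such as print land there too.
    """
    local = assigned_names(func)
    return {
        node.id: "local" if node.id in local else "global"
        for node in ast.walk(func)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }


module = ast.parse(SOURCE)
test3 = next(n for n in module.body if isinstance(n, ast.FunctionDef))
print(classify_loads(test3))
```

The read of `A` on the first line of the body is classified as local only because the first pass has already seen the assignment on the following line.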

Notice anything?

Project scope changed from “toy Python compiler” to “Fierce Python compiler for all the cool kid micros?”

Or maybe the simpler: Another one of my projects that is almost 2 years old?


Do you mean compile Python code into FORTH or compile FORTH programs?

I have written FORTH interpreters for several processors and had thought about implementing a metacompiler.

Fifth-generation programming language paradigm looks interesting.

It is still a toy because it cannot handle any but the simplest programs.

  • No lists, tuples or dictionaries
  • No indexing or slicing
  • No iterators
  • No floating point
  • Cannot define functions
  • No exception handling
  • Cannot import libraries

Implementing functions will require restructuring the compiler to make two passes while parsing a function or creating a parse tree and using that to generate code. That will have to be done at a later date.

In the meantime, structured exception handling can be implemented with one simplification: exceptions will be numbers instead of classes. This means that custom exceptions cannot be defined, but exceptions from the runtime library can be intercepted.

This is the first of possibly several posts on the design and implementation of an exception handling subsystem for Python.

For starters, please read the Python error handling documentation:

https://docs.python.org/3/tutorial/errors.html

My pseudocode is:

try:
	push context
	point handler vector to before first except clause
	execute code suite
	once break is encountered:
		pop context
		call finally clause
		do the break
	once continue is encountered:
		pop context
		call finally clause
		do the continue
	once return is encountered:
		pop context
		call finally clause
		do the return
	once raise is encountered:
		pop context
		call finally clause
		do the raise
	pop context
	goto else clause

before first except:
	pop context
	fall through to first except clause

except <exception>:
	if <exception>:
		execute code suite
		once return is encountered:
			call finally clause
			do the return
		once raise is encountered:
			call finally clause
			do the raise
		goto finally clause
	fall through to next except clause

except <no exception specified>:
	execute code suite
	once return is encountered:
		call finally clause
		do the return
	once raise is encountered:
		call finally clause
		do the raise
	call finally clause
	goto end_try

after last except clause:
	call finally clause
	do a raise

else:
	execute code suite
	once return is encountered:
		call finally clause
		do the return
	once raise is encountered:
		call finally clause
		do the raise
	call finally clause
	goto end_try

finally:
	execute code suite
	return

end_try:

Did I miss anything?
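
As a reference point for checking the pseudocode, this short Python program records the order in which standard Python runs the clauses. The function names and the `trace` list are illustrative only.

```python
def run(fail):
    """Record which clauses execute with and without an exception."""
    trace = []
    try:
        trace.append("try")
        if fail:
            raise ZeroDivisionError
    except ZeroDivisionError:
        trace.append("except")
    else:
        trace.append("else")
    finally:
        trace.append("finally")
    return trace


def leave_loop_early():
    """A break out of the try clause still runs the finally clause."""
    trace = []
    for attempt in range(3):
        try:
            trace.append("try")
            break
        finally:
            trace.append("finally")
    return trace


print(run(False))          # ['try', 'else', 'finally']
print(run(True))           # ['try', 'except', 'finally']
print(leave_loop_early())  # ['try', 'finally']
```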

I just found and fixed a rather obscure bug in my 6809 cross assembler (and also in the single-line assembler in the emulator).

Some instructions of the 6809 processor have two bytes of opcode instead of one. The indexed mode instructions use a postbyte to indicate the specific addressing mode along with up to two additional bytes for an optional offset or address.

The problem is with the program counter relative versions of these instructions. The stored offset is the number of bytes from the first byte of the following instruction to the target address.

I have a routine to parse the operand and generate the postbyte and any additional bytes. For most addressing modes, it does not matter whether the base instruction is one or two bytes. But that is an important detail for the program counter relative form.

 0000 10A3 8D 0000 (0005)    [12] 00001	         cmpd   C34A,PCR
 								  00002
 0005							  00003	C34A     rmb    1
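
The offset calculation can be sketched in Python as follows. The function name and parameters are hypothetical and this is not the assembler's actual routine; the postbyte values $8C and $8D are the ones that appear in the listings in this thread.

```python
def pcr_operand(address, opcode, target, wide):
    """Return the postbyte and offset bytes for a program counter relative operand.

    address: address of the first opcode byte
    opcode:  the opcode bytes (one or two of them)
    target:  address the operand refers to
    wide:    True for a 16-bit offset, False for an 8-bit offset
    """
    offset_size = 2 if wide else 1
    # The offset is relative to the first byte of the NEXT instruction, so the
    # whole instruction counts: opcode bytes + postbyte + offset bytes.
    next_instruction = address + len(opcode) + 1 + offset_size
    offset = target - next_instruction

    limit = 0x8000 if wide else 0x80
    if not -limit <= offset < limit:
        raise ValueError("offset does not fit in the requested size")

    postbyte = 0x8D if wide else 0x8C
    encoded = (offset % (2 * limit)).to_bytes(offset_size, "big")
    return bytes([postbyte]) + encoded


# cmpd C34A,PCR at $0000 with C34A at $0005 (two-byte opcode $10A3)
print(pcr_operand(0x0000, b"\x10\xA3", 0x0005, wide=True).hex())  # 8d0000
```

Using `len(opcode)` instead of a fixed 1 is the detail that matters for the two-byte opcodes.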

To my surprise, this bug generated a lively discussion on the FLEX User Group e-mail list.

As my assembler does not allow forcing the size, I came up with source to try all of the possibilities:

 org 0

Before

 cmpa Before,PCR
 cmpa After,PCR
 cmpa Last,PCR

 cmpd Last,PCR
 cmpd Before,PCR
 cmpd After,PCR

After rmb 1

 org $200

Later
 cmpa Before,PCR
 cmpa Later,PCR
 cmpa Last,PCR

 cmpd Before,PCR
 cmpd Later,PCR
 cmpd Last,PCR

Last

 end

The TSC FLEX assembler does this:

    1   0000                       org    0
    2                      
    3   0000               Before
    4                      
    5   0000 A1   8C FD            cmpa   Before,PCR
    6   0003 A1   8D 0012          cmpa   After,PCR
    7   0007 A1   8D 020E          cmpa   Last,PCR
    8                      
    9   000B 10A3 8D 0209          cmpd   Last,PCR
   10   0010 10A3 8C EC            cmpd   Before,PCR
   11   0014 10A3 8D 0000          cmpd   After,PCR
   12                      
   13   0019               After   rmb    1
   14                      
   15   0200                       org    $200
   16                      
   17   0200               Later
   18   0200 A1   8D FDFC          cmpa   Before,PCR
   19   0204 A1   8C F9            cmpa   Later,PCR
   20   0207 A1   8D 000E          cmpa   Last,PCR
   21                      
   22   020B 10A3 8D FDF0          cmpd   Before,PCR
   23   0210 10A3 8C EC            cmpd   Later,PCR
   24   0214 10A3 8D 0000          cmpd   Last,PCR
   25                      
   26   0219               Last
   27                      

And my assembler does this:

 0000							  00001	         org    0
 								  00002
 0000							  00003	Before
 								  00004
 0000 A1 8C FD (0000)	      [5] 00005	         cmpa   Before,PCR
 0003 A1 8D 0012 (0019)	      [9] 00006	         cmpa   After,PCR
 0007 A1 8D 020E (0219)	      [9] 00007	         cmpa   Last,PCR
 								  00008
 000B 10A3 8D 0209 (0219)    [12] 00009	         cmpd   Last,PCR
 0010 10A3 8C EC (0000)	      [8] 00010	         cmpd   Before,PCR
 0014 10A3 8D 0000 (0019)    [12] 00011	         cmpd   After,PCR
 								  00012
 0019							  00013	After    rmb    1
 								  00014
 0200							  00015	         org    $200
 								  00016
 0200							  00017	Later
 0200 A1 8D FDFC (0000)	      [9] 00018	         cmpa   Before,PCR
 0204 A1 8C F9 (0200)	      [5] 00019	         cmpa   Later,PCR
 0207 A1 8D 000E (0219)	      [9] 00020	         cmpa   Last,PCR
 								  00021
 020B 10A3 8D FDF0 (0000)    [12] 00022	         cmpd   Before,PCR
 0210 10A3 8C EC (0200)	      [8] 00023	         cmpd   Later,PCR
 0214 10A3 8D 0000 (0219)    [12] 00024	         cmpd   Last,PCR
 								  00025
 0219							  00026	Last
 								  00027
 								  00028	         end

TSC and I agree.


After a couple of intense days of hacking, this code compiles and runs:

Choice = ' '
while Choice != '':
	try:
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice != '':
			print('I do not understand.')
	except:
		print('Caught ZeroDivisionError')

There is still a long way to go, but that was a major step.


A peek under the hood…

This is a skeleton program:

try:
	# main code suite
	pass
except:
	# except code suite
	pass
else:
	# else code suite
	pass
finally:
	# finally code suite
	pass

This is the raw assembly listing:

 						  00177	; 00001	try:
 023B 20 0694	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		# main code suite
 						  00184	; 00003		pass
 						  00185	; 00004	except:
 0246 20 06AE	      [6] 00186		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00187		jmp	L00001
 024C					  00188	L00000
 024C 20 06AE	      [6] 00189		jsr	PopExceptionContext
 						  00190	; 00005		# except code suite
 						  00191	; 00006		pass
 						  00192	; 00007	else:
 024F 20 025B	      [6] 00193		jsr	L00002
 0252 4C 025C	      [3] 00194		jmp	L00003
 0255					  00195	L00001
 						  00196	; 00008		# else code suite
 						  00197	; 00009		pass
 						  00198	; 00010	finally:
 0255 20 025B	      [6] 00199		jsr	L00002
 0258 4C 025C	      [3] 00200		jmp	L00003
 025B					  00201	L00002
 						  00202	; 00011		# finally code suite
 						  00203	; 00012		pass
 025B 60		      [6] 00204		rts
 025C					  00205	L00003

This is the listing rearranged slightly to put the lines of Python source as comments near the generated snippets:

 						  00177	; 00001	try:
 023B 20 0694	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		# main code suite
 						  00184	; 00003		pass
 0246 20 06AE	      [6] 00186		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00187		jmp	L00001
 						  00185	; 00004	except:
 024C					  00188	L00000
 024C 20 06AE	      [6] 00189		jsr	PopExceptionContext
 						  00190	; 00005		# except code suite
 						  00191	; 00006		pass
 024F 20 025B	      [6] 00193		jsr	L00002
 0252 4C 025C	      [3] 00194		jmp	L00003
 						  00192	; 00007	else:
 0255					  00195	L00001
 						  00196	; 00008		# else code suite
 						  00197	; 00009		pass
 0255 20 025B	      [6] 00199		jsr	L00002
 0258 4C 025C	      [3] 00200		jmp	L00003
 						  00198	; 00010	finally:
 025B					  00201	L00002
 						  00202	; 00011		# finally code suite
 						  00203	; 00012		pass
 025B 60		      [6] 00204		rts
 025C					  00205	L00003

The compiler is a single-pass recursive descent parser doing syntax-directed code generation. Things like an empty else or finally part ought to be optimized out, but that is difficult to do without multiple passes, building a parse tree and using that to generate code or implementing a post-compile code optimizer.

Last time, the code generated for try/except/else/finally was presented. Here it is again, with the Python source edited slightly to reduce the number of lines and with the Python source comments in the assembly listing moved next to the machine code they correspond to.

try:
	pass	# main code suite
except:
	pass	# except code suite
else:
	pass	# else code suite
finally:
	pass	# finally code suite
 						  00177	; 00001	try:
 023B 20 0691	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06AB	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0255	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06AB	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 20 025B	      [6] 00190		jsr	L00002
 0252 4C 025C	      [3] 00191		jmp	L00003
 						  00192	; 00005	else:
 0255					  00193	L00001
 						  00194	; 00006		pass	# else code suite
 0255 20 025B	      [6] 00195		jsr	L00002
 0258 4C 025C	      [3] 00196		jmp	L00003
 						  00197	; 00007	finally:
 025B					  00198	L00002
 						  00199	; 00008		pass	# finally code suite
 025B 60		      [6] 00200		rts
 025C					  00201	L00003

Moving the call of the finally code to the bottom saves one instance of the jsr instruction for each except clause.

 						  00177	; 00001	try:
 023B 20 068E	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06A8	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0252	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06A8	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0256	      [3] 00190		jmp	L00003
 						  00191	; 00005	else:
 0252					  00192	L00001
 						  00193	; 00006		pass	# else code suite
 0252 4C 0256	      [3] 00194		jmp	L00003
 						  00195	; 00007	finally:
 0255					  00196	L00002
 						  00197	; 00008		pass	# finally code suite
 0255 60		      [6] 00198		rts
 0256					  00199	L00003
 0256 20 0255	      [6] 00200		jsr	L00002

More importantly, this seemingly trivial transformation creates significant opportunities for a dumb single-pass compiler with but one lexical token of lookahead to eliminate unneeded elements.

To begin with, consider the case in which the else clause is not present:

 						  00177	; 00001	try:
 023B 20 06A9	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06C3	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0253	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06C3	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0253	      [3] 00190		jmp	L00001
 						  00191	; 00005	finally:
 0252					  00192	L00002
 						  00193	; 00006		pass	# finally code suite
 0252 60		      [6] 00194		rts
 0253					  00195	L00001
 0253 20 0252	      [6] 00196		jsr	L00002

And the case in which the finally clause is not present:

 						  00177	; 00001	try:
 023B 20 06A5	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06BF	      [6] 00184		jsr	PopExceptionContext
 0249 4C 0252	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06BF	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F 4C 0252	      [3] 00190		jmp	L00003
 						  00191	; 00005	else:
 0252					  00192	L00001
 						  00193	; 00006		pass	# else code suite
 0252					  00194	L00003

Note that an empty finally subroutine will still be generated, even without a finally clause, if the compiler detects in the try clause a return or raise statement, or a break or continue statement that transfers the flow of control out of the try clause.

Now consider the case in which else and finally are absent. This is one of the two minimal forms of structured exception handling allowed in Python and is perhaps the most common use.

 						  00177	; 00001	try:
 023B 20 06A2	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06BC	      [6] 00184		jsr	PopExceptionContext
 0249 4C 024F	      [3] 00185		jmp	L00001
 						  00186	; 00003	except:
 024C					  00187	L00000
 024C 20 06BC	      [6] 00188		jsr	PopExceptionContext
 						  00189	; 00004		pass	# except code suite
 024F					  00190	L00001

The other minimal form has only the try and finally clauses.

 						  00177	; 00001	try:
 023B 20 06B2	      [6] 00178		jsr	PushExceptionContext
 023E A9 4C		      [2] 00179		lda	#L00000&$FF
 0240 85 4F		      [3] 00180		sta	ExceptionHandler
 0242 A9 02		      [2] 00181		lda	#L00000>>8
 0244 85 50		      [3] 00182		sta	ExceptionHandler+1
 						  00183	; 00002		pass	# main code suite
 0246 20 06CC	      [6] 00184		jsr	PopExceptionContext
 0249 4C 025A	      [3] 00185		jmp	L00001
 024C					  00186	L00000
 024C 20 06CC	      [6] 00187		jsr	PopExceptionContext
 024F 20 0259	      [6] 00188		jsr	L00002
 0252 A6 52		      [3] 00189		ldx	Exception
 0254 A5 53		      [3] 00190		lda	Exception+1
 0256 4C 06AB	      [3] 00191		jmp	Raise
 						  00192	; 00003	finally:
 0259					  00193	L00002
 						  00194	; 00004		pass	# finally code suite
 0259 60		      [6] 00195		rts
 025A					  00196	L00001
 025A 20 0259	      [6] 00197		jsr	L00002

In this case, any exceptions are reraised after the finally clause has been executed.

Not too shabby for a dumb single-pass compiler…
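
The label and jump skeleton of the forms that have an except clause can be reproduced by a small Python sketch. It is a model of the listings above, not the compiler itself: `emit_try` is a hypothetical function, the two flags stand in for what the real compiler learns from its one token of lookahead, and the fixed label names are taken from the listings.

```python
def emit_try(has_else, has_finally):
    """Emit the skeleton for try/except with optional else and finally clauses."""
    handler, after_main, final_sub, end = "L00000", "L00001", "L00002", "L00003"
    out = [
        "jsr PushExceptionContext",
        f"lda #{handler}&$FF",
        "sta ExceptionHandler",
        f"lda #{handler}>>8",
        "sta ExceptionHandler+1",
        "; main code suite",
        "jsr PopExceptionContext",
        f"jmp {after_main}",
        handler,
        "jsr PopExceptionContext",
        "; except code suite",
    ]
    if has_else:
        # The except suite has to skip the else suite.
        out += [f"jmp {end}", after_main, "; else code suite"]
        join = end
    else:
        # Without else, the normal path and the except path meet at one label.
        join = after_main
    if has_finally:
        # Finally is a subroutine, called once from the common exit point.
        out += [f"jmp {join}", final_sub, "; finally code suite", "rts",
                join, f"jsr {final_sub}"]
    else:
        out.append(join)
    return out


print("\n".join(emit_try(has_else=False, has_finally=False)))
```

Each decision depends only on whether the next clause exists, which is why the rearranged layout suits a compiler that cannot look further ahead.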


The test code has been slightly restructured to test the compilation of break and continue out of the try clause.

Choice = ' '
while Choice != '':
	try:
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice == '':
			break
		print('I do not understand.')
		continue
	except: # ZeroDivisionError:
		print('Caught ZeroDivisionError')
	else:
		print('In else')
	finally:
		print('Finally!')

A break statement now compiles to this if it is within the try clause.

 			  00374	; 00014				break
 03C5 20 08B2	      [6] 00375		jsr	PopExceptionContext
 03C8 20 043D	      [6] 00376		jsr	L00006
 03CB 4C 0461	      [3] 00377		jmp	L00003

However, the break and continue statements in a loop fully contained within a try clause like this one do not invoke the finally clause. That was somewhat tricky to implement.

try:
	Choice = ' '
	while Choice != '':
		print()
		print('Your choices are:')
		print('  0 - divide by zero')
		print()
		Choice = input('So which is it? ')

		if Choice == '0':
			A = 0
			print(6//A)
		elif Choice == '':
			break
		print('I do not understand.')
		continue
except: # ZeroDivisionError:
	print('Caught ZeroDivisionError')
else:
	print('In else')
finally:
	print('Finally!')

All in all it’s just another brick in the wall.

From the When it Rains, it Pours Department,

try:
	pass	# main code suite
except:
	pass	# except code suite
else:
	pass	# else code suite
finally:
	pass	# finally code suite

The documentation does not say that break and continue are not allowed in the except, else and finally clauses.

Compiling these statements within an except or else clause need not (and should not) pop the exception stack. But they should still invoke the finally clause.

Compiling these statements within the finally clause should do neither.

Edit: One final detail. Because it was compiled as a subroutine, any flow of control out of the finally code suite other than returning at the end must remove the return address from the stack. Discovered that one the hard way…
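
The rules for break collected over the last few posts can be summarized in a Python sketch. It assumes the compiler keeps a stack of the constructs it is currently inside; the `nesting` list, the `emit_break` name and the use of two `pla` instructions to discard the return address are my assumptions for illustration, not the actual implementation.

```python
def emit_break(nesting):
    """Emit code for a break statement.

    nesting lists the enclosing constructs, outermost first, as (kind, label):
      ("loop", exit_label)       a while loop
      ("try", finally_label)     the try clause of a try statement
      ("except", finally_label)  an except clause
      ("else", finally_label)    the else clause of a try statement
      ("finally", None)          a finally clause (compiled as a subroutine)
    """
    out = []
    for kind, label in reversed(nesting):
        if kind == "loop":
            out.append(f"jmp {label}")
            return out
        if kind == "try":
            # Leaving a try clause: drop its context, then run its finally code.
            out += ["jsr PopExceptionContext", f"jsr {label}"]
        elif kind in ("except", "else"):
            # The context is already popped here; only finally is still owed.
            out.append(f"jsr {label}")
        elif kind == "finally":
            # Assumed: discard the subroutine's return address from the stack.
            out += ["pla", "pla"]
    raise SyntaxError("'break' outside loop")


# break inside a try clause that is inside the loop
print(emit_break([("loop", "L00003"), ("try", "L00006")]))
# break in a loop that is fully contained in the try clause
print(emit_break([("try", "L00006"), ("loop", "L00003")]))
```

Walking outward until the loop is reached is what keeps a loop fully contained in a try clause from invoking the finally clause. A continue statement would follow the same walk with a different jump target.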