[Project Log] Python on the 6502/C64, 8080, 6800, 6809 and AVR

After the extensive work on exception handling, I indulged in a couple of rabbit holes.

The first is writing a “small” emulator to enable running programs written for the SWTPC 6800 on a system based on the 6809. This is possible because a “full boat” 6809 system contains 56K of RAM while the maximum is 40K for the 6800.

That gives me a whopping 16K of memory for the emulator. I map the I/O devices from $80xx on the 6800 to $E0xx on the 6809, and thunk calls to the FLEX operating system from $ADxx to $CDxx.

At this point, I can run the assembler and the BASIC and Extended BASIC interpreters on simple programs. Performance is about 1/10 of native speeds as expected.

The second is a bit more ambitious and has direct application to much of what I have been doing in this thread, writing compilers.

In the early days, I was obsessed with a programming language called PL/M. It was modeled after PL/I, a programming language from the IBM mainframe, but was intended to be used to program microprocessors. It was somewhat legendary because Gary Kildall reportedly used it to write his CP/M operating system.

I never got to program in PL/M. As far as I know, it was never available to actually run on CP/M, but it had to be used either on an expensive development system from Intel or some mini and mainframe computers.

PL/M is a fairly simple language. It has only two data types: byte and address. Address was the term for two-byte quantities. It was actually more of a high-level assembler than C, its minicomputer competitor. Arithmetic is always treated as unsigned. Adding or subtracting byte values result in a byte value. Mixing a byte and an address promotes the byte by zero extension; storing an address into a byte lops off the upper byte.

A great summary is here if you want to see more: https://www.autometer.de/unix4fun/z80pack/doc_cpm_plm.html

So with the technology I already have, I put together a compiler for a subset of PL/M. I targeted the 6800 processor because it is something I know well.

P$Test: do;

    declare (A0, A1, A2) address, (B0, B1, B2) byte;

    B0 = 1;
    B1 = 2 + B0;
    B1 = B1 + 1;

    B2 = B0 + B1;

    A0 = B0;

    B2 = 3 + A0;

    A0 = B0 + 1;

    A0 = B0 + 256;

    A0 = B0 + 257;

    B2 = A0;

    A1 = A0 + B1;

    A2 = B2 + A1;

    A0 = 1;
    A1 = 2 + A1;

    A2 = A0 + A1;

    A2 = A0 + 0;

    A2 = A0 + 1;

    A2 = A0 + 256;

    A2 = A0 + 257;

    A2 = 256 + B0;

    A2 = 257 + B0;

    A2 = 1 + 2;

    goto 0AD03h;

end P$Test;

compiles to

 					  00001	* P$Test: do;
 					  00002
 					  00003	         lib    ptest.dat
.					  00004
.0000				  00005	_PTEST.A0 rmb   2
.					  00006
.0002				  00007	_PTEST.A1 rmb   2
.					  00008
.0004				  00009	_PTEST.A2 rmb   2
.					  00010
.0006				  00011	_PTEST.B0 rmb   1
.					  00012
.0007				  00013	_PTEST.B1 rmb   1
.					  00014
.0008				  00015	_PTEST.B2 rmb   1
 					  00016
 0009				  00017	_PTEST
 					  00018	* 
 					  00019	*     declare (A0, A1, A2) address, (B0, B1, B2) byte;
 					  00020	* 
 					  00021	*     B0 = 1;
 0009 C6 01	      [2] 00022	         ldab   #1
 000B D7 06	      [4] 00023	         stab   _PTEST.B0
 					  00024	*     B1 = 2 + B0;
 000D D6 06	      [3] 00025	         ldab   _PTEST.B0
 000F CB 02	      [2] 00026	         addb   #2
 0011 D7 07	      [4] 00027	         stab   _PTEST.B1
 					  00028	*     B1 = B1 + 1;
 0013 D6 07	      [3] 00029	         ldab   _PTEST.B1
 0015 5C	      [2] 00030	         incb
 0016 D7 07	      [4] 00031	         stab   _PTEST.B1
 					  00032	* 
 					  00033	*     B2 = B0 + B1;
 0018 D6 06	      [3] 00034	         ldab   _PTEST.B0
 001A DB 07	      [3] 00035	         addb   _PTEST.B1
 001C D7 08	      [4] 00036	         stab   _PTEST.B2
 					  00037	* 
 					  00038	*     A0 = B0;
 001E D6 06	      [3] 00039	         ldab   _PTEST.B0
 0020 4F	      [2] 00040	         clra
 0021 97 00	      [4] 00041	         staa   _PTEST.A0
 0023 D7 01	      [4] 00042	         stab   _PTEST.A0+1
 					  00043	* 
 					  00044	*     B2 = 3 + A0;
 0025 D6 01	      [3] 00045	         ldab   _PTEST.A0+1
 0027 96 00	      [3] 00046	         ldaa   _PTEST.A0
 0029 CB 03	      [2] 00047	         addb   #3
 002B 89 00	      [2] 00048	         adca   #0
 002D D7 08	      [4] 00049	         stab   _PTEST.B2
 					  00050	* 
 					  00051	*     A0 = B0 + 1;
 002F D6 06	      [3] 00052	         ldab   _PTEST.B0
 0031 5C	      [2] 00053	         incb
 0032 4F	      [2] 00054	         clra
 0033 97 00	      [4] 00055	         staa   _PTEST.A0
 0035 D7 01	      [4] 00056	         stab   _PTEST.A0+1
 					  00057	* 
 					  00058	*     A0 = B0 + 256;
 0037 D6 06	      [3] 00059	         ldab   _PTEST.B0
 0039 86 01	      [2] 00060	         ldaa   #1
 003B 97 00	      [4] 00061	         staa   _PTEST.A0
 003D D7 01	      [4] 00062	         stab   _PTEST.A0+1
 					  00063	* 
 					  00064	*     A0 = B0 + 257;
 003F D6 06	      [3] 00065	         ldab   _PTEST.B0
 0041 4F	      [2] 00066	         clra
 0042 CB 01	      [2] 00067	         addb   #1
 0044 89 01	      [2] 00068	         adca   #1
 0046 97 00	      [4] 00069	         staa   _PTEST.A0
 0048 D7 01	      [4] 00070	         stab   _PTEST.A0+1
 					  00071	* 
 					  00072	*     B2 = A0;
 004A D6 01	      [3] 00073	         ldab   _PTEST.A0+1
 004C D7 08	      [4] 00074	         stab   _PTEST.B2
 					  00075	* 
 					  00076	*     A1 = A0 + B1;
 004E D6 01	      [3] 00077	         ldab   _PTEST.A0+1
 0050 96 00	      [3] 00078	         ldaa   _PTEST.A0
 0052 DB 07	      [3] 00079	         addb   _PTEST.B1
 0054 89 00	      [2] 00080	         adca   #0
 0056 97 02	      [4] 00081	         staa   _PTEST.A1
 0058 D7 03	      [4] 00082	         stab   _PTEST.A1+1
 					  00083	* 
 					  00084	*     A2 = B2 + A1;
 005A D6 08	      [3] 00085	         ldab   _PTEST.B2
 005C 4F	      [2] 00086	         clra
 005D DB 03	      [3] 00087	         addb   _PTEST.A1+1
 005F 99 02	      [3] 00088	         adca   _PTEST.A1
 0061 97 04	      [4] 00089	         staa   _PTEST.A2
 0063 D7 05	      [4] 00090	         stab   _PTEST.A2+1
 					  00091	* 
 					  00092	*     A0 = 1;
 0065 CE 0001     [3] 00093	         ldx    #1
 0068 DF 00	      [5] 00094	         stx    _PTEST.A0
 					  00095	*     A1 = 2 + A1;
 006A D6 03	      [3] 00096	         ldab   _PTEST.A1+1
 006C 96 02	      [3] 00097	         ldaa   _PTEST.A1
 006E CB 02	      [2] 00098	         addb   #2
 0070 89 00	      [2] 00099	         adca   #0
 0072 97 02	      [4] 00100	         staa   _PTEST.A1
 0074 D7 03	      [4] 00101	         stab   _PTEST.A1+1
 					  00102	* 
 					  00103	*     A2 = A0 + A1;
 0076 D6 01	      [3] 00104	         ldab   _PTEST.A0+1
 0078 96 00	      [3] 00105	         ldaa   _PTEST.A0
 007A DB 03	      [3] 00106	         addb   _PTEST.A1+1
 007C 99 02	      [3] 00107	         adca   _PTEST.A1
 007E 97 04	      [4] 00108	         staa   _PTEST.A2
 0080 D7 05	      [4] 00109	         stab   _PTEST.A2+1
 					  00110	* 
 					  00111	*     A2 = A0 + 0;
 0082 DE 00	      [4] 00112	         ldx    _PTEST.A0
 0084 DF 04	      [5] 00113	         stx    _PTEST.A2
 					  00114	* 
 					  00115	*     A2 = A0 + 1;
 0086 D6 01	      [3] 00116	         ldab   _PTEST.A0+1
 0088 96 00	      [3] 00117	         ldaa   _PTEST.A0
 008A CB 01	      [2] 00118	         addb   #1
 008C 89 00	      [2] 00119	         adca   #0
 008E 97 04	      [4] 00120	         staa   _PTEST.A2
 0090 D7 05	      [4] 00121	         stab   _PTEST.A2+1
 					  00122	* 
 					  00123	*     A2 = A0 + 256;
 0092 D6 01	      [3] 00124	         ldab   _PTEST.A0+1
 0094 96 00	      [3] 00125	         ldaa   _PTEST.A0
 0096 4C	      [2] 00126	         inca
 0097 97 04	      [4] 00127	         staa   _PTEST.A2
 0099 D7 05	      [4] 00128	         stab   _PTEST.A2+1
 					  00129	* 
 					  00130	*     A2 = A0 + 257;
 009B D6 01	      [3] 00131	         ldab   _PTEST.A0+1
 009D 96 00	      [3] 00132	         ldaa   _PTEST.A0
 009F CB 01	      [2] 00133	         addb   #1
 00A1 89 01	      [2] 00134	         adca   #1
 00A3 97 04	      [4] 00135	         staa   _PTEST.A2
 00A5 D7 05	      [4] 00136	         stab   _PTEST.A2+1
 					  00137	* 
 					  00138	*     A2 = 256 + B0;
 00A7 D6 06	      [3] 00139	         ldab   _PTEST.B0
 00A9 86 01	      [2] 00140	         ldaa   #1
 00AB 97 04	      [4] 00141	         staa   _PTEST.A2
 00AD D7 05	      [4] 00142	         stab   _PTEST.A2+1
 					  00143	* 
 					  00144	*     A2 = 257 + B0;
 00AF D6 06	      [3] 00145	         ldab   _PTEST.B0
 00B1 4F	      [2] 00146	         clra
 00B2 CB 01	      [2] 00147	         addb   #1
 00B4 89 01	      [2] 00148	         adca   #1
 00B6 97 04	      [4] 00149	         staa   _PTEST.A2
 00B8 D7 05	      [4] 00150	         stab   _PTEST.A2+1
 					  00151	* 
 					  00152	*     A2 = 1 + 2;
 00BA CE 0003     [3] 00153	         ldx    #3
 00BD DF 04	      [5] 00154	         stx    _PTEST.A2
 					  00155	* 
 					  00156	*     goto 0AD03h;
 					  00157
 00BF 7E AD03     [3] 00158	         jmp    $AD03
 					  00159	* 
 					  00160	* end P$Test;

But wait, there’s more.

PL/M was known for being a fairly good optimizing compiler for its time. My simplistic approach of generating code as I parse will not do.

Something I had always intended to do was to build a tree while parsing to represent the program and use that to generate object code. Separating parsing and code generation results in two moderately complicated pieces instead of a single monstrously complex beast. More importantly, it opens the door to insert an optimizer in between.

The PL/M compiler is simple enough to be a good test platform for building this.

What I have right now is a single-line compiler to convert an assignment statement into a tree, then generate assembly language from that.

 					  00012	**A0 = (B1 + 1) + A1;
 					  00013
 					  00014	*   *  0 := v A0 -> 1
 					  00015	*   *  1 L n 4
 					  00016
 					  00017	*      *  4 L n 2 -> 5
 					  00018
 					  00019	*         *  2 L v B1 -> 3
 					  00020	*         *  3 + c 1
 					  00021
 					  00022	*      *  5 + v A1
 					  00023
 					  00024
 					  00025	*  2 L v B1 -> 3
 					  00026	*  3 + c 1
 0010 D6 0D	      [3] 00027	         ldab   B1
 0012 5C	      [2] 00028	         incb
 					  00029	*  4 L n 2 -> 5
 					  00030	*  5 + v A1
 0013 4F	      [2] 00031	         clra
 0014 DB 07	      [3] 00032	         addb   A1+1
 0016 99 06	      [3] 00033	         adca   A1
 					  00034	*  1 L n 4
 					  00035	*  0 := v A0 -> 1
 0018 D7 05	      [4] 00036	         stab   A0+1
 001A 97 04	      [4] 00037	         staa   A0

The generated comment lines show the original line of source code, the parse tree and the applicable nodes as code is generated.

Now the fun begins: playing with optimization of the tree.

How applicable this will be to compiling Python remains to be seen, but this is something I had been wanting to do for quite a long time.

3 Likes