Wednesday, June 27, 2018

Getting Commodore 64 BASIC to run

The plan was to use the source for the Commodore 64's ROM (https://github.com/mist64/cbmsrc) as a test for the assembler.  If I could get a 100% matching binary, then I could be reasonably confident that the assembler was working correctly, for 8-bit code at least.  I could then start optimising parts of it to get a feel for how well the extended instruction set works, which bits are useful, and what is missing.

This plan immediately ran into a problem, which in hindsight should have been obvious.

With the 65020, a byte contains 16 bits.  Every address used by the Commodore 64 ROM is at most 16 bits wide.  So an instruction that needs a two-byte operand in the original gets assembled with only a one-byte operand, and it doesn't take long for the binary to get out of sync.
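To make the drift concrete, here is a rough Python sketch (not the assembler's actual code).  It assumes a one-byte opcode followed by an operand rounded up to whole bytes; the three-instruction fragment and the helper names are made up for illustration.

```python
def operand_bytes(operand_bits: int, byte_bits: int) -> int:
    """Bytes needed to hold an operand of the given width, rounded up."""
    return -(-operand_bits // byte_bits)


def layout(program, byte_bits):
    """Starting offset of each instruction, assuming a one-byte opcode."""
    offsets, pc = [], 0
    for _mnemonic, operand_bits in program:
        offsets.append(pc)
        pc += 1 + operand_bytes(operand_bits, byte_bits)
    return offsets


# Hypothetical fragment: (mnemonic, operand width in bits).
PROGRAM = [("JSR $FFD2", 16), ("LDA #$00", 8), ("STA $D020", 16)]

print(layout(PROGRAM, 8))   # [0, 3, 5] with 8-bit bytes
print(layout(PROGRAM, 16))  # [0, 2, 4] with 16-bit bytes
```

Every absolute-mode instruction shifts everything after it by one location, so the two images disagree almost immediately.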

However, this approach was still good enough to find and fix a number of simple assembler bugs.  Feeding the result to the simulator gave this:

[Image: Commodore 64 start-up screen, "64K RAM SYSTEM  51199 BASIC BYTES FREE"]

Great!  But isn't it supposed to be 38911 BASIC BYTES FREE?  It turns out that BASIC starts with a memory test.  It checks every location until it finds something that isn't RAM, and assumes it can use everything below that point.  My simulator didn't distinguish ROM from RAM, so the test kept going until it hit I/O space at $D000.  Write-protecting the ROMs at $A000-$BFFF and $E000-$FFFF fixed this.
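The two numbers can be reproduced with a small Python model.  This is only a sketch under assumptions: the scan starts at $0801 (the usual start of BASIC program text) rather than wherever the real start-up test begins, I/O space is modelled as memory that simply ignores writes, and the `Memory` class and `first_non_ram` helper are invented names.

```python
BASIC_START = 0x0801          # usual start of BASIC program text
IO = (0xD000, 0xDFFF)         # modelled here as "writes are ignored"
BASIC_ROM = (0xA000, 0xBFFF)
KERNAL_ROM = (0xE000, 0xFFFF)


class Memory:
    def __init__(self, not_ram=()):
        self.cells = [0] * 0x10000
        self.not_ram = list(not_ram)  # inclusive (first, last) ranges

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        # Writes to a protected range are silently dropped.
        if any(first <= addr <= last for first, last in self.not_ram):
            return
        self.cells[addr] = value & 0xFF


def first_non_ram(mem, start):
    """Scan upward; return the first address that does not keep a written value."""
    for addr in range(start, 0x10000):
        old = mem.read(addr)
        probe = old ^ 0xFF
        mem.write(addr, probe)
        if mem.read(addr) != probe:
            return addr
        mem.write(addr, old)
    return 0x10000


unprotected = Memory(not_ram=[IO])
protected = Memory(not_ram=[BASIC_ROM, IO, KERNAL_ROM])

print(first_non_ram(unprotected, BASIC_START) - BASIC_START)  # 51199
print(first_non_ram(protected, BASIC_START) - BASIC_START)    # 38911
```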

Now we've got a fully working BASIC.  Except we don't.  It's not much use if you can't type programs in and run them.  So I added some I/O handler code to convert Windows keyboard scan codes into the Commodore 64 keyboard matrix, and return the appropriate values when $DC00 and $DC01 were accessed.
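A minimal sketch of that kind of handler, in Python rather than whatever the simulator is written in.  It assumes the usual arrangement where the ROM writes a column-select mask to $DC00 and reads the rows back from $DC01, both active low.  The two table entries (PC scan codes and matrix positions) are illustrative and should be checked against a real matrix chart; the class and method names are made up.

```python
# Host scan code -> (column, row) in the keyboard matrix.
# Illustrative entries only; verify against a C64 matrix chart.
SCAN_TO_MATRIX = {
    0x1C: (0, 1),  # Enter -> RETURN
    0x1E: (1, 2),  # A
}


class Keyboard:
    def __init__(self):
        self.pressed = set()       # (column, row) pairs currently held down
        self.column_select = 0xFF  # last value written to $DC00

    def key_event(self, scan_code, down):
        pos = SCAN_TO_MATRIX.get(scan_code)
        if pos is None:
            return                 # unmapped host key
        if down:
            self.pressed.add(pos)
        else:
            self.pressed.discard(pos)

    def write_dc00(self, value):
        self.column_select = value & 0xFF

    def read_dc01(self):
        rows = 0xFF                # all bits high = nothing pressed
        for col, row in self.pressed:
            if not (self.column_select >> col) & 1:  # column selected (bit low)
                rows &= ~(1 << row) & 0xFF
        return rows


kb = Keyboard()
kb.key_event(0x1E, True)      # host "A" goes down
kb.write_dc00(0xFD)           # select column 1
print(hex(kb.read_dc01()))    # 0xfb: row 2 pulled low
```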

Now we can run a real test:

[Image: Commodore 64 start-up screen with the program 10 PRINT"HELLO WORLD", followed by ?SYNTAX ERROR]

Entering anything would give a syntax error.  That's not how it's supposed to go.

The nice thing about trying things out in a software simulator rather than jumping straight to hardware is that you can create useful debugging tools.  The simulator already had an instruction trace feature, printing out every instruction that is executed, along with the contents of some of the registers (PC, SP, P, A0, X0, Y0).  This quickly revealed the problem.

BASIC uses a small routine called CHRGET, which is copied into zero-page memory.  Here's the important part:

INITAT  INC CHRGET+7        ; bump the low byte of the LDA operand
        BNE CHDGOT
        INC CHRGET+8        ; low byte wrapped, so bump the high byte
CHDGOT  LDA 60000           ; operand is patched at run time

It uses self-modifying code to increment the two bytes of the LDA's address operand.  The 60000 is just the initial value in the source code, used to force absolute addressing mode; it gets set to different values later.  But on the 65020, 60000 is a one-byte quantity, so that LDA gets assembled with zero-page addressing and the operand no longer sits in the two locations that the INC instructions modify.
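Both halves of the problem fit in a few lines of Python.  This is a sketch, not the assembler's real logic: it assumes the assembler picks zero-page addressing whenever the operand fits in a single byte, and the function names are invented.

```python
def addressing_mode(operand: int, byte_bits: int) -> str:
    """Assumed rule: use zero page whenever the operand fits in one byte."""
    return "zero page" if operand < (1 << byte_bits) else "absolute"


def increment_pointer(lo: int, hi: int):
    """What INC / BNE / INC does to a 16-bit pointer held in two 8-bit bytes."""
    lo = (lo + 1) & 0xFF
    if lo == 0:                  # low byte wrapped, carry into the high byte
        hi = (hi + 1) & 0xFF
    return lo, hi


print(addressing_mode(60000, 8))      # absolute   (6502: needs two bytes)
print(addressing_mode(60000, 16))     # zero page  (65020: fits in one byte)
print(increment_pointer(0xFF, 0x08))  # (0, 9)
```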

Another easy fix, if a little hacky:

CHDGOT  .BYTE $AD, $00, $00 ; $AD = LDA absolute, followed by a two-byte operand

Now I can type in programs.  But it's annoying to type them in every time.  I need to be able to save programs and load them back in.  Again, this is the advantage of a software simulator.  I don't have to emulate Commodore's serial bus and floppy drive, or the tape drive, or anything like that.  I can simply replace the LOAD and SAVE KERNAL routines with STA LOADTRIGGER and STA SAVETRIGGER, writing to unused I/O locations.  The I/O handler traps these, reads register values from the CPU, and then loads a chunk of memory from, or saves it to, a regular file on my PC.
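Here is roughly what such a trap could look like, sketched in Python.  Everything specific is an assumption: the trigger addresses are hypothetical, the register usage is only loosely modelled on the stock KERNAL (A pointing at a zero-page start pointer and X/Y holding the end address for SAVE; X/Y holding the load address for LOAD), and the file name is passed in directly instead of coming from the ROM.

```python
from dataclasses import dataclass
from pathlib import Path

LOAD_TRIGGER = 0xDF00  # hypothetical unused I/O locations
SAVE_TRIGGER = 0xDF01


@dataclass
class Cpu:
    a: int = 0
    x: int = 0
    y: int = 0


def io_write(addr, cpu, mem, path):
    """Handle a store to I/O space.  Returns True if it was one of the traps."""
    if addr == SAVE_TRIGGER:
        start = mem[cpu.a] | (mem[cpu.a + 1] << 8)  # pointer in zero page
        end = cpu.x | (cpu.y << 8)                  # one past the last byte
        Path(path).write_bytes(bytes(mem[start:end]))
        return True
    if addr == LOAD_TRIGGER:
        start = cpu.x | (cpu.y << 8)
        data = Path(path).read_bytes()
        mem[start:start + len(data)] = data
        end = start + len(data)
        cpu.x, cpu.y = end & 0xFF, end >> 8         # hand the end address back
        return True
    return False
```

Here `mem` is expected to be a `bytearray` covering the whole address space, so the slice assignment in the LOAD branch copies the file straight into simulated memory.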

Now, some more tests.  PRINT 1+1 says 2.  PRINT 2*2 says 4.  PRINT 1/10 says 5.95173333E-09

What's going on there?  The floating point divide routine builds the result bit-by-bit.  It starts with A set to 1, and shifts in bits of the result one at a time.  When the 1 bit gets shifted out, the partial result is written to a temporary buffer.  Here's the code that does this:

        LDX #253-ADDPRC     ; X = -4 as an 8-bit value
        LDA #1              ; marker bit: shifted out once 8 result bits are in
DIVIDE  ...                 ; do the compare
SAVQUO  PHP
        ROL A               ; shift the next result bit in
        BCC QSHFT           ; marker not shifted out yet
        INX
        STA RESLO,X         ; store the completed byte
        BEQ LD100           ; X reached 0, so the buffer is full

RESLO is the last byte of the buffer.  X is initialised to -4, so the first byte is written to the start of the buffer.  X is incremented each time a byte is written, and when it reaches 0 the loop ends.  This relies on RESLO being in page 0, and on zero-page indexed addressing wrapping if RESLO+X is greater than 255.  That's the main incompatibility between the 6502 and 65020.  On the 65020, indexing never wraps.  The solution again is a small change to the source:

        LDX.L #$FFFFFFFD-ADDPRC
        ...
        INX.L

65020 indexing always uses all 32 bits of the register, so we must load X0 with a 32-bit version of -4, and do a 32-bit increment.
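The difference is easy to see by computing the effective addresses.  The Python below is a sketch under assumptions: RESLO's address is illustrative, the 65020 is assumed to add base and index as plain 32-bit values (wrapping only at 2**32), and the unpatched case assumes the upper bits of X0 are zero.

```python
RESLO = 0x29          # illustrative zero-page address of the last buffer byte
MASK32 = 0xFFFFFFFF


def ea_6502(base, x):
    """6502 zero-page,X: the sum wraps within page 0."""
    return (base + x) & 0xFF


def ea_65020(base, x):
    """Assumed 65020 behaviour: full 32-bit add, no page-0 wrap."""
    return (base + x) & MASK32


def stores(ea, x, mask):
    """Addresses written by the INX / STA RESLO,X / BEQ loop until X reaches 0."""
    hit = []
    while True:
        x = (x + 1) & mask
        hit.append(ea(RESLO, x))
        if x == 0:
            return hit


# 6502: X starts at $FC and the writes land on RESLO-3 .. RESLO.
print([hex(a) for a in stores(ea_6502, 0xFC, 0xFF)])
# ['0x26', '0x27', '0x28', '0x29']

# Unpatched on the 65020: the first store misses the buffer entirely.
print(hex(ea_65020(RESLO, 0xFD)))
# 0x126

# Patched: X0 starts at $FFFFFFFC and is incremented as a 32-bit value.
print([hex(a) for a in stores(ea_65020, 0xFFFFFFFC, MASK32)])
# ['0x26', '0x27', '0x28', '0x29']
```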

The simulator isn't meant to be a full Commodore 64 emulator, but I couldn't resist adding support for bitmap mode.  Typing in the example from the Programmer's Reference Guide gives me this:

[Image: A "high resolution" sine curve in black, on a cyan background]

That's a good place to stop for now.  I have a collection of Commodore's public domain software as a set of .d64 files.  I plan to extract the individual programs and use them as further tests.  I want to play ARTILLERY again!  After that, I'll get back to the plan, and start optimising parts of the ROM using the 65020's extended features.  And then, finally, the FPGA.
