Sunday, November 10, 2019

First Instructions


There's not much apparent difference between this and the last screenshot.  But it's an important one.  The cursor is blinking.  It's blinking under software control.  We have a working CPU!

I've now got a simple test program in ROM.  It copies the screen data from ROM (doing an ASCII to CBM conversion on the way), then sits in a loop turning the cursor on and off.  Here's the relevant part of the code:
0000e4ea:                         54 reset
0000e4ea: 01a2 03e7               55 ldx.w #999
0000e4ec: 20a9 000e               56 lda a1, #14
0000e4ee:                         57 copyScreen
0000e4ee: 00b4 e000               58 ldy initScreen,x
0000e4f0: 10a5 e3e8               59 lda asciiToCBM,y
0000e4f2: 0095 0400               60 sta $0400,x
0000e4f4: 2095 d800               61 sta a1, $d800,x
0000e4f6: 01ca                    62 dex.w
0000e4f7: 0010 00f5               63 bpl copyScreen
0000e4f9:                         64 loop
0000e4f9: 00a9 0020               65 lda #32
0000e4fb: 0085 04f0               66 sta cursor-initScreen+$400
0000e4fd: 02a2 0823 007a          67 ldx.l #555555
0000e500:                         68 delay1
0000e500: 02ca                    69 dex.l
0000e501: 00d0 00fd               70 bne delay1
0000e503: 00a9 00a0               71 lda #32+128
0000e505: 0085 04f0               72 sta cursor-initScreen+$400
0000e507: 02a2 0823 007a          73 ldx.l #555555
0000e50a:                         74 delay2
0000e50a: 02ca                    75 dex.l
0000e50b: 00d0 00fd               76 bne delay2
0000e50d: 80f0 00ea               77 bra loop
Because I'm not attempting full compatibility with the original 6502, instruction timing is a little different.  DEX is a single cycle.  Branches take two cycles, whether they're taken or not.  There's no penalty for crossing a page boundary.  Thus each iteration of the delay loop takes 3 cycles (1 for DEX plus 2 for BNE), and the loop count of 555,555 gives a delay of about a third of a second at the C640's 5 MHz.  With one delay for each cursor state, that works out to 3 blinks in 2 seconds.
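As a rough check of that arithmetic, here is a small Python sketch.  The cycle counts, loop count and clock rate come from the text above; the function names are made up for illustration, and the handful of setup instructions around each delay loop (lda, sta, ldx, bra) are assumed to be negligible.

```python
# Back-of-envelope check of the cursor blink rate.
# Cycle counts and clock rate are taken from the post; the setup
# instructions around each delay loop are ignored as negligible.

CLOCK_HZ = 5_000_000   # C640 clock
LOOP_COUNT = 555_555   # value loaded by ldx.l
DEX_CYCLES = 1         # dex.l
BRANCH_CYCLES = 2      # bne, taken or not


def delay_seconds(count=LOOP_COUNT, clock_hz=CLOCK_HZ):
    """Time spent in one delay loop (delay1 or delay2)."""
    cycles = count * (DEX_CYCLES + BRANCH_CYCLES)
    return cycles / clock_hz


def blink_period_seconds(count=LOOP_COUNT, clock_hz=CLOCK_HZ):
    """One full blink: one delay per cursor state, two states."""
    return 2 * delay_seconds(count, clock_hz)


if __name__ == "__main__":
    print(f"one delay:    {delay_seconds():.4f} s")
    print(f"three blinks: {3 * blink_period_seconds():.4f} s")
```

Changing LOOP_COUNT or CLOCK_HZ shows how the blink rate would scale with a different loop count or clock.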

The VHDL is still very brute-force and very ugly.  Only a handful of opcodes are supported (the ones needed for this very basic test program), and it's done through nested case and if statements, deciding what to do on each phase of each cycle for each individual opcode.

That's clearly not going to be a sustainable way of implementing the full CPU.  A few things stand out: opcodes 85 (sta zp,0) and 95 (sta zp,x) are really the same instruction, but with different index registers.  And b4 (ldy zp,x) only differs in destination register.  A lot of the work that I'm currently cut-and-pasting between instructions could and probably should be shared.
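One way to picture that sharing is a small software model in Python, where each opcode decodes into a few fields and a single routine handles all of them.  This is only a sketch of the idea, not the actual VHDL: the Decoded fields, the DECODE table and the execute function are hypothetical names, and memory is modelled as a plain dictionary with no address wrapping.

```python
from typing import NamedTuple, Optional


class Decoded(NamedTuple):
    operation: str          # "load" or "store"
    register: str           # "a", "x" or "y"
    index: Optional[str]    # index register name, or None for no index


# Hypothetical decode table covering only the opcodes named in the post.
DECODE = {
    0x85: Decoded("store", "a", None),
    0x95: Decoded("store", "a", "x"),
    0xB4: Decoded("load", "y", "x"),
}


def execute(opcode, operand, regs, memory):
    """Run one instruction through a single shared code path."""
    d = DECODE[opcode]
    # Address calculation is shared: only the index register differs.
    address = operand + (regs[d.index] if d.index else 0)
    if d.operation == "store":
        memory[address] = regs[d.register]
    else:
        regs[d.register] = memory.get(address, 0)
    return address


if __name__ == "__main__":
    regs = {"a": 0x20, "x": 3, "y": 0}
    memory = {0x0013: 7}
    execute(0x85, 0x0400, regs, memory)   # sta $0400
    execute(0x95, 0x0400, regs, memory)   # sta $0400,x
    execute(0xB4, 0x0010, regs, memory)   # ldy $0010,x
    print(memory, regs)
```

In this model, supporting another addressing variant means adding a row to the table rather than another branch of nested case statements.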

It would be good for development to put most of the complicated parts into a microcode ROM.  That way changes can be made, and new instructions implemented, by simply building a new ROM and inserting it into the bitstream, leaving the VHDL alone.  When the time comes to develop software for the ROM, it will be a relief to avoid the increasingly lengthy VHDL synthesis whenever possible.
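To illustrate the microcode idea, here is a minimal Python sketch in which the fixed interpreter stands in for the VHDL and a table stands in for the microcode ROM.  Everything in it is an assumption made for illustration: the micro-op names, the ROM layout and the state dictionary are invented, and a real microcode word would more likely be a set of control bits than a list of named steps.

```python
# Fixed "hardware": a small set of micro-operations.
def addr_from_operand(state, operand):
    state["addr"] = operand


def add_x(state, operand):
    state["addr"] += state["x"]


def write_a(state, operand):
    state["mem"][state["addr"]] = state["a"]


def read_a(state, operand):
    state["a"] = state["mem"].get(state["addr"], 0)


def read_y(state, operand):
    state["y"] = state["mem"].get(state["addr"], 0)


MICRO_OPS = {f.__name__: f for f in
             (addr_from_operand, add_x, write_a, read_a, read_y)}

# Hypothetical "microcode ROM": opcode -> sequence of micro-op names.
MICROCODE_ROM = {
    0x85: ("addr_from_operand", "write_a"),
    0x95: ("addr_from_operand", "add_x", "write_a"),
    0xB4: ("addr_from_operand", "add_x", "read_y"),
}


def run_instruction(opcode, operand, state, rom=MICROCODE_ROM):
    """Step through the micro-ops the ROM lists for this opcode."""
    for name in rom[opcode]:
        MICRO_OPS[name](state, operand)


def new_state():
    return {"a": 0, "x": 0, "y": 0, "addr": 0, "mem": {}}


if __name__ == "__main__":
    # "Building a new ROM": add lda zp,x (b5 on a stock 6502; assumed
    # here) from existing micro-ops, without touching the interpreter.
    rom = dict(MICROCODE_ROM)
    rom[0xB5] = ("addr_from_operand", "add_x", "read_a")
    state = new_state()
    state["x"] = 2
    state["mem"][0x12] = 0x41
    run_instruction(0xB5, 0x10, state, rom)
    print(hex(state["a"]))
```

The new instruction is added purely by editing the table, which is the property that would let a ROM rebuild replace a full synthesis run.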