There's not much apparent difference between this and the last screenshot. But it's an important one. The cursor is blinking. It's blinking under software control. We have a working CPU!
I've now got a simple test program in ROM. It copies the screen data from ROM (doing an ASCII to CBM conversion on the way), then sits in a loop turning the cursor on and off. Here's the relevant part of the code:
    0000e4ea:                  reset
    0000e4ea: 01a2 03e7            ldx.w #999
    0000e4ec: 20a9 000e            lda a1, #14
    0000e4ee:                  copyScreen
    0000e4ee: 00b4 e000            ldy initScreen,x
    0000e4f0: 10a5 e3e8            lda asciiToCBM,y
    0000e4f2: 0095 0400            sta $0400,x
    0000e4f4: 2095 d800            sta a1, $d800,x
    0000e4f6: 01ca                 dex.w
    0000e4f7: 0010 00f5            bpl copyScreen
    0000e4f9:                  loop
    0000e4f9: 00a9 0020            lda #32
    0000e4fb: 0085 04f0            sta cursor-initScreen+$400
    0000e4fd: 02a2 0823 007a       ldx.l #555555
    0000e500:                  delay1
    0000e500: 02ca                 dex.l
    0000e501: 00d0 00fd            bne delay1
    0000e503: 00a9 00a0            lda #32+128
    0000e505: 0085 04f0            sta cursor-initScreen+$400
    0000e507: 02a2 0823 007a       ldx.l #555555
    0000e50a:                  delay2
    0000e50a: 02ca                 dex.l
    0000e50b: 00d0 00fd            bne delay2
    0000e50d: 80f0 00ea            bra loop

Because I'm not attempting full compatibility with the original 6502, instruction timing is a little different. DEX is a single cycle. Branches take two cycles, whether they're taken or not. There's no penalty for crossing a page boundary. Thus the delay loop for blinking the cursor takes 3 cycles per iteration, and the loop count of 555,555 gives 3 blinks in 2 seconds at the C640's 5 MHz.
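The timing arithmetic above is easy to check. Here's a small Python sketch of it, using only the figures quoted in this post (1 cycle for DEX, 2 for the branch, a count of 555,555, a 5 MHz clock). The function and constant names are mine, and it ignores the handful of cycles spent on the lda/sta/ldx/bra around each delay, which are negligible next to the loop itself.

```python
CLOCK_HZ = 5_000_000      # clock speed quoted in the post
LOOP_COUNT = 555_555      # value loaded by ldx.l
DEX_CYCLES = 1            # dex is a single cycle on this CPU
BRANCH_CYCLES = 2         # branches take two cycles, taken or not


def delay_seconds(count=LOOP_COUNT, clock_hz=CLOCK_HZ):
    """Time spent in one dex/bne delay loop."""
    cycles = count * (DEX_CYCLES + BRANCH_CYCLES)
    return cycles / clock_hz


def blinks_per_second():
    """One blink is two delays: cursor off, then cursor on."""
    return 1 / (2 * delay_seconds())


if __name__ == "__main__":
    print(f"one delay:  {delay_seconds():.4f} s")
    print(f"blink rate: {blinks_per_second():.4f} per second")
```

Each delay comes out at about a third of a second, so an off/on pair takes about two thirds of a second, which is where the 3 blinks in 2 seconds comes from.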
The VHDL is still very brute-force and very ugly. Only a handful of opcodes are supported (the ones needed for this very basic test program), and it's done through nested case and if statements, deciding what to do on each phase of each cycle for each individual opcode.
That's clearly not going to be a sustainable way of implementing the full CPU. A few things stand out: opcodes 85 (sta zp,0) and 95 (sta zp,x) are really the same instruction, but with different index registers. And b4 (ldy zp,x) only differs in destination register. A lot of the work that I'm currently cut-and-pasting between instructions could and probably should be shared.
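To make the sharing idea concrete, here's a rough Python model of it (not VHDL, and not how the design currently works). Each opcode is broken into three fields, and a single routine handles all of them. The field names, the `DECODE` table and the `execute` function are all hypothetical, only the three opcodes mentioned above are covered, and the sketch assumes the index is simply added to the operand with no page wrapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoded:
    op: str      # "load" or "store"
    reg: str     # register being read or written: "a", "x" or "y"
    index: str   # index register added to the operand, "" for none


# Hypothetical decode table for the three opcodes discussed above.
DECODE = {
    0x85: Decoded("store", "a", ""),    # sta zp
    0x95: Decoded("store", "a", "x"),   # sta zp,x
    0xB4: Decoded("load", "y", "x"),    # ldy zp,x
}


def execute(opcode, operand, regs, memory):
    """Run one instruction through a single shared code path."""
    d = DECODE[opcode]
    # Address calculation is common to every entry in the table.
    address = operand + (regs[d.index] if d.index else 0)
    if d.op == "store":
        memory[address] = regs[d.reg]
    else:
        regs[d.reg] = memory[address]
    return address


if __name__ == "__main__":
    regs = {"a": 0x20, "x": 3, "y": 0}
    memory = {0xE003: 0x41}
    execute(0x95, 0x0400, regs, memory)   # sta $0400,x
    execute(0xB4, 0xE000, regs, memory)   # ldy $e000,x
    print(memory, regs)
```

Adding another instruction of the same shape then means adding a table entry rather than another branch in a nested case statement.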
It would be good for development to put most of the complicated parts into a microcode ROM. That way changes can be made, and new instructions implemented, by simply building a new ROM and inserting it into the bitstream, leaving the VHDL alone. When the time comes to develop software for the ROM, it will be a relief to avoid the increasingly lengthy VHDL synthesis whenever possible.
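As a sketch of what "building a new ROM" could look like, here's a Python script that packs a few control fields into one word per opcode and emits the table as hex lines. Everything in it is an assumption for illustration: the field layout, the encodings, the one-word-per-opcode simplification (a real microcode ROM would more likely hold several steps per instruction) and the output format, which would need to match whatever the FPGA tools expect for memory initialisation.

```python
# Hypothetical field encodings; the real control word would be much wider.
OP = {"nop": 0, "load": 1, "store": 2}
REG = {"a": 0, "x": 1, "y": 2}
INDEX = {"none": 0, "x": 1, "y": 2}


def pack(op, reg, index):
    """Pack the fields into one word: op in bits 5-4, reg in 3-2, index in 1-0."""
    return (OP[op] << 4) | (REG[reg] << 2) | INDEX[index]


def build_rom(entries):
    """Return a 256-entry ROM indexed by opcode; unlisted opcodes become nop."""
    rom = [pack("nop", "a", "none")] * 256
    for opcode, fields in entries.items():
        rom[opcode] = pack(*fields)
    return rom


def to_hex_lines(rom):
    """One two-digit hex word per line."""
    return [f"{word:02x}" for word in rom]


ENTRIES = {
    0x85: ("store", "a", "none"),   # sta zp
    0x95: ("store", "a", "x"),      # sta zp,x
    0xB4: ("load", "y", "x"),       # ldy zp,x
}

ROM = build_rom(ENTRIES)

if __name__ == "__main__":
    print("\n".join(to_hex_lines(ROM)))
```

Supporting a new instruction then comes down to one more line in `ENTRIES` and a rebuild of the ROM image, with the VHDL left untouched.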