Saturday, July 27, 2019

It's Back!

It's been a while.

The work last year had taken the design as far as it needed to go, and the simulator had done its job in proving that it all fitted together.  It was time to turn to hardware.  My chosen platform is the Papilio Duo from Gadget Factory - a Spartan 6 FPGA coupled with an Arduino and a 2MB static RAM.  The large SRAM was attractive (it's much easier to interface than the usual DDR), and it has a "Classic Computing Shield" add-on with all the necessary ports to turn it into a classic 1980s computer.  I'm ignoring the Arduino side of it.

With great enthusiasm I got started.  And that's when I hit a very solid brick wall.

It didn't take much work to get a basic VGA display:

There's a little more going on in this photo than it might appear, but also a lot less.  The FPGA contains three main components: a clock generator, a memory controller (including character ROM), and a video generator.

The clock generator multiplies the Papilio Duo's 32MHz oscillator up to 80MHz (I have since increased this to 160MHz, for reasons that will be explained below).  Why such a high frequency?  I need a 40MHz pixel clock for 800x600 VGA (which becomes 640x400 with a border), so it makes sense to start with a multiple of that.  The pixel clock becomes 20MHz in 320 mode, and with 8 pixels per system clock (like the Commodore 64), that implies a 5MHz system clock.

In a real, and by this point rather impractical, mid-1980s computer, we would have the CPU and VIC both accessing memory every cycle.  VIC has two data busses, so each 5MHz cycle needs to support three independent 16 bit accesses.  Since the SRAM on the Papilio Duo is only 8 bits wide, that means six memory accesses per cycle, which I've rounded up to 8.  The extra slot might get used for some kind of DMA in the future.

My original plan for accessing the SRAM needed two clock cycles per access (a write needs /WE to go low and then high again, so it can't be done in one), so that works out to 80MHz.
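The arithmetic behind that 80MHz figure can be written out as a short Python sketch.  The function and parameter names are mine, made up for illustration; the numbers are the ones given above.

```python
def required_fpga_clock_mhz(system_clock_mhz=5, wide_accesses=3,
                            access_bits=16, sram_bits=8,
                            slots_per_cycle=8, clocks_per_access=2):
    """Return the FPGA clock (MHz) needed to fit every SRAM slot into one system cycle."""
    # Three 16 bit accesses on an 8 bit SRAM become six byte accesses.
    byte_accesses = wide_accesses * (access_bits // sram_bits)
    if byte_accesses > slots_per_cycle:
        raise ValueError("not enough memory slots per system cycle")
    return system_clock_mhz * slots_per_cycle * clocks_per_access


print(required_fpga_clock_mhz())  # 5 * 8 * 2 = 80
```

Calling it with clocks_per_access=4 gives 160, which is the figure I ended up with later.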

But this is still pretending to be an old computer, and old computers didn't do anything at 80MHz.  I didn't want to attempt a design with multiple clock domains on my first serious FPGA project, so I'm using a single clock which can be gated by the different modules.  These clock enable signals are also generated by the clock generator.  There are two for the CPU, reflecting the two phases of the clock, and two for VIC.
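As a rough illustration of the clock enable idea, here is a small Python model rather than real HDL.  A counter runs on the fast clock and each enable is high for exactly one fast tick per system cycle.  The divider of 16 follows from 80MHz / 5MHz; the offsets of the two phases within the cycle are an assumption for the sake of the example, not the values used in the actual design.

```python
from itertools import islice


def clock_enables(divider=16, offsets=(0, 8)):
    """Model a free-running counter on the fast clock.

    Yields one tuple of booleans per fast-clock tick; each flag is high
    for exactly one tick in every `divider` ticks.
    """
    count = 0
    while True:
        yield tuple(count == offset for offset in offsets)
        count = (count + 1) % divider


# Two hypothetical CPU phase enables, half a system cycle apart.
ticks = list(islice(clock_enables(), 32))
phase1_ticks = [i for i, (p1, p2) in enumerate(ticks) if p1]
phase2_ticks = [i for i, (p1, p2) in enumerate(ticks) if p2]
print(phase1_ticks, phase2_ticks)  # [0, 16] [8, 24]
```

Every module still sees the same fast clock; it just ignores the ticks where its enable is low.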

The next module is the memory controller.  This takes memory access requests from the CPU and VIC, and translates them into the right signals for the SRAM.  It also contains a character bitmap ROM stored in an FPGA BRAM.

Finally, there is the video generator, VIC.  At the moment it is a very simple design, just generating VGA timing signals and reading a bitmap from memory.  For this screenshot, it's configured to read from character ROM.

The next obvious step is to get SRAM working.  My plan was to build a very simple fake CPU that simply copied character ROM into RAM, then get VIC to read from RAM instead of ROM.  That's when it all fell apart.  It didn't work, and nothing I tried could change that.  Motivation drained away, I moved on to other things, and the project was stalled.


But then... I recently bought a Digilent Digital Discovery.  It does a number of things, but for me the most important function is the 32 channel logic analyser.  Being able to see the real signals on the real hardware should make debugging this thing possible.

And it did!  After a little bit of work, I discovered a number of problems.  First, a bit of re-jigging in VIC had resulted in it always displaying a blank screen no matter what data it was getting from memory.  Oops.  I had also been rather optimistic in the way I was writing to SRAM.

The original design presented address and data, then pulled /WE low for a cycle, then returned it high.  The datasheet suggested that this might work, as the relevant setup and hold times were all zero.  But changing outputs on an FPGA and receiving those as inputs on the SRAM are different things.  I couldn't guarantee that the address or data weren't changing a little bit later than /WE, and the tracks on the PCB were definitely not all the same length.

That prompted the clock doubling.  The FPGA is now running at 160MHz, giving me four clocks for each memory access.  That lets me stagger the signals in a way that has a better chance of fitting the timing.
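To put some numbers on that, here is a Python sketch of one possible four-clock write.  The ordering of the steps is an assumption made for illustration, not the actual sequence in my design, and the margins it prints are nominal: they ignore FPGA output skew and PCB track delays, which are exactly the things that eat into them.

```python
CLOCK_MHZ = 160
CLOCK_PERIOD_NS = 1000 / CLOCK_MHZ  # 6.25 ns per FPGA clock

# /WE level on each FPGA clock of one write access.  Address and data are
# assumed to be held steady for all four clocks (hypothetical ordering).
WE_SEQUENCE = [
    1,  # clock 0: present address and data, /WE still high
    0,  # clock 1: pull /WE low once address and data have settled
    0,  # clock 2: keep /WE low
    1,  # clock 3: raise /WE while address and data are still held
]


def margins_ns(levels, period_ns=CLOCK_PERIOD_NS):
    """Return nominal (setup, pulse width, hold) in ns for a single /WE pulse."""
    first_low = levels.index(0)
    last_low = len(levels) - 1 - levels[::-1].index(0)
    setup = first_low * period_ns
    pulse = (last_low - first_low + 1) * period_ns
    hold = (len(levels) - 1 - last_low) * period_ns
    return setup, pulse, hold


print(margins_ns(WE_SEQUENCE))  # (6.25, 12.5, 6.25)
```

With only two clocks per access at 80MHz there is no room for a guard clock on either side of the /WE pulse, which is the situation the original design was in.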

And it almost does.  Here's what I get now:

Can you spot the difference?  Those pixels in the bottom right of each character are written by the fake CPU as it copies data, so I can be sure that VIC is fetching data from RAM.  But zoom in closer and look at the right hand side of the Hs.  There are a few missing pixels there.

It's worse in real time.  Pixels flicker on and off all over the screen.  It starts out OK, and gets worse as the chips warm up.  Clearly I'm not quite meeting some timing somewhere.  I added a third phase to the CPU to test this: now it reads ROM, writes to RAM, then reads RAM and compares.  If the result is different, it turns on an error LED.  At full speed the LED is always on.  If I reduce the clock speed, there are no errors at all.
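The logic of that test is simple enough to model in a few lines of Python.  This is a sketch of the idea, not the fake CPU itself: I'm assuming it copies everything first and then verifies in a second pass, and FlakyRam is a made-up stand-in for an SRAM that occasionally stores the wrong value.

```python
def copy_and_verify(rom, ram):
    """Copy rom into ram, read it back, and return True if any byte differs."""
    for address, value in enumerate(rom):  # read ROM, write RAM
        ram[address] = value
    error_led = False
    for address, value in enumerate(rom):  # read RAM and compare
        if ram[address] != value:
            error_led = True
    return error_led


class FlakyRam:
    """Hypothetical RAM model that drops the lowest bit of writes to one address."""

    def __init__(self, size, bad_address):
        self.cells = bytearray(size)
        self.bad_address = bad_address

    def __setitem__(self, address, value):
        if address == self.bad_address:
            value &= 0xFE  # simulate a bit that failed to latch
        self.cells[address] = value

    def __getitem__(self, address):
        return self.cells[address]


rom = bytes([0xFF, 0x81, 0x81, 0xFF])
print(copy_and_verify(rom, bytearray(len(rom))))                # False
print(copy_and_verify(rom, FlakyRam(len(rom), bad_address=1)))  # True
```

A single bad byte anywhere is enough to set the flag, which matches what I see on the board: the LED says that something went wrong, but not where or how often.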

So that's where it is now.  There's a little more work to do on the memory controller, because there's no point trying to continue if I can't trust SRAM writes to work.  The next step will be making VIC a little bit closer to the real design, using DMA to read character pointers and colours.  And then, with the ability to display proper data, I can finally start work on making a real CPU.