Saturday, June 6, 2020

The ALU

The 65020's ALU is its most complex component.  It must be able to add and subtract in both binary and decimal modes, perform logical AND, OR, and exclusive OR, shift and rotate values by arbitrary distances, and also set, clear, toggle, and test individual bits.  All of these operations must work on 8-, 16-, or 32-bit values, and produce appropriate values for the flags.

To do all of this, I've broken it down into three smaller sub-components.


The A input is usually the first operand, and B is usually the second operand.  B can be optionally shifted (not shifting is the same as shifting by 0) or inverted.  The two operands go into the Add/Logic sub-component, which can do binary or BCD addition, or logical and/or/exclusive or.  Subtraction is done by inverting the second operand before adding it to the first.

So ADC A0, #123 is handled by sending the contents of A0 to input A, the constant 123 to input B, setting the shift input to 0, not inverting it, and then adding the two.

SBC A0, $1234 has A = A0, B = value from memory, shift = 0, B inverted, then adding.
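
As a rough illustration of the datapath described above, here is a small Python model.  It is only a sketch, not the actual hardware: the function name alu and its parameters are made up for this example, it covers 32-bit binary mode only, the shifter only shifts left, and the carry-in handling for subtraction is assumed to follow the 6502 convention (carry set means no borrow).

```python
MASK32 = 0xFFFFFFFF

def alu(a, b, shift=0, invert=False, op="add", carry_in=0):
    """Toy model: shift B, optionally invert it, then combine it with A."""
    b = (b << shift) & MASK32            # shifter (left shift only in this sketch)
    if invert:
        b = ~b & MASK32                  # binary-mode inverter
    if op == "add":
        total = a + b + carry_in
        return total & MASK32, total >> 32   # (result, carry out)
    if op == "and":
        return a & b, 0
    if op == "or":
        return a | b, 0
    if op == "xor":
        return a ^ b, 0
    raise ValueError(f"unknown op: {op}")

# ADC A0, #123 with A0 = 1000 and carry clear: shift 0, no invert, add.
adc_result = alu(1000, 123)

# SBC with A0 = 1000 and 234 in memory: B inverted, then added.
# The carry-in of 1 (assumed, as on the 6502) completes the two's complement.
sbc_result = alu(1000, 234, invert=True, carry_in=1)
```

In this model, adc_result holds the sum and sbc_result holds the difference, with the carry out set to 1 when no borrow occurred.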

The various shift and rotate instructions take their operand on the B input.  A is set to 0, and the adder/logic unit performs an OR, to pass the shifted result to the output.

LDA doesn't look like an instruction that uses the ALU, but I have it using the same microcode and passing the loaded value through the ALU.  A is set to 0, there is no shift or invert, and the Add/Logic sub-component performs an OR.  This sends the loaded value through unchanged, but allows flags to be set.
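
The pass-through trick used by LDA and the shift instructions can be sketched in Python as follows.  The helper name pass_through is hypothetical, only left shifts are modelled, and the choice of N and Z as the affected flags is an assumption based on how LDA behaves on the original 6502.

```python
def pass_through(b, shift=0, width=32):
    """OR the (optionally shifted) B input with A = 0, and derive N and Z."""
    mask = (1 << width) - 1
    a = 0                                   # A input forced to zero
    result = (a | (b << shift)) & mask      # OR with 0 leaves B unchanged
    flags = {
        "N": bool(result >> (width - 1)),   # top bit of the result
        "Z": result == 0,                   # result is all zeroes
    }
    return result, flags

# An 8-bit load of $80 comes out unchanged, with N set and Z clear.
loaded, loaded_flags = pass_through(0x80, width=8)

# A shift by 4 uses the same path, with a non-zero shift distance.
shifted, shifted_flags = pass_through(1, shift=4)
```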

CLC also doesn't look like an ALU instruction.  But on the 65020, it's a special case of a more general set of bit-clearing instructions.  These have the bit number encoded in the instruction, and the bit to be cleared can be in any register, or in memory.  The A input takes the register or memory value, B is set to 1, and the bit number goes to the shift input.  The shifter output is inverted, and then ANDed with the value.  SEC is done in a very similar way, but using OR and not inverting the shifter output.
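
The bit-clearing and bit-setting paths described above amount to the familiar mask operations.  Here is a minimal Python sketch; the function names are invented for this example, and treating carry as bit 0 of the status register is an assumption carried over from the 6502.

```python
def clear_bit(value, bit, width=32):
    """A = value, B = 1 shifted by the bit number, inverted, then ANDed."""
    mask = (1 << width) - 1
    shifted = (1 << bit) & mask          # shifter output: a single 1 bit
    return value & (~shifted & mask)     # invert, then AND with the value

def set_bit(value, bit, width=32):
    """Same path, but no inversion and an OR instead of an AND."""
    mask = (1 << width) - 1
    shifted = (1 << bit) & mask
    return (value | shifted) & mask

# CLC and SEC as special cases, assuming carry is bit 0 of an 8-bit status value.
status_after_clc = clear_bit(0b00000011, 0, width=8)
status_after_sec = set_bit(0b00000010, 0, width=8)
```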

The Inverter

The inverter is a very simple component, but there are a couple of interesting points.  If we didn't have to support decimal mode, it would be a simple array of XOR gates, with one input of each connected to the 'invert' control signal.

To support BCD subtraction, it also needs to be able to give us the 9's complement of a value: this is obtained by subtracting each nybble from 9.  At first glance, this is simple: take the input 4 bits at a time, add the 'invert' and 'decimal' control signals, and each output bit becomes a function of 6 inputs, which is a perfect fit for the Spartan 6's 6-input LUTs.  At one LUT per output bit, a 32-bit inverter will take 32 LUTs, or 8 slices.
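
To make the behaviour concrete, here is a small Python model of the inverter working one nybble at a time.  It is a behavioural sketch only, with made-up function names; the result for the non-BCD nybbles 10 to 15 in decimal mode is simply the 4-bit wrap-around of 9 minus the input.

```python
def invert_nybble(n, invert, decimal):
    """Model one 4-bit slice of the inverter."""
    if not invert:
        return n                  # pass through unchanged
    if decimal:
        return (9 - n) & 0xF      # 9's complement, wrapped to 4 bits
    return n ^ 0xF                # plain binary inversion

def inverter(value, invert, decimal, width=32):
    """Apply invert_nybble to every nybble of a width-bit value."""
    out = 0
    for shift in range(0, width, 4):
        n = (value >> shift) & 0xF
        out |= invert_nybble(n, invert, decimal) << shift
    return out

# BCD 1234 becomes 8765 in decimal mode, and $EDCB in binary mode.
bcd_complement = inverter(0x1234, invert=True, decimal=True, width=16)
binary_complement = inverter(0x1234, invert=True, decimal=False, width=16)
```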

But we can do better than that.  Each LUT actually has two outputs: it can be used as a single function of six inputs, or as two functions of the same five inputs.  If we can get by with only five inputs, a single LUT can produce two output bits.

input   binary  decimal 

0000    1111    1001
0001    1110    1000
0010    1101    0111
0011    1100    0110

0100    1011    0101
0101    1010    0100
0110    1001    0011
0111    1000    0010

1000    0111    0001
1001    0110    0000
1010    0101    1111
1011    0100    1110

1100    0011    1101
1101    0010    1100
1110    0001    1011
1111    0000    1010

Examining the truth table of the inverter, it is apparent that the bottom two bits of the output depend on only the bottom two bits of the input.  And the top two bits of the output depend on only the top three bits of the input.  Add the two control lines, and that's 4 inputs for one pair of outputs and 5 for the other pair.  Each pair of outputs fits in a single LUT, so a nybble needs only two LUTs, and we can do the whole inverter in only 16 LUTs, or 4 slices.
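
The dependency claim can be checked by brute force over all 16 nybble values.  The Python sketch below does this for the inverting case in both modes (with 'invert' off, the output is just the input, so the claim holds trivially); the helper names are invented for this example.

```python
def complement(n, decimal):
    """Inverter output for one nybble with 'invert' asserted."""
    return (9 - n) & 0xF if decimal else n ^ 0xF

def depends_only_on(out_mask, in_mask):
    """True if the output bits in out_mask are fully determined
    by the input bits in in_mask, in both binary and decimal mode."""
    for decimal in (False, True):
        seen = {}
        for n in range(16):
            key = n & in_mask                       # the inputs we may look at
            out = complement(n, decimal) & out_mask
            if seen.setdefault(key, out) != out:    # same key, different output
                return False
    return True

low_ok = depends_only_on(0b0011, 0b0011)    # bottom two outputs, bottom two inputs
high_ok = depends_only_on(0b1100, 0b1110)   # top two outputs, top three inputs
high_needs_bit1 = not depends_only_on(0b1100, 0b1100)

print(low_ok, high_ok, high_needs_bit1)
```

All three values come out true: the last one confirms that the top pair of outputs really does need the third input bit in decimal mode, which is why that LUT uses 5 inputs rather than 4.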