Overview
RW-Pioneer is a computer project inspired by Ben Eater's 8-bit breadboard computer series on YouTube. I always wanted to build my own crappy computer, and the Pioneer was my first attempt. The design is not perfect by any means, but it was the best one I could come up with after a few semesters of studying computer science.
Main Features
Simplicity was the primary design goal, which led to a minimal yet functional architecture.
The main features of the RW-Pioneer are:
- 4-bit wide accumulator-based ALU with add and subtract instructions
- 16 words of directly addressable RAM
- up to 256 instructions per program
- operation at up to 500 hertz
- manual step clock for debugging
The architecture allows for multi-word operations by using the carry flag of the last arithmetic instruction. Branch instructions use the zero flag to decide the path of execution, allowing for loops in program code. The limited memory size is the biggest weakness of the design, and memory words are only directly addressable, which makes programming difficult. The lack of relative addressing also means that the computer is not Turing-complete.
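To make the multi-word idea concrete, here is a minimal Python sketch of how a number wider than four bits can be added one word at a time, with the carry of each step feeding the next. The function names are made up for illustration, and the words are assumed to be stored least significant first; this models the arithmetic only, not the actual hardware.

```python
WORD_BITS = 4
WORD_MASK = (1 << WORD_BITS) - 1  # 0xF


def add_words(a, b, carry_in=0):
    """Add two 4-bit words plus a carry; return (4-bit sum, carry out)."""
    total = a + b + carry_in
    return total & WORD_MASK, total >> WORD_BITS


def add_multiword(a_words, b_words):
    """Add two numbers given as lists of 4-bit words, least significant first."""
    result = []
    carry = 0  # the lowest word corresponds to a plain ADD, so no carry goes in
    for a, b in zip(a_words, b_words):
        # every higher word corresponds to an ADDC that consumes the previous carry
        word_sum, carry = add_words(a, b, carry)
        result.append(word_sum)
    return result, carry


# 0x1A7 + 0x0F9, each operand split into three 4-bit words
total, carry = add_multiword([0x7, 0xA, 0x1], [0x9, 0xF, 0x0])
```

On the real machine the same pattern would be one ADD for the lowest word followed by one ADDC per higher word, with the partial results stored back to RAM in between.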
Instructions
Each instruction is one byte long, i.e. two 4-bit words. The leading word specifies the instruction code (IC), the trailing word the instruction value (IV). The following abbreviations are also used when describing the instructions of the RW-Pioneer:
Abbreviation | Meaning |
---|---|
ACC | Accumulator register |
BUS | Value on system bus |
RAM[x] | Word at index x in RAM |
CF | Carry flag value |
ZF | Zero flag value |
PCL | Program counter low word |
PCH | Program counter high word |
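The one-byte instruction layout described above can be sketched in a few lines of Python. This assumes that the "leading word" is the upper four bits of the byte and the "trailing word" the lower four; the helper names `decode` and `encode` are hypothetical and not part of the project's software.

```python
def decode(instruction_byte):
    """Split one instruction byte into (IC, IV), assuming IC is the upper word."""
    if not 0 <= instruction_byte <= 0xFF:
        raise ValueError("an instruction is exactly one byte")
    ic = (instruction_byte >> 4) & 0xF  # instruction code
    iv = instruction_byte & 0xF         # instruction value
    return ic, iv


def encode(ic, iv):
    """Pack a 4-bit instruction code and a 4-bit value into one byte."""
    if not (0 <= ic <= 0xF and 0 <= iv <= 0xF):
        raise ValueError("IC and IV must each fit in four bits")
    return (ic << 4) | iv


ic, iv = decode(0x3A)  # IC = 0x3, IV = 0xA
```

Four bits of IC give exactly 16 possible instruction codes, and four bits of IV are enough to address all 16 RAM words directly.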
The following 16 instructions are implemented by the RW-Pioneer:
Instruction Mnemonic | Functional Description |
---|---|
LDI | ACC <= IV |
LDA | ACC <= RAM[IV] |
STOA | RAM[IV] <= ACC |
STOB | RAM[IV] <= BUS |
ADD | ACC <= ACC + RAM[IV] |
ADDI | ACC <= ACC + IV |
ADDC | ACC <= ACC + RAM[IV] + CF |
SUB | ACC <= ACC - RAM[IV] |
SUBI | ACC <= ACC - IV |
SUBC | ACC <= ACC - RAM[IV] - CF |
SJMP | PCL <= IV |
JMP | PCL <= IV and PCH <= ACC |
BNEZ | PCL <= IV and PCH <= ACC if ZF = LOW |
BEZ | PCL <= IV and PCH <= ACC if ZF = HIGH |
HALT | Halt program execution |
NOOP | Perform no operation |
Each instruction takes two clock cycles to finish: in the first cycle the instruction is fetched and decoded, and in the second cycle it is executed.
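The table above can be turned into a small behavioural model. The following Python sketch is an illustration under several assumptions that the text does not confirm: only arithmetic instructions update CF and ZF, CF is set on a carry out or a borrow, SJMP leaves PCH unchanged, and programs are written as (mnemonic, IV) pairs because the numeric instruction codes are not listed here. STOB is left out since the bus value depends on external hardware.

```python
ARITHMETIC = ("ADD", "ADDI", "ADDC", "SUB", "SUBI", "SUBC")


def run(program, ram=None, max_steps=10_000):
    """Run a list of (mnemonic, iv) pairs; return (acc, ram) once HALT is reached."""
    ram = list(ram) if ram is not None else [0] * 16
    acc = cf = zf = 0
    pc = 0
    for _ in range(max_steps):
        op, iv = program[pc]
        next_pc = (pc + 1) & 0xFF
        far_target = (acc << 4) | iv  # PCH <= ACC, PCL <= IV
        if op == "HALT":
            return acc, ram
        elif op == "LDI":
            acc = iv
        elif op == "LDA":
            acc = ram[iv]
        elif op == "STOA":
            ram[iv] = acc
        elif op in ARITHMETIC:
            operand = iv if op.endswith("I") else ram[iv]
            carry = cf if op.endswith("C") else 0
            sign = 1 if op.startswith("ADD") else -1
            total = acc + sign * (operand + carry)
            acc = total & 0xF
            cf = 1 if total < 0 or total > 0xF else 0  # assumed carry/borrow convention
            zf = 1 if acc == 0 else 0
        elif op == "SJMP":
            next_pc = (pc & 0xF0) | iv  # assumed: PCH keeps its current value
        elif op == "JMP":
            next_pc = far_target
        elif op == "BNEZ":
            next_pc = far_target if zf == 0 else next_pc
        elif op == "BEZ":
            next_pc = far_target if zf == 1 else next_pc
        elif op != "NOOP":
            raise ValueError(f"unsupported instruction: {op}")
        pc = next_pc
    raise RuntimeError("step limit reached without HALT")


# Add 2 to RAM[0] three times, using RAM[1] as a loop counter.
PROGRAM = [
    ("LDI", 3), ("STOA", 1),               # counter = 3
    ("LDA", 0), ("ADDI", 2), ("STOA", 0),  # address 2: RAM[0] += 2
    ("LDA", 1), ("SUBI", 1), ("STOA", 1),  # counter -= 1, sets ZF at zero
    ("LDI", 0), ("BNEZ", 2),               # PCH = 0, loop back to address 2
    ("HALT", 0),
]
final_acc, final_ram = run(PROGRAM)
```

Note how the branch needs ACC loaded with the high word of the target first, which is why the sketch relies on LDI not touching ZF. At the maximum clock of 500 hertz and two cycles per instruction, the machine executes at most 250 instructions per second.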
Hardware
I built the computer mostly from 7400 series logic ICs of the HC and HCT families. The SRAM IC (CY7C164), the two Flash ICs (SST39SF010) and the 555 timers are not part of the 7400 series. All other components are off-the-shelf parts: capacitors, buttons, LEDs, 1N4001 diodes, through-hole resistors, etc.
The design is not area efficient at all, but it provides good visibility of key signals for debugging. Because of the low operating speed, not much thought was put into signal timing in the layout. To use as few integrated components as possible, the microinstructions for the ALU and the Memory Unit are each decoded in a diode matrix. External components are connected through a 20-pin connector in the upper right corner of the main PCB. The connector is based on the user port of the Commodore 64 and only provides low-level interfacing via memory mapping.
The PCB was manufactured by JLCPCB, and I hand-soldered every component onto the board. Components like the instruction flash memory sit in sockets so that they can be removed easily.
Verilog Model and Verification
Recently I bought myself the NEXYS A7 prototyping board to learn hardware design in the Verilog HDL. To test my skills I modeled the RW-Pioneer and verified the design using Verilator. I was not able to copy the original design one-to-one, as it used asynchronous memory access and a bus with multiple transceivers. Neither of these design features is compatible with FPGAs, so I had to work around these 'limitations'. In the end, I opted for a multiplexed bus design in which the final bus value is the OR of all bus drivers (only one driver is active in any cycle). For memory, the integrated block RAM of the Artix 7 FPGA is used. As this RAM is synchronous, an additional cycle was needed, resulting in three cycles per instruction.
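The OR-based bus can be illustrated with a short behavioural sketch in Python. It is not the Verilog from the model; the function name `bus_value` and the 4-bit bus width are assumptions made for the example. Each driver is gated by its enable signal, so a disabled driver contributes all zeros and the OR of all outputs equals the value of the single active driver.

```python
def bus_value(drivers):
    """Combine (enable, value) pairs into one bus value by OR-ing gated outputs."""
    active = sum(1 for enable, _ in drivers if enable)
    if active > 1:
        # the design relies on at most one driver being active per cycle
        raise ValueError("more than one bus driver enabled")
    bus = 0
    for enable, value in drivers:
        gated = (value & 0xF) if enable else 0  # disabled drivers output zeros
        bus |= gated
    return bus


# Three drivers, only the second one is enabled.
bus = bus_value([(False, 0x9), (True, 0x5), (False, 0xF)])
```

Compared with tri-state transceivers, this needs no high-impedance state, which is what makes it a good fit for FPGA fabric.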
As the model matured, I started to write a test bench using Verilator to quickly verify my changes. Verilator is much faster than traditional simulation in Vivado and saves quite a lot of development time in my case. With Verilator and GoogleTest I built unit tests that cover all relevant instructions of the RW-Pioneer. Now every time I make a change to the model, I can simply run 'make test' to verify it. Verilator cannot cover all cases and unit tests don't catch every bug that is introduced, but this solution is still far better than debugging a design on the FPGA itself.
Overall, I am very pleased with the outcome of this small FPGA adventure. In the future, I want to design more capable processors in Verilog that can run compiled C code.
Resources
Schematics of the computer, the hexadecimal display module and the LCD module can be found on my GitHub, together with a white paper that explains my design in more detail. Additional software and a Verilog model are also provided there.
Video Gallery
Computing the 24th Fibonacci number, which is 46368 (0xB520 in hexadecimal).
Displaying 'Frohe Weihnachten!' (German for 'Merry Christmas!') on an LCD.
Computing the 24th Fibonacci number, which is 46368 (0xB520 in hexadecimal), running on the NEXYS A7 FPGA board.