(originally posted here)

8008.zip

image

Introduction

I wanted to make an 8-bit CPU, decided to work based off of an existing design that was decently simple (and fun!) to implement. Worked mainly off of this data sheet. There’s probably optimizations to be done and it would absolutely be easier with some kind of externally programmable memory, but it is functional enough to run the test program properly. The data sheet includes some information about peripherals, but I just whipped up a couple chips for making testing easier. This took me more than a week so I wasn’t trying to go overboard, and the rest seems trivial compared to the microprocessor itself. I am a few steps from the MCS-8, though without ROM/RAM of course.

About the 8008

The 8008 has 7 registers (A, B, C, D, E, H, L) and a virtual register for accessing memory. The virtual register (“M”) contains the value at the memory address specified by H and L (high and low bytes of the address, respectively, though the most significant 2 bits of H are unused since memory addresses are only 14 bits). The ALU will always operate on an operand and the accumulator (A register) then store the result back into the accumulator. To do this, the 8008 has two special purpose registers (a and b but that’s confusing so I call them α and β) for internal communication. There are three “groups” which are each multiplexed: the 8 register stack and 7 “scratch pad” registers are what I call the “Memory Group”, the special purpose registers and the ALU which I call the “ALU Group” and then the instruction register, decoders, and state control, which I call the “Decode Group”.

Actual Architecture ![image](https://user-images.githubusercontent.com/21225408/218195782-72db76d7-6044-4b52-b1b6-c73209a7395c.png) From [here](https://en.wikipedia.org/wiki/Intel_8008)
My Implementation ![image](https://user-images.githubusercontent.com/21225408/218196254-a049b8d8-f775-4533-92af-c72dfdfd3f05.png) ![image](https://user-images.githubusercontent.com/21225408/218196633-a3fd63f6-3dda-4e3a-9655-0a0f390754e0.png) ![image](https://user-images.githubusercontent.com/21225408/218196672-2a3a2091-d374-4208-9183-6a92a879255e.png) ![image](https://user-images.githubusercontent.com/21225408/218196722-41141700-c619-443c-97d7-0db3ffcf97f5.png)

Using it

It is currently functional, with a few known issues:

  • RST command just doesn’t work properly at all, i’ve just been using JMP instead. 00 AAA 101 is equivalent to 01 XXX 110 00 AAA 000 00 000 000 anyway.
  • INP/OUT commands aren’t tested, probably don’t work.
  • If the clock is too fast, the bus gets crowded and things stop working

It can run the program in appendix 3A though, if you’re patient enough to be the memory. If you’d like to take a shot, open up this sample memory I’ve transcribed from the data sheet:

"Sample Program to Search A String Of Characters In Memory Locations 200-219 For A Period"

``` Address Hex Bin Comment 00 00 ( 0) | 64 | 01 100 100 | JMP (unconditional jump) 00 01 ( 1) | 64 | 01 100 100 | address lower: 0x64 (100) 00 02 ( 2) | 00 | 00 000 000 | address higher: 0x00 (0) ``` ``` Address Hex Bin Comment 00 3C ( 60) | 30 | 00 110 000 | INL (increment L (110) register) 00 3D ( 61) | 0B | 00 001 011 | RFZ (return if Zero flag is false) 00 3E ( 62) | 28 | 00 101 000 | INH (increment H (101) register) 00 3F ( 63) | 07 | 00 000 111 | RET (unconditional return) ``` ``` Address Hex Bin Comment 00 64 (100) | 36 | 00 110 110 | LLI (load immediate into L (110) register) 00 65 (101) | C8 | 11 001 000 | address lower: 0xC8 (200) 00 66 (102) | 2E | 00 101 110 | LHI (load immediate into H (101) register) 00 67 (103) | 00 | 00 000 000 | address higher: 0x00 (0) 00 68 (104) | C7 | 11 000 111 | LAM (load from memory into A (000) register) 00 69 (105) | 3C | 00 111 100 | CPI (compare A to immediate) 00 6A (106) | 2E | 00 101 110 | ASCII '.' 00 6B (108) | 68 | 01 101 000 | JTZ (jump if Zero flag is true, essentially "jump if equal") 00 6C (109) | 77 | 01 110 111 | address lower: 0x77 (119) 00 6D (110) | 00 | 00 000 000 | address higher: 0x00 (0) 00 6E (111) | 46 | 01 000 110 | CAL (call subroutine) 00 6F (112) | 3C | 00 111 100 | address lower: 0x3C (60) 00 70 (113) | 00 | 00 000 000 | address higher: 0x00 (0) 00 71 (114) | C6 | 11 000 110 | LAL (load L (110) register into A (000) register) 00 72 (115) | 3C | 00 111 100 | CPI (compare A to immediate) 00 73 (116) | DC | 11 011 100 | address lower: 0xDC (220) 00 74 (117) | 48 | 01 001 000 | JFZ (jump if Zero flag is false, essentially "jump if not equal") 00 75 (118) | 68 | 01 101 000 | address lower: 0x68 (104) 00 76 (119) | 00 | 00 000 000 | address higher: 0x00 (0) 00 77 (120) | 07 | 00 000 111 | RET (unconditional return) ``` ``` Address Hex Bin ASCII 00 C8 (200) | 54 | 01 010 100 | T 00 C9 (201) | 45 | 01 000 101 | E 00 CA (202) | 53 | 01 010 011 | S 00 CB (203) | 54 | 01 010 100 | T 00 CC (204) | 2E | 00 101 110 | . 00 CD (205) | 00 | 00 000 000 | null ... 00 DB (220) | 00 | 00 000 000 | null ```

How to use:

  1. Open up the chip “demo”
  2. Bring RESET to high for a few seconds then bring it low again.
  3. When the output Wait is high, read the bottom right output * If Instruction or Read, enter in the byte at the displayed memory address and bring Ready to high. * If Write, write the byte shown on the top right to the displayed memory address and bring Ready to high. * If Command I/O, I’m not sure what to do since I haven’t implemented INP and OUT properly.
  4. Once Wait is low, repeat step 3.

Highlights

Scratch Pad

image

Register 000 is the Accumulator, which is a shift register, as opposed to 001-110 which are increment/decrement registers. Register 111 doesn’t exist and stands in for memory access. I needed a indexing system to put those addresses together in the scratch pad, but still uniquely index all the addresses properly. So, first division is by A2 xor A1, then the second is A1 xor A0, then the last is just the last digit, like a normal address. This put 000 and 111 in the same region! More specifically:

Register Address A2 xor A1 A1 xor A0 A0 Functional Address
A 000 0 0 0 000
B 001 0 1 1 011
C 010 1 1 0 110
D 011 1 0 1 101
E 100 1 0 0 100
H 101 1 1 1 111
L 110 0 1 0 010
M 111 0 0 1 001

Stack Pointer

image

The 8008 has an 8 byte internal stack. Instead of going in order (push 001 to 010, pop 001 to 000), it uses a de Bruijn sequence. I was able to implement that with the logic here. This “Shifter” component will therefore produce 8 addresses in sequence, in either direction. Pretty neat!

Hex Display

image

I used logic friday to design a component for printing hex values on the 7-segment displays instead of just decimal ones. This makes multi-byte-sized data readable without needing to double dabble or anything.

Instruction Decode

image

This one got kinda gnarly, an 8008 instruction can take up to three cycles depending on how much data it needs. Each cycle has up to 5 states. So, every instruction needs behavior in any of those states that it’ll hit.