Build Your Own Emulator
"Writing an emulator is the most fun way to learn computer architecture you didn't know you wanted to learn." -- every emulator author
An emulator is the closest you can come, in software, to building a CPU. You parse machine code, model registers and memory, decode instructions, emulate timers, render graphics, and handle input. CHIP-8 is the friendly first step (35 opcodes, monochrome 64x32 display, completable in a weekend). Game Boy is the serious project (300+ opcodes, banked memory, real games).
1. Overview & motivation
A CPU emulator is a fetch-decode-execute loop:
    while running:
        opcode = memory[PC]
        PC += instruction_length
        decode(opcode)
        execute(opcode, operands)
        update_timers()
        render_if_needed()
        handle_input()
What you can only learn by building one:
- Why instruction encoding matters -- you'll be reading hex dumps and recognising opcodes by sight.
- What flags (Z, N, H, C on Game Boy; carry/zero on most CPUs) actually do in practice.
- Why timing accuracy is the dividing line between "an emulator" and "an emulator that runs Mario".
- Why memory-mapped I/O exists (graphics, sound, input all share the address space).
- Why the fetch-decode-execute loop is fundamental -- modern CPUs still follow this cycle, with pipelining and parallel execution layered on top.
2. Where this fits in the degree
- Phase: Systems
- Semester: 4 (Systems Programming)
- Modules deepened: Module 1 (C/C++/Rust fundamentals), Module 3 (computer organization -- this is the module's perfect concretization).
Cross-phase relevance:
- Direct preparation for the Operating System tutorial -- emulating a CPU clarifies what an OS actually controls.
- Performance work transfers to the Compiler tutorial (writing a VM is half of writing a CPU emulator).
3. Prerequisites
- C/C++ or Rust. Comfortable with bit manipulation: &, |, ^, <<, >>.
- Hex and binary fluency.
- A graphics library: SDL2 (recommended) or raylib for rendering and input.
You do not need a computer-architecture course beforehand. CHIP-8 is gentle enough to teach as you go.
4. Theory & research
Required reading -- CHIP-8
- Cowgod, "Cowgod's Chip-8 Technical Reference v1.0" -- devernay.free.fr/hacks/chip8/C8TECH10.HTM. The canonical CHIP-8 spec. All 35 opcodes, memory map, display, timers, input. Print it.
- Tobias V. Langhoff, "Guide to making a CHIP-8 emulator" -- tobiasvl.github.io/blog/write-a-chip-8-emulator/. Modern, comprehensive, with footnotes on quirks.
Required reading -- Game Boy
- Pan Docs -- gbdev.io/pandocs/. The Game Boy reverse-engineering bible. 200+ pages. Canonical.
- GBDev wiki (gbdev.io) and gbdev community Discord.
Strongly recommended
- Patterson & Hennessy, Computer Organization and Design -- Chapters 4 (processor) and 5 (memory). Standard textbook. Read sections relevant to fetch/decode/execute.
- Imran Nazar, "GameBoy Emulation in JavaScript" -- imrannazar.com/series/gameboy-emulation-in-javascript. Older but excellent walkthrough.
For deep dives
- Gekkio's Mooneye test suite -- github.com/Gekkio/mooneye-test-suite. A test ROM suite for Game Boy emulator accuracy. Attempt it after Blargg's CPU tests: an emulator that passes those can already run most games.
5. Curated tutorial list (from BYO-X)
- C: Home-grown bytecode interpreters
- C: Virtual machine in C
- C: Write your Own Virtual Machine -- Justin Meiners and Ryan Pendleton, "Write your Own Virtual Machine" -- LC-3, an educational architecture. Excellent.
- C: Writing a Game Boy emulator, Cinoop -- CTurt's write-up of building the Cinoop emulator
- C++: How to write an emulator (CHIP-8 interpreter) -- multigesture.net/articles/how-to-write-an-emulator-chip-8-interpreter/ (recommended primary for CHIP-8)
- C++: Emulation tutorial (CHIP-8 interpreter)
- C++: Emulation tutorial (GameBoy emulator) -- codeslinger.co.uk
- C++: Emulation tutorial (Master System emulator)
- C++: NES Emulator From Scratch [video] -- javidx9's full series on YouTube -- 20+ episodes
- Common Lisp: CHIP-8 in Common Lisp
- JavaScript: GameBoy Emulation in JavaScript -- Imran Nazar (recommended primary for Game Boy)
- Python: Emulation Basics: Write your own Chip 8 Emulator/Interpreter
- Rust: 0dmg: Learning Rust by building a partial Game Boy emulator
6. Recommended primary path
Two-stage path:
- CHIP-8 first (1 weekend): Follow Tobias V. Langhoff's guide. Implement all 35 opcodes. Get it running the IBM logo ROM, then Tetris. Roughly 500 lines.
- Then choose:
  - Game Boy (3-6 weeks): Imran Nazar's tutorial. 300+ opcodes. Banked memory. Real games. This is the most rewarding emulator project, and your CHIP-8 experience makes it much more tractable.
  - LC-3 (1 week): Justin Meiners' "Write your Own Virtual Machine". 16 opcodes, educational architecture, runs assembly programs.
  - NES (6+ weeks): javidx9's video series. Hard. Includes PPU, sound. Spectacular results.
For this degree, the recommended sequence is CHIP-8 -> Game Boy.
7. Implementation milestones (CHIP-8)
Milestone 1: Memory, registers, opcode fetch
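    #include <stdint.h>

    // typedef so that plain `CHIP8 *` also compiles as C, not only as C++
    typedef struct CHIP8 {
        uint8_t  memory[4096];
        uint8_t  V[16];          // V0..VF general registers
        uint16_t I;              // index register
        uint16_t PC;             // program counter, starts at 0x200
        uint16_t stack[16];      // return addresses for CALL/RET
        uint8_t  SP;
        uint8_t  delay_timer;
        uint8_t  sound_timer;
        uint8_t  display[64 * 32];
        uint8_t  keypad[16];
    } CHIP8;

    // Fetch the big-endian 16-bit opcode at PC and advance past it.
    uint16_t fetch(CHIP8 *c) {
        uint16_t op = (uint16_t)((c->memory[c->PC] << 8) | c->memory[c->PC + 1]);
        c->PC += 2;
        return op;
    }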
Evidence: Loading a 4-byte ROM into memory at 0x200; manually fetching opcodes one at a time.
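The evidence step above can be sketched as follows. This is a minimal illustration, not the tutorial's reference code: it assumes the ROM bytes are already in a buffer (file reading is left out), and the names Chip8Mem, chip8_load_rom, chip8_fetch, and CHIP8_ROM_START are hypothetical. The struct is trimmed to the two fields the sketch touches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHIP8_MEM_SIZE  4096
#define CHIP8_ROM_START 0x200   /* programs start here, matching the PC comment above */

/* Trimmed-down state (hypothetical): only the fields this sketch needs. */
typedef struct {
    uint8_t  memory[CHIP8_MEM_SIZE];
    uint16_t PC;
} Chip8Mem;

/* Copy a ROM image into memory at 0x200 and point PC at it.
   Returns 0 on success, -1 if the image does not fit. */
int chip8_load_rom(Chip8Mem *c, const uint8_t *rom, size_t len) {
    if (len > CHIP8_MEM_SIZE - CHIP8_ROM_START) return -1;
    memcpy(&c->memory[CHIP8_ROM_START], rom, len);
    c->PC = CHIP8_ROM_START;
    return 0;
}

/* Big-endian fetch, same shape as fetch() above. */
uint16_t chip8_fetch(Chip8Mem *c) {
    uint16_t op = (uint16_t)((c->memory[c->PC] << 8) | c->memory[c->PC + 1]);
    c->PC += 2;
    return op;
}
```

Loading the four bytes 00 E0 12 00 and calling chip8_fetch twice should yield 0x00E0 (CLS) and then 0x1200 (JP 0x200), which is a quick way to confirm the byte order before writing any decode logic.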
Milestone 2: Decode and execute (35 opcodes)
CHIP-8 opcodes are 16-bit. Decode by high nibble:
    void execute(CHIP8 *c, uint16_t op) {
        uint16_t nnn = op & 0x0FFF;          // 12-bit address
        uint8_t  n   = op & 0x000F;          // 4-bit nibble (sprite height in DRW)
        uint8_t  x   = (op & 0x0F00) >> 8;   // register index Vx
        uint8_t  y   = (op & 0x00F0) >> 4;   // register index Vy
        uint8_t  kk  = op & 0x00FF;          // 8-bit immediate

        switch (op & 0xF000) {
            case 0x0000:
                if (op == 0x00E0) { /* CLS -- clear display */ }
                else if (op == 0x00EE) { /* RET */ }
                break;
            case 0x1000: c->PC = nnn; break;                      // JP addr
            case 0x2000: /* CALL addr */ break;
            case 0x3000: if (c->V[x] == kk) c->PC += 2; break;    // SE
            case 0x4000: if (c->V[x] != kk) c->PC += 2; break;    // SNE
            case 0x6000: c->V[x] = kk; break;                     // LD Vx, byte
            case 0x7000: c->V[x] += kk; break;                    // ADD Vx, byte
            case 0xA000: c->I = nnn; break;                       // LD I, addr
            case 0xD000: /* DRW Vx, Vy, nibble -- sprite drawing */ break;
            // ... 25 more
        }
    }
Evidence: Run each opcode through a unit test with known input/output.
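As one example of an opcode worth testing in isolation, here is a hedged sketch of 8XY4 (ADD Vx, Vy), which sets VF on carry. The type Chip8Regs and the function op_add_vx_vy are illustrative names, not part of the code above; the struct is trimmed to the register file.

```c
#include <stdint.h>

/* Hypothetical trimmed state: just the sixteen V registers. */
typedef struct {
    uint8_t V[16];
} Chip8Regs;

/* 8XY4: Vx = Vx + Vy (mod 256); VF = 1 on carry, otherwise 0. */
void op_add_vx_vy(Chip8Regs *c, uint16_t op) {
    uint8_t  x   = (op & 0x0F00) >> 8;
    uint8_t  y   = (op & 0x00F0) >> 4;
    uint16_t sum = (uint16_t)(c->V[x] + c->V[y]);   /* widen so the carry survives */

    c->V[x]   = (uint8_t)(sum & 0xFF);
    c->V[0xF] = (sum > 0xFF) ? 1 : 0;   /* written last so the flag wins when x == 0xF */
}
```

A unit test then sets two registers, runs one opcode, and checks both the result register and VF: with V1 = 200 and V2 = 100, opcode 0x8124 should leave V1 = 44 and VF = 1.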
Milestone 3: Display and sprite drawing
CHIP-8 has a 64x32 monochrome display. The DXYN opcode XORs an N-byte sprite into the display at position (Vx, Vy). VF is set if any pixel is erased (collision detection).
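Render with SDL2: a 1D array of 64x32 entries, one byte per pixel, scaled up to a 640x320 window.
Evidence: Load and run the IBM logo ROM (IBM Logo.ch8). Display the famous black-and-white IBM logo. If this works, your fetch loop and the handful of opcodes the ROM uses (clear, jump, register load and add, set I, draw) are correct; the remaining opcodes still need their own tests.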
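The XOR-and-collision rule for DXYN can be sketched like this. It is an illustration under assumptions: the function name draw_sprite is hypothetical, the framebuffer is the one-byte-per-pixel layout described above, and the edge behaviour (start position wraps, the sprite itself clips) is one of the conventions in circulation, so check which one your target ROMs expect.

```c
#include <stdint.h>

#define DISPLAY_W 64
#define DISPLAY_H 32

/* XOR an n-byte sprite into the framebuffer at (vx, vy).
   Returns 1 if any lit pixel was switched off (the value to store in VF), else 0. */
uint8_t draw_sprite(uint8_t display[DISPLAY_W * DISPLAY_H],
                    const uint8_t *sprite, uint8_t n,
                    uint8_t vx, uint8_t vy) {
    uint8_t collision = 0;
    int x0 = vx % DISPLAY_W;   /* assumed convention: the start position wraps */
    int y0 = vy % DISPLAY_H;

    for (int row = 0; row < n; row++) {
        int y = y0 + row;
        if (y >= DISPLAY_H) break;                 /* assumed: clip at the bottom edge */
        for (int bit = 0; bit < 8; bit++) {
            int x = x0 + bit;
            if (x >= DISPLAY_W) break;             /* assumed: clip at the right edge */
            if ((sprite[row] & (0x80 >> bit)) == 0) continue;   /* MSB is leftmost */
            uint8_t *px = &display[y * DISPLAY_W + x];
            if (*px) collision = 1;                /* a lit pixel is about to be erased */
            *px ^= 1;
        }
    }
    return collision;
}
```

Drawing the same sprite twice at the same position should erase it and report a collision on the second call, which makes a convenient unit test.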
Milestone 4: Timers and audio
delay_timer and sound_timer both count down at 60 Hz. When sound_timer > 0, beep. Simple square wave through SDL.
Evidence: Run a beep-test ROM. Hear the beep.
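One way to keep the timers at 60 Hz regardless of how fast the CPU loop runs is a time accumulator. The sketch below assumes the caller measures elapsed wall-clock time in nanoseconds; Chip8Timers, accum_ns, and timers_advance are hypothetical names, and audio output is reduced to a boolean the caller can act on.

```c
#include <stdint.h>

#define TIMER_PERIOD_NS (1000000000ULL / 60)   /* one 60 Hz tick, about 16.67 ms */

typedef struct {
    uint8_t  delay_timer;
    uint8_t  sound_timer;
    uint64_t accum_ns;   /* hypothetical: elapsed time not yet turned into ticks */
} Chip8Timers;

/* Add the time elapsed since the last call and decrement both timers once
   per whole 60 Hz period. Returns 1 while the beeper should sound. */
int timers_advance(Chip8Timers *t, uint64_t elapsed_ns) {
    t->accum_ns += elapsed_ns;
    while (t->accum_ns >= TIMER_PERIOD_NS) {
        t->accum_ns -= TIMER_PERIOD_NS;
        if (t->delay_timer > 0) t->delay_timer--;
        if (t->sound_timer > 0) t->sound_timer--;
    }
    return t->sound_timer > 0;
}
```

Because leftover time is carried in the accumulator instead of being discarded, short and long frames average out, which helps with the drift that a naive "decrement once per loop iteration" approach produces.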
Milestone 5: Input
CHIP-8 has a 16-key hex keypad (0-F). Map to a 4x4 grid on the keyboard.
Evidence: Run an interactive ROM (Tetris, Pong, Brix). Keys respond.
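A common (but not mandatory) layout maps the left-hand block of a QWERTY keyboard, 1234 / QWER / ASDF / ZXCV, onto the keypad rows 123C / 456D / 789E / A0BF. The sketch below assumes that layout and works on plain lowercase ASCII characters so it stays independent of any graphics library; keymap_lookup and keypad_set are hypothetical names.

```c
#include <stdint.h>

/* Returns the CHIP-8 key (0x0..0xF) for a lowercase ASCII key, or -1 if unmapped.
   Assumed layout: 1234/qwer/asdf/zxcv -> 123C/456D/789E/A0BF. */
int keymap_lookup(char ch) {
    static const char keys[16] = {
        'x',                  /* 0 */
        '1', '2', '3',        /* 1 2 3 */
        'q', 'w', 'e',        /* 4 5 6 */
        'a', 's', 'd',        /* 7 8 9 */
        'z', 'c',             /* A B */
        '4', 'r', 'f', 'v'    /* C D E F */
    };
    for (int k = 0; k < 16; k++) {
        if (keys[k] == ch) return k;
    }
    return -1;
}

/* Record a key press (pressed = 1) or release (pressed = 0) in the keypad array. */
void keypad_set(uint8_t keypad[16], char ch, uint8_t pressed) {
    int k = keymap_lookup(ch);
    if (k >= 0) keypad[k] = pressed;
}
```

In an SDL2 build, the key-down and key-up event handlers would translate the key code to a character (or use a parallel table of key codes) and call keypad_set with 1 or 0.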
Milestone 6: Quirks
Several CHIP-8 opcodes are ambiguously specified. Different historical interpreters behave differently, and many ROMs only run correctly with one behavior. Langhoff's guide lists them. Implement each as a configurable flag.
Evidence: Test ROMs (chip8-test-suite from Timendus on GitHub) report which quirks each ROM expects, and your emulator can be configured for each.
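As a sketch of what a configurable quirk can look like, take the shift instruction 8XY6: the original COSMAC VIP interpreter shifted Vy into Vx, while later interpreters such as CHIP-48 shift Vx in place and ignore Vy. The struct Chip8Quirks, its field shift_uses_vy, and op_shr are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk flags; a full emulator would carry one field per quirk. */
typedef struct {
    bool shift_uses_vy;   /* true: Vx = Vy >> 1 (COSMAC VIP); false: Vx = Vx >> 1 */
} Chip8Quirks;

typedef struct {
    uint8_t V[16];
} Chip8Regs;

/* 8XY6: shift right by one bit; VF receives the bit that was shifted out. */
void op_shr(Chip8Regs *c, const Chip8Quirks *q, uint16_t op) {
    uint8_t x   = (op & 0x0F00) >> 8;
    uint8_t y   = (op & 0x00F0) >> 4;
    uint8_t src = q->shift_uses_vy ? c->V[y] : c->V[x];

    c->V[x]   = src >> 1;
    c->V[0xF] = src & 1;   /* written last so the flag wins when x == 0xF */
}
```

Passing the quirk struct by pointer keeps the opcode handlers free of global state, so a front end can switch profiles per ROM without rebuilding.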
Milestone 7 (Game Boy track): MMU, PPU, audio, MBC
For Game Boy: an order of magnitude more work.
- MMU -- memory management unit. Banked memory, hardware register mapping.
- CPU -- Sharp LR35902 (close to Z80). Roughly 500 opcodes once the 256 CB-prefixed ones are counted.
- PPU -- pixel processing unit. Tile-based graphics, sprites, scrolling.
- APU -- audio processing unit. Four channels.
- MBC -- memory bank controllers (MBC1, MBC3) for cartridges larger than 32 KB.
Plan for 200+ hours.
Evidence (Game Boy): Run cpu_instrs.gb from Blargg's test suite. If all 11 sub-tests pass, your CPU is accurate enough to run most games.
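To make "banked memory" concrete, here is a deliberately simplified sketch of MBC1 ROM banking. It models only the 5-bit ROM bank register; the upper bank bits, RAM banking, and the mode select are left out, and the modulo on rom_size is a stand-in for proper bank masking. The names Mbc1, mbc1_write, and mbc1_read are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *rom;        /* full cartridge image */
    size_t         rom_size;   /* size of the image in bytes */
    uint8_t        rom_bank;   /* selected switchable bank; never 0 */
} Mbc1;

/* Writes into 0x2000-0x3FFF select the ROM bank (low 5 bits; 0 is treated as 1).
   Writes elsewhere are ignored in this simplified model. */
void mbc1_write(Mbc1 *m, uint16_t addr, uint8_t value) {
    if (addr >= 0x2000 && addr <= 0x3FFF) {
        uint8_t bank = value & 0x1F;
        m->rom_bank = bank ? bank : 1;
    }
}

/* 0x0000-0x3FFF always reads bank 0; 0x4000-0x7FFF reads the selected bank. */
uint8_t mbc1_read(const Mbc1 *m, uint16_t addr) {
    size_t offset;
    if (addr >= 0x8000) return 0xFF;   /* not cartridge ROM; out of scope here */
    if (addr < 0x4000) {
        offset = addr;
    } else {
        offset = (size_t)m->rom_bank * 0x4000 + (size_t)(addr - 0x4000);
    }
    return m->rom[offset % m->rom_size];   /* simplification of bank masking */
}
```

The point of the sketch is that the CPU never sees more than 32 KB of cartridge at once: a write to a ROM address changes which 16 KB slice appears at 0x4000-0x7FFF.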
8. Tests & evidence
| Test | How |
|---|---|
| Opcode unit tests | Each of 35 (CHIP-8) or 300+ (GB) opcodes tested independently |
| IBM logo (CHIP-8) | Loads and displays correctly |
| Blargg's cpu_instrs.gb | All 11 sub-tests pass |
| Quirk test ROM | All quirks correctly configured |
| Game ROM | Tetris (CHIP-8) or Tetris/Mario Land (GB) playable end-to-end |
| Frame timing | 60 Hz steady, ±1 ms |
The strongest single piece of evidence: a recording of a real game being played.
9. Common pitfalls
- Wrong byte order. CHIP-8 opcodes are big-endian (memory[PC] << 8 | memory[PC+1]). Game Boy is little-endian. Get this wrong and nothing decodes correctly.
- Forgetting VF. Several CHIP-8 opcodes set VF as a side effect (carry, borrow, collision). Easy to miss.
- Sprite-drawing wraparound. Some specs wrap, some clip at the screen edge. Get the convention right.
- Cycle counting. A real emulator advances by cycles, not by instructions. For CHIP-8 you can fudge it (run N ops per 60 Hz frame). For Game Boy, you must count.
- Flags in Game Boy. Z, N, H, C. The H (half-carry) flag has fiddly rules. Most emulator bugs are wrong H flags.
- Timing-sensitive code. Some games rely on cycle-exact behaviour. Don't chase 100% accuracy on your first emulator; document the limitation.
- Endianness in your implementation language. When you read a 16-bit value starting at memory[PC] in C, combine the two bytes explicitly with shifts; casting the pointer gives you the host's byte order and may be a misaligned access.
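Two of the pitfalls above, the half-carry flag and little-endian reads, fit in a short sketch. The flag bit positions follow the Game Boy F register (Z in bit 7, N in bit 6, H in bit 5, C in bit 4); the function names gb_add8 and gb_read16 are hypothetical, and only the 8-bit ADD case is shown.

```c
#include <stdint.h>

#define FLAG_Z 0x80   /* zero */
#define FLAG_N 0x40   /* subtract */
#define FLAG_H 0x20   /* half-carry */
#define FLAG_C 0x10   /* carry */

/* 8-bit ADD: returns the result and writes the resulting flags to *f. */
uint8_t gb_add8(uint8_t a, uint8_t b, uint8_t *f) {
    uint16_t sum   = (uint16_t)(a + b);
    uint8_t  flags = 0;                                       /* ADD clears N */

    if ((sum & 0xFF) == 0)                 flags |= FLAG_Z;
    if (((a & 0x0F) + (b & 0x0F)) > 0x0F)  flags |= FLAG_H;   /* carry out of bit 3 */
    if (sum > 0xFF)                        flags |= FLAG_C;   /* carry out of bit 7 */

    *f = flags;
    return (uint8_t)sum;
}

/* Little-endian 16-bit read: the low byte comes first in memory. */
uint16_t gb_read16(const uint8_t *mem, uint16_t addr) {
    return (uint16_t)(mem[addr] | (mem[(uint16_t)(addr + 1)] << 8));
}
```

The half-carry check looks only at the low nibbles, which is why 0x0F + 0x01 sets H even though the full result, 0x10, produces no carry.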
10. Extensions
- Debugger -- single-step, breakpoints, register/memory inspector, disassembly view. Easy with SDL2 and ImGui.
- Save states -- serialize the entire emulator state. Restore later.
- Rewind -- keep the last N states; press a key to step backward. Frequently fun.
- Audio recording -- write a .wav of the game's audio output.
- Multiple system tracks -- once you have CHIP-8 done, NES is the spectacular next step. Then SNES (much harder), Genesis.
- Game Boy Color, Game Boy Advance -- extensions of the original GB pipeline.
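Save states and rewind share one mechanism: copy the whole machine state somewhere and copy it back later. A minimal sketch of the rewind variant, assuming the state is a plain struct with no pointers so it can be copied by assignment, is a fixed-size ring buffer. Snapshot, RewindBuffer, REWIND_SLOTS, rewind_push, and rewind_pop are hypothetical names.

```c
#include <stdint.h>

#define REWIND_SLOTS 8   /* hypothetical depth; a real build would keep more */

/* Stand-in for the full emulator state. */
typedef struct {
    uint8_t  memory[4096];
    uint8_t  V[16];
    uint16_t PC;
} Snapshot;

typedef struct {
    Snapshot slots[REWIND_SLOTS];
    int head;    /* index where the next snapshot will be written */
    int count;   /* number of valid snapshots, at most REWIND_SLOTS */
} RewindBuffer;

/* Store a copy of the state, overwriting the oldest snapshot when full. */
void rewind_push(RewindBuffer *rb, const Snapshot *s) {
    rb->slots[rb->head] = *s;
    rb->head = (rb->head + 1) % REWIND_SLOTS;
    if (rb->count < REWIND_SLOTS) rb->count++;
}

/* Copy the most recent snapshot into *out. Returns 0 if none is left. */
int rewind_pop(RewindBuffer *rb, Snapshot *out) {
    if (rb->count == 0) return 0;
    rb->head = (rb->head + REWIND_SLOTS - 1) % REWIND_SLOTS;
    *out = rb->slots[rb->head];
    rb->count--;
    return 1;
}
```

Pushing once per frame and popping while a rewind key is held gives frame-by-frame stepping backward; a save state is the same struct written to a file instead of a slot.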
11. Module integration
| Module | What the emulator deepens |
|---|---|
| Sem 4 Module 1 -- C/C++ fundamentals | Substantial project. Bit manipulation, file I/O, dynamic state. |
| Sem 4 Module 3 -- Computer organization | The definitive concretization. Every concept becomes a struct field. |
| Sem 4 Module 5 -- Abstraction & interpretation | A CPU emulator is structurally identical to a bytecode VM. |
| Compiler tutorial | The dispatch loop is the same; the instruction set is the difference. |
| Operating System tutorial | Knowing what a CPU does at the metal makes OS code much clearer. |
12. Portfolio framing
What to publish:
- Source organized by component (cpu/, mmu/, display/, input/).
- README with animated GIF or video of a game running. This is the single most important demo asset.
- A test suite (opcode unit tests + ROM-based tests).
- A list of which test ROMs your emulator passes.
What to keep private:
- Game ROMs. They're copyrighted. Never include them in the repo. Use freeware test ROMs only.
Reviewer entry points:
- src/cpu/execute.c -- the opcode dispatch (the heart of the emulator).
- src/display.c -- sprite rendering.
- tests/blargg_cpu_instrs.md -- accuracy report.
- README: include the game-running GIF/video; list passing test ROMs.
Emulators are striking portfolio pieces because the output is visual and visceral. "Here is my CHIP-8 running Tetris" reads spectacularly well.
13. Local source backbone
Use Programming a Toy Computer from Scratch (build-your-own/toy-computer) to connect the emulator project to a ground-up computer architecture path. This is especially useful before attempting Game Boy or NES, where timing and hardware state dominate.
| Local chunks | Use them for | Add to this project |
|---|---|---|
| 002-004 | Binary numbers, arithmetic, logic, hexadecimal | Add bit-manipulation drills before opcode decoding. |
| 005-014 | Logic gates, memory cells, buses, instructions, control circuits | Add a diagram of the emulated CPU datapath: registers, memory, ALU, PC, flags, bus. |
| 015-017 | Toy computer implementation and example programs | Implement a toy ISA before CHIP-8 if the learner struggles with CPU state. |
| 026-032 | Cortex-M registers, instruction set, vector table, first program | Optional hardware-flavored comparison: how a real embedded CPU differs from a toy VM. |
| 033-040 | Bytecode instructions and interpreter | Directly maps to the fetch-decode-execute loop. Use it to annotate the emulator main loop. |
| 041-063 | Timers, clock setup, display, keyboard, UART, interrupts, basic I/O | Expand emulator milestones with timer, input, display, and interrupt checklists. |
| 064-172 | Flash, UI, algorithms, implementation, compilation and tests for larger toy system pieces | Use as optional capstone material for building monitor/debugger, command editor, and storage. |
Extra checkpoints from the book chunks
- Datapath checkpoint: draw the movement of one instruction through fetch, decode, execute, and writeback.
- Timing checkpoint: separate CPU cycles, timer ticks, frame refresh, and input polling in the emulator loop.
- I/O checkpoint: show how keyboard/display state changes are represented in memory or registers.
- Debug checkpoint: add a stepper that prints PC, current opcode, registers, flags, and changed memory.
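The debug checkpoint can start as a single function that formats one trace line per instruction. The sketch below is an assumption-laden example: Chip8Core is a trimmed stand-in for the full state, trace_line is a hypothetical name, and only V0 and VF are printed to keep the line short (a fuller stepper would print every register and any changed memory).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical trimmed state for tracing. */
typedef struct {
    uint8_t  memory[4096];
    uint8_t  V[16];
    uint16_t I;
    uint16_t PC;
} Chip8Core;

/* Format one trace line for the instruction at PC into buf.
   Returns what snprintf returns: the number of characters the full line needs. */
int trace_line(const Chip8Core *c, char *buf, size_t buflen) {
    unsigned op = ((unsigned)c->memory[c->PC] << 8) | c->memory[c->PC + 1];
    return snprintf(buf, buflen, "PC=%03X OP=%04X I=%03X V0=%02X VF=%02X",
                    (unsigned)c->PC, op, (unsigned)c->I,
                    (unsigned)c->V[0], (unsigned)c->V[0xF]);
}
```

Calling trace_line before each execute step and writing the result to a file also produces the golden trace used for regression tests: run a known small program once, review the output by hand, then diff against it after every change.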
14. Deep project spec
Project contract
Build an emulator for a documented target. CHIP-8 is the default target; Game Boy or NES is an advanced target. The emulator must define CPU state, memory map, instruction decode, timers, input, display, halt/error behavior, and trace/debug mode.
Source-backed reading map
| Source ID | Use for | Required output |
|---|---|---|
| build-your-own/toy-computer | machine-state discipline, instruction traces, memory maps | emulator trace contract and debugger commands |
Milestone map
| Milestone | Deliverable | Tests | Failure case |
|---|---|---|---|
| Machine state | registers, PC, memory, timers | reset-state snapshot | invalid memory access |
| Decoder | opcode table | decode fixtures for every opcode | unknown opcode trap |
| Execution | arithmetic, jumps, memory ops | instruction-level unit tests | PC update bug regression |
| Display/input | framebuffer and key state | render/input smoke tests | key wait does not spin forever |
| Timers | delay/sound timers | tick-rate tests | timer drift note |
| ROM runner | load and execute ROM | known ROM screenshot/trace | unsupported opcode reported |
| Debugger | step, breakpoint, inspect | transcript fixture | breakpoint at invalid address |
Test matrix
| Test type | Required examples |
|---|---|
| Unit | one fixture per opcode family |
| Golden | trace for a known small program |
| Integration | public test ROMs where legal/available |
| Visual | framebuffer snapshot or screenshot |
| Performance | cycles/tick rate and throttle behavior |
Design notes required
- machine.md: registers, memory map, timers, display, input.
- instruction-set.md: opcode table, operands, side effects, PC behavior.
- timing.md: cycle model, timer rate, and simplifications.
Portfolio evidence
Publish one ROM trace, one screenshot, the opcode coverage table, debugger transcript, and a limitation note separating emulation correctness from cycle-perfect accuracy.
Source
This tutorial draws from the BYO-X catalog "Emulator / Virtual Machine" entry. Cowgod's CHIP-8 reference, Pan Docs for Game Boy, and Imran Nazar's JS GB tutorial are the canonical primary sources.