Build Your Own Compiler
"If you can implement an interpreter, you can implement a compiler -- it's the same pipeline with an extra stage." -- almost true
A compiler turns source code into a lower-level form: bytecode for a VM, assembly for a CPU, or another language. After completing the Interpreter tutorial, this project takes the same source language and compiles it to bytecode that runs 10-100x faster.
This is the most ambitious project in the Systems phase. Plan for 4-8 weeks.
1. Overview & motivation
A compiler pipeline:
source -> [lexer] -> tokens -> [parser] -> AST -> [analyzer] -> typed AST
-> [IR generator] -> IR -> [optimizer] -> IR
-> [codegen] -> bytecode or assembly
For this tutorial: target a stack-based virtual machine (Crafting Interpreters Part III) or, more ambitious, x86-64 assembly (compiler with no LLVM).
What you can only learn by building one:
- Why stack machines are easier than register machines for codegen, and why production VMs (V8, JVM) use register machines anyway.
- Why single-pass compilation is hard once you have forward references and types.
- Why garbage collection matters -- your VM needs one.
- Why constant folding and dead code elimination are easy optimizations with disproportionate impact.
- Why optimization is the largest part of any production compiler -- and the easiest part to skip without losing the point.
2. Where this fits in the degree
- Phase: Systems
- Semester: 4 (Systems Programming)
- Modules deepened: Module 1 (C/Go/Rust), Module 3 (computer organization -- codegen requires understanding the target machine), Module 5 (abstraction & interpretation -- capstone of the module).
Cross-phase relevance:
- Builds on the Interpreter tutorial.
- Architectural lessons from compiler design (multiple passes, IR, optimization) transfer to the Web Browser Engine (CSS -> style -> layout is also a multi-pass pipeline) and Database (relational) (SQL -> logical plan -> physical plan).
3. Prerequisites
- Complete the Interpreter tutorial first. This tutorial assumes you have a working tree-walking interpreter and want to make it faster.
- Comfort with bytecode (or assembly, if doing native codegen).
- Basic familiarity with stack machines (JVM, CPython bytecode, WebAssembly).
4. Theory & research
Required reading
- Robert Nystrom, Crafting Interpreters, Part III -- craftinginterpreters.com. Builds a bytecode VM compiler for Lox in C. Roughly 2,500 lines. â the canonical text for this project.
- Thorsten Ball, Writing A Compiler In Go -- compilerbook.com. Builds on his interpreter book. Stack-based VM. Excellent.
Strongly recommended
- Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools ("Dragon Book", 2nd edition) -- Chapters 6 (intermediate code), 8 (code generation), 9-10 (optimization). The standard reference.
- Appel, Modern Compiler Implementation in {C, Java, ML} -- alternative textbook. Better on register allocation.
- Engineering a Compiler -- Cooper & Torczon. The third major textbook. Strong on optimization.
For specific topics
- Lattner & Adve, "LLVM: A Compilation Framework" (2004) -- the canonical SSA-based compiler infrastructure.
- Aycock, "A Brief History of Just-In-Time" (ACM, 2003) -- short paper on JIT compilation.
What to skip on the first pass
- LLVM. Wonderful for production work; obscures what you're trying to learn.
- Full optimization passes (loop unrolling, vectorization). Stick to constant folding, dead-code, peephole.
5. Curated tutorial list (from BYO-X)
This corresponds to the "Programming Language" section, the higher-effort entries (interpreter-only is the previous tutorial):
- Assembly: Jonesforth -- github.com/nornagon/jonesforth -- a self-hosting Forth in x86 assembly. Heavy stuff; for very ambitious learners.
- C: Baby's First Garbage Collector -- Robert Nystrom -- the GC half of the compiler project.
- C: Writing a Simple Garbage Collector in C
- C: A C & x86 version of the "Let's Build a Compiler" by Jack Crenshaw -- Jack Crenshaw's original series (Pascal, 1988) and modern ports
- C: A journey explaining how to build a compiler from scratch
- C++: Writing Your Own Toy Compiler Using Flex
- C++: How to Create a Compiler [video]
- C++: Kaleidoscope: Implementing a Language with LLVM -- LLVM tutorial -- if you want to learn LLVM specifically
- F#: Understanding Parser Combinators
- Elixir: Demystifying compilers by writing your own [video]
- Go: The Super Tiny Compiler -- jamiebuilds/the-super-tiny-compiler -- Lisp-to-C in 200 lines, heavily commented
- Go: Lexical Scanning in Go [video] -- Rob Pike's talk
- Haskell: Let's Build a Compiler, Write You a Haskell, Write Yourself a Scheme in 48 Hours, Write You A Scheme
- Java: Crafting interpreters: A handbook for making programming languages -- Part III â recommended primary
- Java: Creating JVM Language
- JavaScript: The Super Tiny Compiler
- OCaml: Writing a C Compiler -- Nora Sandler's book â for "compile to assembly" path
- OCaml: Writing a Lisp, the series
- Pascal: Let's Build a Compiler -- Jack Crenshaw's original
- Python: From Source Code To Machine Code: Build Your Own Compiler From Scratch -- covers x86 codegen
- Racket: Beautiful Racket: How to make your own programming languages with Racket -- beautifulracket.com
- Ruby: A Compiler From Scratch
- Rust: Learning Parser Combinators With Rust
- TypeScript: Build your own WebAssembly Compiler
6. Recommended primary path
There are two clearly distinct paths. Pick one before starting.
Path A: Bytecode VM (recommended)
Robert Nystrom, Crafting Interpreters, Part III (Java/C). You compile Lox to bytecode and execute on a stack-based VM you also build. Includes a mark-sweep GC, hash table, single-pass Pratt parser. ~2,500 lines of C.
This is the right path for most learners. You get a real, executable language with realistic performance.
Path B: Compile to native assembly
Nora Sandler, Writing a C Compiler (OCaml/Python). Compiles a subset of C to x86-64 assembly. Two-volume book. Very rigorous.
This is the right path if you want to engage with register allocation, calling conventions, and the actual machine. Substantially more work.
For this degree, Path A is the default. Path B is an excellent capstone-tier project if you have the time.
7. Implementation milestones (Path A: bytecode VM, following Nystrom Part III)
Milestone 1: Chunk and OpCodes
A chunk is a flat byte array of opcodes and operands. Operands index into a constant pool.
typedef enum {
OP_CONSTANT, // operand: constant index
OP_ADD, OP_SUBTRACT, OP_MULTIPLY, OP_DIVIDE,
OP_NEGATE,
OP_RETURN,
} OpCode;
typedef struct {
int count, capacity;
uint8_t *code;
int *lines;
ValueArray constants;
} Chunk;
Add a disassembler (OP_CONSTANT 1.0 ; line 5). You will use it constantly.
Evidence: Hand-emit bytecode for (1.2 + 3.4) / 5.6 and disassemble it.
Milestone 2: Stack-based VM
The VM has a value stack. Each opcode reads its operands from the stack and pushes its result.
typedef struct {
Chunk *chunk;
uint8_t *ip; // instruction pointer
Value stack[256];
Value *stack_top;
} VM;
InterpretResult run(VM *vm) {
for (;;) {
uint8_t instruction = *vm->ip++;
switch (instruction) {
case OP_CONSTANT: {
Value c = vm->chunk->constants.values[*vm->ip++];
push(vm, c);
break;
}
case OP_ADD: {
double b = AS_NUMBER(pop(vm));
double a = AS_NUMBER(pop(vm));
push(vm, NUMBER_VAL(a + b));
break;
}
// ...
case OP_RETURN: return INTERPRET_OK;
}
}
}
Evidence: VM executes the hand-emitted bytecode from Milestone 1 and prints the right answer.
Milestone 3: Single-pass compiler (Pratt parser)
Replace the AST-based parser from the interpreter with a single-pass parser that emits bytecode directly. Use a Pratt parser -- the cleanest way to handle operator precedence in one pass.
typedef enum {
PREC_NONE, PREC_ASSIGNMENT, PREC_OR, PREC_AND,
PREC_EQUALITY, PREC_COMPARISON, PREC_TERM,
PREC_FACTOR, PREC_UNARY, PREC_CALL, PREC_PRIMARY,
} Precedence;
typedef void (*ParseFn)();
typedef struct { ParseFn prefix; ParseFn infix; Precedence precedence; } ParseRule;
ParseRule rules[] = {
[TOKEN_LEFT_PAREN] = {grouping, NULL, PREC_NONE},
[TOKEN_MINUS] = {unary, binary, PREC_TERM},
[TOKEN_PLUS] = {NULL, binary, PREC_TERM},
[TOKEN_STAR] = {NULL, binary, PREC_FACTOR},
// ...
};
Evidence: Compile and run (1 + 2) * 3 - 4 end to end.
Milestone 4: Strings and constants
Add a string type. Intern strings so == is pointer equality. Add hash table for the symbol table.
Evidence: Compile and run print "hello" + " " + "world";.
Milestone 5: Variables and scope
Local variables live on the VM stack at compile-time-known offsets. Globals live in a hash table.
typedef struct { Token name; int depth; } Local;
typedef struct { Local locals[256]; int local_count; int scope_depth; } Compiler;
Evidence: Run a program with nested scopes and shadowing.
Milestone 6: Control flow (jumps)
if, while, for, and, or. Emit OP_JUMP_IF_FALSE with a placeholder offset, then back-patch the offset once you know the target.
int emit_jump(uint8_t instruction) {
emit_byte(instruction);
emit_byte(0xff); emit_byte(0xff);
return current_chunk()->count - 2;
}
void patch_jump(int offset) {
int jump = current_chunk()->count - offset - 2;
current_chunk()->code[offset] = (jump >> 8) & 0xff;
current_chunk()->code[offset + 1] = jump & 0xff;
}
Evidence: Run FizzBuzz.
Milestone 7: Functions and call frames
Each function compiles to its own chunk. Calls push a CallFrame containing the function and its slot of the value stack.
typedef struct { ObjFunction *function; uint8_t *ip; Value *slots; } CallFrame;
Evidence: Run the closure-counter example from the interpreter tutorial. Performance compared to the interpreter: 10-100x faster.
Milestone 8: Closures
Closures capture variables from enclosing scopes via upvalues. The hardest single chapter in Nystrom. Read it twice.
Milestone 9: Garbage collection (mark-sweep)
Tracing GC. Roots: VM stack, globals, open upvalues, current call frames. Mark: visit reachable objects. Sweep: free unmarked.
Nystrom's GC chapter is a small masterpiece. Read it before coding.
Evidence: Run an allocation-heavy program for a few seconds; show memory usage stays bounded.
Milestone 10 (optional): Classes, methods, inheritance
If you want a class-based language. Otherwise stop here.
8. Tests & evidence
| Test | How |
|---|---|
| Correctness -- all interpreter tests | Every program that ran on the interpreter must run on the VM. |
| Bytecode disassembly | Hand-verify that a small program produces the expected ops. |
| Performance | Fibonacci benchmark: VM should be 10-100x faster than the tree-walker. |
| GC correctness | Long-running test with lots of allocation must not OOM. |
| Stack overflow | Deep recursion produces a clean error, not a crash. |
9. Common pitfalls
- Forgetting that single-pass compilation has no AST. You can't easily look ahead. Pratt parsing was invented to make this tractable.
- Back-patching errors. Off-by-one mistakes in jump offsets. Use the disassembler.
- GC during compilation. Strings allocated during parsing are garbage from the GC's perspective. Either pin them or run the GC only at safe points.
- Forgetting to advance the IP. Every opcode that reads operands must advance past them.
- Stack discipline broken. Every binary op pops 2 and pushes 1. If you ever leave the stack in a different state than expected, downstream code corrupts.
- Mixing compile-time and runtime errors. Compile errors come from the parser. Runtime errors come from the VM. They have different error paths.
- No locality of reference for the value array. The VM hot loop is a
switchon opcodes -- compiler optimization opportunities matter (computed gotos, threading). For correctness, ignore. For performance, look up the literature.
10. Extensions
- Threaded interpreter (computed goto). 20-50% speedup on Linux/Clang. Nystrom mentions it.
- JIT compilation. Translate hot bytecode to native at runtime. See V8, LuaJIT.
- Static types -- add a type-check pass after parsing.
- Better GC -- generational, copying, or incremental. Nystrom does mark-sweep.
- Register-based VM -- Lua 5+ uses one. Bytecode is denser; codegen is harder.
- WebAssembly target -- emit
.wasm. Modern, portable, fast. - Self-hosting -- implement your language's compiler in your language. The ultimate proof.
11. Module integration
| Module | What the compiler deepens |
|---|---|
| Sem 4 Module 1 -- C fundamentals | Largest C project in the curriculum (Nystrom's clox is ~2,500 lines). |
| Sem 4 Module 3 -- Computer organization | If you do Path B (assembly target), you live in the target machine's calling conventions and registers. |
| Sem 4 Module 5 -- Abstraction & interpretation | The capstone project. |
| Interpreter tutorial | Direct prerequisite. |
| Memory Allocator tutorial | Your VM's allocator and GC are the natural use case. |
| Web Browser Engine | Multi-pass pipeline architecture transfers. |
| Database (relational) | SQL -> plan transformation is conceptually similar to source -> bytecode. |
12. Portfolio framing
What to publish:
- Source in
src/(lexer, compiler, vm, gc) with clear file-per-pass. - A
tests/folder of.loxprograms and expected outputs. - A
benchmarks/folder comparing your VM to the tree-walker. - A README with a "performance journey" section: what was slow first, what optimizations you made, what's still slow.
Reviewer entry points:
src/vm.c-- the mainrun()interpreter loop.src/compiler.c-- Pratt parser and bytecode emission.src/gc.c-- mark-sweep collector.- README must include: language demo, benchmark numbers, list of known limitations.
This is a serious portfolio piece. A complete clox-equivalent is the kind of project that gets read in interviews.
13. Local source backbone
Use these local chunk sets to deepen the compiler path:
- Writing a C Compiler (
build-your-own/c-compiler-nora-sandler) - From Source Code to Machine Code (
build-your-own/source-to-machine-code-james-smith) - Build Your Own Programming Language (
build-your-own/programming-language-jeffery)
| Local chunks | Use them for | Add to this project |
|---|---|---|
Sandler 002-007 | Minimal C compiler setup, lexing/parsing, AST, first codegen steps | Add a "native compiler side quest" that compiles constants, unary ops, and simple returns. |
Sandler 008-013 | Expressions, operators, precedence, and assembly output | Add precedence and associativity tests that compare AST and generated assembly. |
Sandler 014-024 | Control flow, conditionals, comparisons, and lower-level instruction details | Add branch-generation tests with expected labels and condition-code behavior. |
Smith 001-008 | S-expressions, scopes, VM design, labels, function definitions | Add a stack-VM alternative design note before choosing bytecode versus native code. |
Smith 009-017 | x64 assembly, calls/returns, mmap/ctypes, executable layout, pointers, strings | Add a native-runtime extension for calls, stack frames, and simple heap/string operations. |
Jeffery 061-066 | Intermediate code, labels, temporaries, control-flow codegen | Require an IR dump between AST and bytecode/native code. |
Jeffery 084-101 | Bytecode file format, bytecode interpreter, bytecode generation, native code generation | Add explicit bytecode-format documentation and a disassembler test suite. |
Jeffery 112-117 | Reference counting, heap regions, mark/live-data traversal, reclaiming memory | Expand GC evidence beyond "does not crash": allocation trace, reachable set, and collection report. |
Extra checkpoints from the book chunks
- IR checkpoint: compile a program to AST, IR, bytecode/assembly, and final output; save all four artifacts.
- Calling-convention checkpoint: document stack frame layout for one function call with two parameters and one local.
- Branch checkpoint: generate labels and jumps for
if,while,&&, and||; include negative tests. - Runtime checkpoint: run an allocation-heavy program and prove the collector or ownership strategy reclaims memory.
14. Deep project spec
Project contract
Build a compiler for the interpreter language or a documented subset. The default target is bytecode for a stack VM. The advanced target is native assembly or executable bytes. The compiler must define source grammar, bytecode or IR format, call-frame layout, constants, globals/locals, and error boundaries between compile time and runtime.
Source-backed reading map
| Source ID | Use for | Required output |
|---|---|---|
build-your-own/c-compiler-nora-sandler | native-code staging, C subset, x64 control flow | assembly fixtures and calling-convention note |
build-your-own/source-to-machine-code-james-smith | IR, virtual instructions, machine-code transition | IR dump and native-code checkpoint |
build-your-own/programming-language-jeffery | parser/analyzer foundations | front-end consistency with interpreter |
Milestone map
| Milestone | Deliverable | Tests | Failure case |
|---|---|---|---|
| Bytecode chunk | opcode format, constant pool, disassembler | golden disassembly | invalid opcode trap |
| VM | stack execution for arithmetic | hand-written bytecode fixtures | stack underflow |
| Compiler front end | source to bytecode | expression and statement fixtures | syntax error reports source span |
| Variables/control flow | locals, globals, jumps, loops | FizzBuzz and nested-scope tests | jump patching error caught |
| Functions | call frames, returns, recursion | factorial and closure-counter fixtures | arity mismatch |
| GC or memory policy | mark-sweep or explicit limitation | allocation stress test | leaked/unbounded allocation documented |
| Native checkpoint | arithmetic or return compiled to assembly/bytes | compile-run fixture | wrong stack alignment or ABI violation |
Test matrix
| Test type | Required examples |
|---|---|
| Golden | bytecode disassembly and generated assembly snapshots |
| Runtime | compiled output equals interpreter output |
| Differential | same source through interpreter and compiler |
| Fuzz/property | random valid expressions compile without VM crash |
| Benchmark | interpreter vs bytecode VM on loops/functions |
Design notes required
bytecode.md: opcode table, operand widths, stack effect, invalid-op behavior.vm.md: stack layout, call frames, globals, closures, memory ownership.codegen.md: what is compiled directly, what remains runtime helper logic.native-checkpoint.md: target OS/ABI, registers used, stack alignment, return convention.
Portfolio evidence
Publish one command that prints source, bytecode, and output; one benchmark table against the interpreter; one disassembler transcript; and one documented compiler bug fixed by a regression test.
Source
This tutorial draws from the BYO-X catalog "Programming Language" section. Crafting Interpreters Part III is the canonical reference for the bytecode VM path; Nora Sandler's book is the canonical reference for native compilation.