Build Your Own Compiler

"If you can implement an interpreter, you can implement a compiler -- it's the same pipeline with an extra stage." -- almost true

A compiler turns source code into a lower-level form: bytecode for a VM, assembly for a CPU, or another language. After completing the Interpreter tutorial, this project takes the same source language and compiles it to bytecode that runs 10-100x faster.

This is the most ambitious project in the Systems phase. Plan for 4-8 weeks.

1. Overview & motivation

A compiler pipeline:

source -> [lexer] -> tokens -> [parser] -> AST -> [analyzer] -> typed AST
       -> [IR generator] -> IR -> [optimizer] -> IR
       -> [codegen] -> bytecode or assembly

For this tutorial: target a stack-based virtual machine (Crafting Interpreters Part III) or, more ambitious, x86-64 assembly (compiler with no LLVM).

What you can only learn by building one:

Why stack machines are easier than register machines for codegen, and why production VMs (V8, JVM) use register machines anyway.
Why single-pass compilation is hard once you have forward references and types.
Why garbage collection matters -- your VM needs one.
Why constant folding and dead code elimination are easy optimizations with disproportionate impact.
Why optimization is the largest part of any production compiler -- and the easiest part to skip without losing the point.

2. Where this fits in the degree

Phase: Systems
Semester: 4 (Systems Programming)
Modules deepened: Module 1 (C/Go/Rust), Module 3 (computer organization -- codegen requires understanding the target machine), Module 5 (abstraction & interpretation -- capstone of the module).

Cross-phase relevance:

Builds on the Interpreter tutorial.
Architectural lessons from compiler design (multiple passes, IR, optimization) transfer to the Web Browser Engine (CSS -> style -> layout is also a multi-pass pipeline) and Database (relational) (SQL -> logical plan -> physical plan).

3. Prerequisites

Complete the Interpreter tutorial first. This tutorial assumes you have a working tree-walking interpreter and want to make it faster.
Comfort with bytecode (or assembly, if doing native codegen).
Basic familiarity with stack machines (JVM, CPython bytecode, WebAssembly).

4. Theory & research

Required reading

Robert Nystrom, Crafting Interpreters, Part III -- craftinginterpreters.com. Builds a bytecode VM compiler for Lox in C. Roughly 2,500 lines. â the canonical text for this project.
Thorsten Ball, Writing A Compiler In Go -- compilerbook.com. Builds on his interpreter book. Stack-based VM. Excellent.

Strongly recommended

Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools ("Dragon Book", 2nd edition) -- Chapters 6 (intermediate code), 8 (code generation), 9-10 (optimization). The standard reference.
Appel, Modern Compiler Implementation in {C, Java, ML} -- alternative textbook. Better on register allocation.
Engineering a Compiler -- Cooper & Torczon. The third major textbook. Strong on optimization.

For specific topics

Lattner & Adve, "LLVM: A Compilation Framework" (2004) -- the canonical SSA-based compiler infrastructure.
Aycock, "A Brief History of Just-In-Time" (ACM, 2003) -- short paper on JIT compilation.

What to skip on the first pass

LLVM. Wonderful for production work; obscures what you're trying to learn.
Full optimization passes (loop unrolling, vectorization). Stick to constant folding, dead-code, peephole.

5. Curated tutorial list (from BYO-X)

This corresponds to the "Programming Language" section, the higher-effort entries (interpreter-only is the previous tutorial):

Assembly: Jonesforth -- github.com/nornagon/jonesforth -- a self-hosting Forth in x86 assembly. Heavy stuff; for very ambitious learners.
C: Baby's First Garbage Collector -- Robert Nystrom -- the GC half of the compiler project.
C: Writing a Simple Garbage Collector in C
C: A C & x86 version of the "Let's Build a Compiler" by Jack Crenshaw -- Jack Crenshaw's original series (Pascal, 1988) and modern ports
C: A journey explaining how to build a compiler from scratch
C++: Writing Your Own Toy Compiler Using Flex
C++: How to Create a Compiler [video]
C++: Kaleidoscope: Implementing a Language with LLVM -- LLVM tutorial -- if you want to learn LLVM specifically
F#: Understanding Parser Combinators
Elixir: Demystifying compilers by writing your own [video]
Go: The Super Tiny Compiler -- jamiebuilds/the-super-tiny-compiler -- Lisp-to-C in 200 lines, heavily commented
Go: Lexical Scanning in Go [video] -- Rob Pike's talk
Haskell: Let's Build a Compiler, Write You a Haskell, Write Yourself a Scheme in 48 Hours, Write You A Scheme
Java: Crafting interpreters: A handbook for making programming languages -- Part III â recommended primary
Java: Creating JVM Language
JavaScript: The Super Tiny Compiler
OCaml: Writing a C Compiler -- Nora Sandler's book â for "compile to assembly" path
OCaml: Writing a Lisp, the series
Pascal: Let's Build a Compiler -- Jack Crenshaw's original
Python: From Source Code To Machine Code: Build Your Own Compiler From Scratch -- covers x86 codegen
Racket: Beautiful Racket: How to make your own programming languages with Racket -- beautifulracket.com
Ruby: A Compiler From Scratch
Rust: Learning Parser Combinators With Rust
TypeScript: Build your own WebAssembly Compiler

6. Recommended primary path

There are two clearly distinct paths. Pick one before starting.

Path A: Bytecode VM (recommended)

Robert Nystrom, Crafting Interpreters, Part III (Java/C). You compile Lox to bytecode and execute on a stack-based VM you also build. Includes a mark-sweep GC, hash table, single-pass Pratt parser. ~2,500 lines of C.

This is the right path for most learners. You get a real, executable language with realistic performance.

Path B: Compile to native assembly

Nora Sandler, Writing a C Compiler (OCaml/Python). Compiles a subset of C to x86-64 assembly. Two-volume book. Very rigorous.

This is the right path if you want to engage with register allocation, calling conventions, and the actual machine. Substantially more work.

For this degree, Path A is the default. Path B is an excellent capstone-tier project if you have the time.

7. Implementation milestones (Path A: bytecode VM, following Nystrom Part III)

Milestone 1: Chunk and OpCodes

A chunk is a flat byte array of opcodes and operands. Operands index into a constant pool.

typedef enum {
    OP_CONSTANT,   // operand: constant index
    OP_ADD, OP_SUBTRACT, OP_MULTIPLY, OP_DIVIDE,
    OP_NEGATE,
    OP_RETURN,
} OpCode;

typedef struct {
    int count, capacity;
    uint8_t *code;
    int *lines;
    ValueArray constants;
} Chunk;

Add a disassembler (OP_CONSTANT 1.0 ; line 5). You will use it constantly.

Evidence: Hand-emit bytecode for (1.2 + 3.4) / 5.6 and disassemble it.

Milestone 2: Stack-based VM

The VM has a value stack. Each opcode reads its operands from the stack and pushes its result.

typedef struct {
    Chunk *chunk;
    uint8_t *ip;          // instruction pointer
    Value stack[256];
    Value *stack_top;
} VM;

InterpretResult run(VM *vm) {
    for (;;) {
        uint8_t instruction = *vm->ip++;
        switch (instruction) {
            case OP_CONSTANT: {
                Value c = vm->chunk->constants.values[*vm->ip++];
                push(vm, c);
                break;
            }
            case OP_ADD: {
                double b = AS_NUMBER(pop(vm));
                double a = AS_NUMBER(pop(vm));
                push(vm, NUMBER_VAL(a + b));
                break;
            }
            // ...
            case OP_RETURN: return INTERPRET_OK;
        }
    }
}

Evidence: VM executes the hand-emitted bytecode from Milestone 1 and prints the right answer.

Milestone 3: Single-pass compiler (Pratt parser)

Replace the AST-based parser from the interpreter with a single-pass parser that emits bytecode directly. Use a Pratt parser -- the cleanest way to handle operator precedence in one pass.

typedef enum {
    PREC_NONE, PREC_ASSIGNMENT, PREC_OR, PREC_AND,
    PREC_EQUALITY, PREC_COMPARISON, PREC_TERM,
    PREC_FACTOR, PREC_UNARY, PREC_CALL, PREC_PRIMARY,
} Precedence;

typedef void (*ParseFn)();
typedef struct { ParseFn prefix; ParseFn infix; Precedence precedence; } ParseRule;

ParseRule rules[] = {
    [TOKEN_LEFT_PAREN] = {grouping, NULL,   PREC_NONE},
    [TOKEN_MINUS]      = {unary,    binary, PREC_TERM},
    [TOKEN_PLUS]       = {NULL,     binary, PREC_TERM},
    [TOKEN_STAR]       = {NULL,     binary, PREC_FACTOR},
    // ...
};

Evidence: Compile and run (1 + 2) * 3 - 4 end to end.

Milestone 4: Strings and constants

Add a string type. Intern strings so == is pointer equality. Add hash table for the symbol table.

Evidence: Compile and run print "hello" + " " + "world";.

Milestone 5: Variables and scope

Local variables live on the VM stack at compile-time-known offsets. Globals live in a hash table.

typedef struct { Token name; int depth; } Local;
typedef struct { Local locals[256]; int local_count; int scope_depth; } Compiler;

Evidence: Run a program with nested scopes and shadowing.

Milestone 6: Control flow (jumps)

if, while, for, and, or. Emit OP_JUMP_IF_FALSE with a placeholder offset, then back-patch the offset once you know the target.

int emit_jump(uint8_t instruction) {
    emit_byte(instruction);
    emit_byte(0xff); emit_byte(0xff);
    return current_chunk()->count - 2;
}

void patch_jump(int offset) {
    int jump = current_chunk()->count - offset - 2;
    current_chunk()->code[offset]     = (jump >> 8) & 0xff;
    current_chunk()->code[offset + 1] = jump & 0xff;
}

Evidence: Run FizzBuzz.

Milestone 7: Functions and call frames

Each function compiles to its own chunk. Calls push a CallFrame containing the function and its slot of the value stack.

typedef struct { ObjFunction *function; uint8_t *ip; Value *slots; } CallFrame;

Evidence: Run the closure-counter example from the interpreter tutorial. Performance compared to the interpreter: 10-100x faster.

Milestone 8: Closures

Closures capture variables from enclosing scopes via upvalues. The hardest single chapter in Nystrom. Read it twice.

Milestone 9: Garbage collection (mark-sweep)

Tracing GC. Roots: VM stack, globals, open upvalues, current call frames. Mark: visit reachable objects. Sweep: free unmarked.

Nystrom's GC chapter is a small masterpiece. Read it before coding.

Evidence: Run an allocation-heavy program for a few seconds; show memory usage stays bounded.

Milestone 10 (optional): Classes, methods, inheritance

If you want a class-based language. Otherwise stop here.

8. Tests & evidence

Test	How
Correctness -- all interpreter tests	Every program that ran on the interpreter must run on the VM.
Bytecode disassembly	Hand-verify that a small program produces the expected ops.
Performance	Fibonacci benchmark: VM should be 10-100x faster than the tree-walker.
GC correctness	Long-running test with lots of allocation must not OOM.
Stack overflow	Deep recursion produces a clean error, not a crash.

9. Common pitfalls

Forgetting that single-pass compilation has no AST. You can't easily look ahead. Pratt parsing was invented to make this tractable.
Back-patching errors. Off-by-one mistakes in jump offsets. Use the disassembler.
GC during compilation. Strings allocated during parsing are garbage from the GC's perspective. Either pin them or run the GC only at safe points.
Forgetting to advance the IP. Every opcode that reads operands must advance past them.
Stack discipline broken. Every binary op pops 2 and pushes 1. If you ever leave the stack in a different state than expected, downstream code corrupts.
Mixing compile-time and runtime errors. Compile errors come from the parser. Runtime errors come from the VM. They have different error paths.
No locality of reference for the value array. The VM hot loop is a switch on opcodes -- compiler optimization opportunities matter (computed gotos, threading). For correctness, ignore. For performance, look up the literature.

10. Extensions

Threaded interpreter (computed goto). 20-50% speedup on Linux/Clang. Nystrom mentions it.
JIT compilation. Translate hot bytecode to native at runtime. See V8, LuaJIT.
Static types -- add a type-check pass after parsing.
Better GC -- generational, copying, or incremental. Nystrom does mark-sweep.
Register-based VM -- Lua 5+ uses one. Bytecode is denser; codegen is harder.
WebAssembly target -- emit .wasm. Modern, portable, fast.
Self-hosting -- implement your language's compiler in your language. The ultimate proof.

11. Module integration

Module	What the compiler deepens
Sem 4 Module 1 -- C fundamentals	Largest C project in the curriculum (Nystrom's clox is ~2,500 lines).
Sem 4 Module 3 -- Computer organization	If you do Path B (assembly target), you live in the target machine's calling conventions and registers.
Sem 4 Module 5 -- Abstraction & interpretation	The capstone project.
Interpreter tutorial	Direct prerequisite.
Memory Allocator tutorial	Your VM's allocator and GC are the natural use case.
Web Browser Engine	Multi-pass pipeline architecture transfers.
Database (relational)	SQL -> plan transformation is conceptually similar to source -> bytecode.

12. Portfolio framing

What to publish:

Source in src/ (lexer, compiler, vm, gc) with clear file-per-pass.
A tests/ folder of .lox programs and expected outputs.
A benchmarks/ folder comparing your VM to the tree-walker.
A README with a "performance journey" section: what was slow first, what optimizations you made, what's still slow.

Reviewer entry points:

src/vm.c -- the main run() interpreter loop.
src/compiler.c -- Pratt parser and bytecode emission.
src/gc.c -- mark-sweep collector.
README must include: language demo, benchmark numbers, list of known limitations.

This is a serious portfolio piece. A complete clox-equivalent is the kind of project that gets read in interviews.

13. Local source backbone

Use these local chunk sets to deepen the compiler path:

Writing a C Compiler (build-your-own/c-compiler-nora-sandler)
From Source Code to Machine Code (build-your-own/source-to-machine-code-james-smith)
Build Your Own Programming Language (build-your-own/programming-language-jeffery)

Local chunks	Use them for	Add to this project
Sandler `002`-`007`	Minimal C compiler setup, lexing/parsing, AST, first codegen steps	Add a "native compiler side quest" that compiles constants, unary ops, and simple returns.
Sandler `008`-`013`	Expressions, operators, precedence, and assembly output	Add precedence and associativity tests that compare AST and generated assembly.
Sandler `014`-`024`	Control flow, conditionals, comparisons, and lower-level instruction details	Add branch-generation tests with expected labels and condition-code behavior.
Smith `001`-`008`	S-expressions, scopes, VM design, labels, function definitions	Add a stack-VM alternative design note before choosing bytecode versus native code.
Smith `009`-`017`	x64 assembly, calls/returns, mmap/ctypes, executable layout, pointers, strings	Add a native-runtime extension for calls, stack frames, and simple heap/string operations.
Jeffery `061`-`066`	Intermediate code, labels, temporaries, control-flow codegen	Require an IR dump between AST and bytecode/native code.
Jeffery `084`-`101`	Bytecode file format, bytecode interpreter, bytecode generation, native code generation	Add explicit bytecode-format documentation and a disassembler test suite.
Jeffery `112`-`117`	Reference counting, heap regions, mark/live-data traversal, reclaiming memory	Expand GC evidence beyond "does not crash": allocation trace, reachable set, and collection report.

Extra checkpoints from the book chunks

IR checkpoint: compile a program to AST, IR, bytecode/assembly, and final output; save all four artifacts.
Calling-convention checkpoint: document stack frame layout for one function call with two parameters and one local.
Branch checkpoint: generate labels and jumps for if, while, &&, and ||; include negative tests.
Runtime checkpoint: run an allocation-heavy program and prove the collector or ownership strategy reclaims memory.

14. Deep project spec

Project contract

Build a compiler for the interpreter language or a documented subset. The default target is bytecode for a stack VM. The advanced target is native assembly or executable bytes. The compiler must define source grammar, bytecode or IR format, call-frame layout, constants, globals/locals, and error boundaries between compile time and runtime.

Source-backed reading map

Source ID	Use for	Required output
`build-your-own/c-compiler-nora-sandler`	native-code staging, C subset, x64 control flow	assembly fixtures and calling-convention note
`build-your-own/source-to-machine-code-james-smith`	IR, virtual instructions, machine-code transition	IR dump and native-code checkpoint
`build-your-own/programming-language-jeffery`	parser/analyzer foundations	front-end consistency with interpreter

Milestone map

Milestone	Deliverable	Tests	Failure case
Bytecode chunk	opcode format, constant pool, disassembler	golden disassembly	invalid opcode trap
VM	stack execution for arithmetic	hand-written bytecode fixtures	stack underflow
Compiler front end	source to bytecode	expression and statement fixtures	syntax error reports source span
Variables/control flow	locals, globals, jumps, loops	FizzBuzz and nested-scope tests	jump patching error caught
Functions	call frames, returns, recursion	factorial and closure-counter fixtures	arity mismatch
GC or memory policy	mark-sweep or explicit limitation	allocation stress test	leaked/unbounded allocation documented
Native checkpoint	arithmetic or return compiled to assembly/bytes	compile-run fixture	wrong stack alignment or ABI violation

Test matrix

Test type	Required examples
Golden	bytecode disassembly and generated assembly snapshots
Runtime	compiled output equals interpreter output
Differential	same source through interpreter and compiler
Fuzz/property	random valid expressions compile without VM crash
Benchmark	interpreter vs bytecode VM on loops/functions

Design notes required

bytecode.md: opcode table, operand widths, stack effect, invalid-op behavior.
vm.md: stack layout, call frames, globals, closures, memory ownership.
codegen.md: what is compiled directly, what remains runtime helper logic.
native-checkpoint.md: target OS/ABI, registers used, stack alignment, return convention.

Portfolio evidence

Publish one command that prints source, bytecode, and output; one benchmark table against the interpreter; one disassembler transcript; and one documented compiler bug fixed by a regression test.

Source

This tutorial draws from the BYO-X catalog "Programming Language" section. Crafting Interpreters Part III is the canonical reference for the bytecode VM path; Nora Sandler's book is the canonical reference for native compilation.

1. Overview & motivation​

2. Where this fits in the degree​

3. Prerequisites​

4. Theory & research​

Required reading​

Strongly recommended​

For specific topics​

What to skip on the first pass​

5. Curated tutorial list (from BYO-X)​

6. Recommended primary path​

Path A: Bytecode VM (recommended)​

Path B: Compile to native assembly​

7. Implementation milestones (Path A: bytecode VM, following Nystrom Part III)​

Milestone 1: Chunk and OpCodes​

Milestone 2: Stack-based VM​

Milestone 3: Single-pass compiler (Pratt parser)​

Milestone 4: Strings and constants​

Milestone 5: Variables and scope​

Milestone 6: Control flow (jumps)​

Milestone 7: Functions and call frames​

Milestone 8: Closures​

Milestone 9: Garbage collection (mark-sweep)​

Milestone 10 (optional): Classes, methods, inheritance​

8. Tests & evidence​

9. Common pitfalls​

10. Extensions​

11. Module integration​

12. Portfolio framing​

13. Local source backbone​

Extra checkpoints from the book chunks​

14. Deep project spec​

Project contract​

Source-backed reading map​

Milestone map​

Test matrix​

Design notes required​

Portfolio evidence​

Source​