Skip to main content

Build Your Own Compiler

"If you can implement an interpreter, you can implement a compiler -- it's the same pipeline with an extra stage." -- almost true

A compiler turns source code into a lower-level form: bytecode for a VM, assembly for a CPU, or another language. After completing the Interpreter tutorial, this project takes the same source language and compiles it to bytecode that runs 10-100x faster.

This is the most ambitious project in the Systems phase. Plan for 4-8 weeks.


1. Overview & motivation

A compiler pipeline:

source -> [lexer] -> tokens -> [parser] -> AST -> [analyzer] -> typed AST
-> [IR generator] -> IR -> [optimizer] -> IR
-> [codegen] -> bytecode or assembly

For this tutorial: target a stack-based virtual machine (Crafting Interpreters Part III) or, more ambitious, x86-64 assembly (compiler with no LLVM).

What you can only learn by building one:

  • Why stack machines are easier than register machines for codegen, and why production VMs (V8, JVM) use register machines anyway.
  • Why single-pass compilation is hard once you have forward references and types.
  • Why garbage collection matters -- your VM needs one.
  • Why constant folding and dead code elimination are easy optimizations with disproportionate impact.
  • Why optimization is the largest part of any production compiler -- and the easiest part to skip without losing the point.

2. Where this fits in the degree

  • Phase: Systems
  • Semester: 4 (Systems Programming)
  • Modules deepened: Module 1 (C/Go/Rust), Module 3 (computer organization -- codegen requires understanding the target machine), Module 5 (abstraction & interpretation -- capstone of the module).

Cross-phase relevance:


3. Prerequisites

  • Complete the Interpreter tutorial first. This tutorial assumes you have a working tree-walking interpreter and want to make it faster.
  • Comfort with bytecode (or assembly, if doing native codegen).
  • Basic familiarity with stack machines (JVM, CPython bytecode, WebAssembly).

4. Theory & research

Required reading

  • Robert Nystrom, Crafting Interpreters, Part III -- craftinginterpreters.com. Builds a bytecode VM compiler for Lox in C. Roughly 2,500 lines. ⭐ the canonical text for this project.
  • Thorsten Ball, Writing A Compiler In Go -- compilerbook.com. Builds on his interpreter book. Stack-based VM. Excellent.
  • Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools ("Dragon Book", 2nd edition) -- Chapters 6 (intermediate code), 8 (code generation), 9-10 (optimization). The standard reference.
  • Appel, Modern Compiler Implementation in {C, Java, ML} -- alternative textbook. Better on register allocation.
  • Engineering a Compiler -- Cooper & Torczon. The third major textbook. Strong on optimization.

For specific topics

  • Lattner & Adve, "LLVM: A Compilation Framework" (2004) -- the canonical SSA-based compiler infrastructure.
  • Aycock, "A Brief History of Just-In-Time" (ACM, 2003) -- short paper on JIT compilation.

What to skip on the first pass

  • LLVM. Wonderful for production work; obscures what you're trying to learn.
  • Full optimization passes (loop unrolling, vectorization). Stick to constant folding, dead-code, peephole.

5. Curated tutorial list (from BYO-X)

This corresponds to the "Programming Language" section, the higher-effort entries (interpreter-only is the previous tutorial):

  • Assembly: Jonesforth -- github.com/nornagon/jonesforth -- a self-hosting Forth in x86 assembly. Heavy stuff; for very ambitious learners.
  • C: Baby's First Garbage Collector -- Robert Nystrom -- the GC half of the compiler project.
  • C: Writing a Simple Garbage Collector in C
  • C: A C & x86 version of the "Let's Build a Compiler" by Jack Crenshaw -- Jack Crenshaw's original series (Pascal, 1988) and modern ports
  • C: A journey explaining how to build a compiler from scratch
  • C++: Writing Your Own Toy Compiler Using Flex
  • C++: How to Create a Compiler [video]
  • C++: Kaleidoscope: Implementing a Language with LLVM -- LLVM tutorial -- if you want to learn LLVM specifically
  • F#: Understanding Parser Combinators
  • Elixir: Demystifying compilers by writing your own [video]
  • Go: The Super Tiny Compiler -- jamiebuilds/the-super-tiny-compiler -- Lisp-to-C in 200 lines, heavily commented
  • Go: Lexical Scanning in Go [video] -- Rob Pike's talk
  • Haskell: Let's Build a Compiler, Write You a Haskell, Write Yourself a Scheme in 48 Hours, Write You A Scheme
  • Java: Crafting interpreters: A handbook for making programming languages -- Part III ⭐ recommended primary
  • Java: Creating JVM Language
  • JavaScript: The Super Tiny Compiler
  • OCaml: Writing a C Compiler -- Nora Sandler's book ⭐ for "compile to assembly" path
  • OCaml: Writing a Lisp, the series
  • Pascal: Let's Build a Compiler -- Jack Crenshaw's original
  • Python: From Source Code To Machine Code: Build Your Own Compiler From Scratch -- covers x86 codegen
  • Racket: Beautiful Racket: How to make your own programming languages with Racket -- beautifulracket.com
  • Ruby: A Compiler From Scratch
  • Rust: Learning Parser Combinators With Rust
  • TypeScript: Build your own WebAssembly Compiler

There are two clearly distinct paths. Pick one before starting.

Robert Nystrom, Crafting Interpreters, Part III (Java/C). You compile Lox to bytecode and execute on a stack-based VM you also build. Includes a mark-sweep GC, hash table, single-pass Pratt parser. ~2,500 lines of C.

This is the right path for most learners. You get a real, executable language with realistic performance.

Path B: Compile to native assembly

Nora Sandler, Writing a C Compiler (OCaml/Python). Compiles a subset of C to x86-64 assembly. Two-volume book. Very rigorous.

This is the right path if you want to engage with register allocation, calling conventions, and the actual machine. Substantially more work.

For this degree, Path A is the default. Path B is an excellent capstone-tier project if you have the time.


7. Implementation milestones (Path A: bytecode VM, following Nystrom Part III)

Milestone 1: Chunk and OpCodes

A chunk is a flat byte array of opcodes and operands. Operands index into a constant pool.

typedef enum {
OP_CONSTANT, // operand: constant index
OP_ADD, OP_SUBTRACT, OP_MULTIPLY, OP_DIVIDE,
OP_NEGATE,
OP_RETURN,
} OpCode;

typedef struct {
int count, capacity;
uint8_t *code;
int *lines;
ValueArray constants;
} Chunk;

Add a disassembler (OP_CONSTANT 1.0 ; line 5). You will use it constantly.

Evidence: Hand-emit bytecode for (1.2 + 3.4) / 5.6 and disassemble it.

Milestone 2: Stack-based VM

The VM has a value stack. Each opcode reads its operands from the stack and pushes its result.

typedef struct {
Chunk *chunk;
uint8_t *ip; // instruction pointer
Value stack[256];
Value *stack_top;
} VM;

InterpretResult run(VM *vm) {
for (;;) {
uint8_t instruction = *vm->ip++;
switch (instruction) {
case OP_CONSTANT: {
Value c = vm->chunk->constants.values[*vm->ip++];
push(vm, c);
break;
}
case OP_ADD: {
double b = AS_NUMBER(pop(vm));
double a = AS_NUMBER(pop(vm));
push(vm, NUMBER_VAL(a + b));
break;
}
// ...
case OP_RETURN: return INTERPRET_OK;
}
}
}

Evidence: VM executes the hand-emitted bytecode from Milestone 1 and prints the right answer.

Milestone 3: Single-pass compiler (Pratt parser)

Replace the AST-based parser from the interpreter with a single-pass parser that emits bytecode directly. Use a Pratt parser -- the cleanest way to handle operator precedence in one pass.

typedef enum {
PREC_NONE, PREC_ASSIGNMENT, PREC_OR, PREC_AND,
PREC_EQUALITY, PREC_COMPARISON, PREC_TERM,
PREC_FACTOR, PREC_UNARY, PREC_CALL, PREC_PRIMARY,
} Precedence;

typedef void (*ParseFn)();
typedef struct { ParseFn prefix; ParseFn infix; Precedence precedence; } ParseRule;

ParseRule rules[] = {
[TOKEN_LEFT_PAREN] = {grouping, NULL, PREC_NONE},
[TOKEN_MINUS] = {unary, binary, PREC_TERM},
[TOKEN_PLUS] = {NULL, binary, PREC_TERM},
[TOKEN_STAR] = {NULL, binary, PREC_FACTOR},
// ...
};

Evidence: Compile and run (1 + 2) * 3 - 4 end to end.

Milestone 4: Strings and constants

Add a string type. Intern strings so == is pointer equality. Add hash table for the symbol table.

Evidence: Compile and run print "hello" + " " + "world";.

Milestone 5: Variables and scope

Local variables live on the VM stack at compile-time-known offsets. Globals live in a hash table.

typedef struct { Token name; int depth; } Local;
typedef struct { Local locals[256]; int local_count; int scope_depth; } Compiler;

Evidence: Run a program with nested scopes and shadowing.

Milestone 6: Control flow (jumps)

if, while, for, and, or. Emit OP_JUMP_IF_FALSE with a placeholder offset, then back-patch the offset once you know the target.

int emit_jump(uint8_t instruction) {
emit_byte(instruction);
emit_byte(0xff); emit_byte(0xff);
return current_chunk()->count - 2;
}

void patch_jump(int offset) {
int jump = current_chunk()->count - offset - 2;
current_chunk()->code[offset] = (jump >> 8) & 0xff;
current_chunk()->code[offset + 1] = jump & 0xff;
}

Evidence: Run FizzBuzz.

Milestone 7: Functions and call frames

Each function compiles to its own chunk. Calls push a CallFrame containing the function and its slot of the value stack.

typedef struct { ObjFunction *function; uint8_t *ip; Value *slots; } CallFrame;

Evidence: Run the closure-counter example from the interpreter tutorial. Performance compared to the interpreter: 10-100x faster.

Milestone 8: Closures

Closures capture variables from enclosing scopes via upvalues. The hardest single chapter in Nystrom. Read it twice.

Milestone 9: Garbage collection (mark-sweep)

Tracing GC. Roots: VM stack, globals, open upvalues, current call frames. Mark: visit reachable objects. Sweep: free unmarked.

Nystrom's GC chapter is a small masterpiece. Read it before coding.

Evidence: Run an allocation-heavy program for a few seconds; show memory usage stays bounded.

Milestone 10 (optional): Classes, methods, inheritance

If you want a class-based language. Otherwise stop here.


8. Tests & evidence

TestHow
Correctness -- all interpreter testsEvery program that ran on the interpreter must run on the VM.
Bytecode disassemblyHand-verify that a small program produces the expected ops.
PerformanceFibonacci benchmark: VM should be 10-100x faster than the tree-walker.
GC correctnessLong-running test with lots of allocation must not OOM.
Stack overflowDeep recursion produces a clean error, not a crash.

9. Common pitfalls

  • Forgetting that single-pass compilation has no AST. You can't easily look ahead. Pratt parsing was invented to make this tractable.
  • Back-patching errors. Off-by-one mistakes in jump offsets. Use the disassembler.
  • GC during compilation. Strings allocated during parsing are garbage from the GC's perspective. Either pin them or run the GC only at safe points.
  • Forgetting to advance the IP. Every opcode that reads operands must advance past them.
  • Stack discipline broken. Every binary op pops 2 and pushes 1. If you ever leave the stack in a different state than expected, downstream code corrupts.
  • Mixing compile-time and runtime errors. Compile errors come from the parser. Runtime errors come from the VM. They have different error paths.
  • No locality of reference for the value array. The VM hot loop is a switch on opcodes -- compiler optimization opportunities matter (computed gotos, threading). For correctness, ignore. For performance, look up the literature.

10. Extensions

  • Threaded interpreter (computed goto). 20-50% speedup on Linux/Clang. Nystrom mentions it.
  • JIT compilation. Translate hot bytecode to native at runtime. See V8, LuaJIT.
  • Static types -- add a type-check pass after parsing.
  • Better GC -- generational, copying, or incremental. Nystrom does mark-sweep.
  • Register-based VM -- Lua 5+ uses one. Bytecode is denser; codegen is harder.
  • WebAssembly target -- emit .wasm. Modern, portable, fast.
  • Self-hosting -- implement your language's compiler in your language. The ultimate proof.

11. Module integration

ModuleWhat the compiler deepens
Sem 4 Module 1 -- C fundamentalsLargest C project in the curriculum (Nystrom's clox is ~2,500 lines).
Sem 4 Module 3 -- Computer organizationIf you do Path B (assembly target), you live in the target machine's calling conventions and registers.
Sem 4 Module 5 -- Abstraction & interpretationThe capstone project.
Interpreter tutorialDirect prerequisite.
Memory Allocator tutorialYour VM's allocator and GC are the natural use case.
Web Browser EngineMulti-pass pipeline architecture transfers.
Database (relational)SQL -> plan transformation is conceptually similar to source -> bytecode.

12. Portfolio framing

What to publish:

  • Source in src/ (lexer, compiler, vm, gc) with clear file-per-pass.
  • A tests/ folder of .lox programs and expected outputs.
  • A benchmarks/ folder comparing your VM to the tree-walker.
  • A README with a "performance journey" section: what was slow first, what optimizations you made, what's still slow.

Reviewer entry points:

  • src/vm.c -- the main run() interpreter loop.
  • src/compiler.c -- Pratt parser and bytecode emission.
  • src/gc.c -- mark-sweep collector.
  • README must include: language demo, benchmark numbers, list of known limitations.

This is a serious portfolio piece. A complete clox-equivalent is the kind of project that gets read in interviews.


13. Local source backbone

Use these local chunk sets to deepen the compiler path:

  • Writing a C Compiler (build-your-own/c-compiler-nora-sandler)
  • From Source Code to Machine Code (build-your-own/source-to-machine-code-james-smith)
  • Build Your Own Programming Language (build-your-own/programming-language-jeffery)
Local chunksUse them forAdd to this project
Sandler 002-007Minimal C compiler setup, lexing/parsing, AST, first codegen stepsAdd a "native compiler side quest" that compiles constants, unary ops, and simple returns.
Sandler 008-013Expressions, operators, precedence, and assembly outputAdd precedence and associativity tests that compare AST and generated assembly.
Sandler 014-024Control flow, conditionals, comparisons, and lower-level instruction detailsAdd branch-generation tests with expected labels and condition-code behavior.
Smith 001-008S-expressions, scopes, VM design, labels, function definitionsAdd a stack-VM alternative design note before choosing bytecode versus native code.
Smith 009-017x64 assembly, calls/returns, mmap/ctypes, executable layout, pointers, stringsAdd a native-runtime extension for calls, stack frames, and simple heap/string operations.
Jeffery 061-066Intermediate code, labels, temporaries, control-flow codegenRequire an IR dump between AST and bytecode/native code.
Jeffery 084-101Bytecode file format, bytecode interpreter, bytecode generation, native code generationAdd explicit bytecode-format documentation and a disassembler test suite.
Jeffery 112-117Reference counting, heap regions, mark/live-data traversal, reclaiming memoryExpand GC evidence beyond "does not crash": allocation trace, reachable set, and collection report.

Extra checkpoints from the book chunks

  1. IR checkpoint: compile a program to AST, IR, bytecode/assembly, and final output; save all four artifacts.
  2. Calling-convention checkpoint: document stack frame layout for one function call with two parameters and one local.
  3. Branch checkpoint: generate labels and jumps for if, while, &&, and ||; include negative tests.
  4. Runtime checkpoint: run an allocation-heavy program and prove the collector or ownership strategy reclaims memory.

14. Deep project spec

Project contract

Build a compiler for the interpreter language or a documented subset. The default target is bytecode for a stack VM. The advanced target is native assembly or executable bytes. The compiler must define source grammar, bytecode or IR format, call-frame layout, constants, globals/locals, and error boundaries between compile time and runtime.

Source-backed reading map

Source IDUse forRequired output
build-your-own/c-compiler-nora-sandlernative-code staging, C subset, x64 control flowassembly fixtures and calling-convention note
build-your-own/source-to-machine-code-james-smithIR, virtual instructions, machine-code transitionIR dump and native-code checkpoint
build-your-own/programming-language-jefferyparser/analyzer foundationsfront-end consistency with interpreter

Milestone map

MilestoneDeliverableTestsFailure case
Bytecode chunkopcode format, constant pool, disassemblergolden disassemblyinvalid opcode trap
VMstack execution for arithmetichand-written bytecode fixturesstack underflow
Compiler front endsource to bytecodeexpression and statement fixturessyntax error reports source span
Variables/control flowlocals, globals, jumps, loopsFizzBuzz and nested-scope testsjump patching error caught
Functionscall frames, returns, recursionfactorial and closure-counter fixturesarity mismatch
GC or memory policymark-sweep or explicit limitationallocation stress testleaked/unbounded allocation documented
Native checkpointarithmetic or return compiled to assembly/bytescompile-run fixturewrong stack alignment or ABI violation

Test matrix

Test typeRequired examples
Goldenbytecode disassembly and generated assembly snapshots
Runtimecompiled output equals interpreter output
Differentialsame source through interpreter and compiler
Fuzz/propertyrandom valid expressions compile without VM crash
Benchmarkinterpreter vs bytecode VM on loops/functions

Design notes required

  • bytecode.md: opcode table, operand widths, stack effect, invalid-op behavior.
  • vm.md: stack layout, call frames, globals, closures, memory ownership.
  • codegen.md: what is compiled directly, what remains runtime helper logic.
  • native-checkpoint.md: target OS/ABI, registers used, stack alignment, return convention.

Portfolio evidence

Publish one command that prints source, bytecode, and output; one benchmark table against the interpreter; one disassembler transcript; and one documented compiler bug fixed by a regression test.


Source

This tutorial draws from the BYO-X catalog "Programming Language" section. Crafting Interpreters Part III is the canonical reference for the bytecode VM path; Nora Sandler's book is the canonical reference for native compilation.