Skip to main content

Build Your Own Interpreter (Tree-Walking)

"A programming language is a structured way to express computation. An interpreter is a program that takes such expression and produces the computation."

A tree-walking interpreter is the cleanest entry point into language implementation. You'll build a lexer, a parser, an AST, an evaluator, and a small but real programming language with variables, functions, closures, and control flow -- in roughly 1,500 lines of code.

Then in the Compiler tutorial, you'll graduate to bytecode and a virtual machine.


1. Overview & motivation

An interpreter is a pipeline:

source text -> [lexer] -> tokens -> [parser] -> AST -> [evaluator] -> result

You'll implement each stage. The reward is enormous: programming languages stop being magic. You will read other languages' source code with much more comfort.

What you can only learn by building one:

  • Why operator precedence is a real implementation problem (Pratt parsing or recursive descent with binding powers).
  • Why closures require careful environment design -- they are functions that capture their lexical scope.
  • Why first-class functions simplify the language dramatically.
  • Why tree-walking is too slow for real use, motivating the bytecode VM in the Compiler tutorial.
  • Why most "scripting languages" share the same fundamental shape under the hood.

2. Where this fits in the degree

  • Phase: Systems
  • Semester: 4 (Systems Programming)
  • Modules deepened: Module 1 (C/Go/Rust fundamentals) -- large project with multiple files. Module 5 (abstraction & interpretation) -- this is the module's project.

Cross-phase relevance:


3. Prerequisites

  • Comfort with a single language deeply (Go, Rust, C, OCaml, or Java).
  • Recursive data structures: trees, especially.
  • Basic familiarity with grammars (BNF/EBNF). Optional.

You do not need a compilers course. Both recommended tutorials are self-contained.


4. Theory & research

Required reading

  • Robert Nystrom, Crafting Interpreters -- craftinginterpreters.com (free, also in print). Part II (Java tree-walking) is exactly this project. ⭐ the canonical text.
  • Thorsten Ball, Writing An Interpreter In Go -- interpreterbook.com. The Go counterpart. Smaller, more concentrated. Read either.
  • Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools ("Dragon Book") -- Chapters 3 (lexical analysis), 4 (parsing). Reference depth, less readable than Nystrom/Ball.
  • Pratt, "Top Down Operator Precedence" (1973) -- the original short paper on Pratt parsing. Lucid.
  • Crockford, "Top Down Operator Precedence" (crockford.com/javascript/tdop/tdop.html) -- the modern JavaScript-flavored re-explanation.

Optional but illuminating

  • Friedman & Wand, Essentials of Programming Languages -- the textbook for the "small interpreter for many language features" approach. Especially good on environments and closures.
  • Steele & Sussman, "The Art of the Interpreter" (1978 MIT memo) -- historic. Beautiful.

5. Curated tutorial list (from BYO-X)

This corresponds to the "Programming Language" section of the BYO-X catalog (lower-effort, interpreter-focused entries -- full compilers are in the next tutorial):

  • (any): mal - Make a Lisp -- github.com/kanaka/mal -- 80+ languages, 16 incremental steps. Excellent template.
  • C: Build Your Own Lisp: Learn C and build your own programming language in 1000 lines of code -- Daniel Holden, buildyourownlisp.com
  • Java: Crafting interpreters: A handbook for making programming languages -- Robert Nystrom ⭐ recommended primary
  • JavaScript: The Super Tiny Interpreter -- the-super-tiny-interpreter
  • JavaScript: Little Lisp interpreter -- Mary Rose Cook
  • JavaScript: How to implement a programming language in JavaScript -- Mihai Bazon
  • JavaScript: Let's go write a Lisp
  • Python: A Python Interpreter Written in Python -- Allison Kaptur
  • Python: lisp.py: Make your own Lisp interpreter
  • Python: How to Write a Lisp Interpreter in Python -- Peter Norvig, lispy -- 90 lines. The single most-loved tiny interpreter on the internet.
  • Python: Let's Build A Simple Interpreter -- Ruslan Spivak
  • Python: Make Your Own Simple Interpreted Programming Language [video]
  • Ruby: Markdown compiler from scratch in Ruby
  • Swift: Building a LISP from scratch with Swift

Robert Nystrom, Crafting Interpreters, Part II (Java). Builds a complete tree-walking interpreter for "Lox" -- a small dynamic language with variables, functions, closures, classes, inheritance. About 1,500 lines of Java.

Read it linearly. Each chapter ends with a working interpreter that does more than the previous.

For a Go version with the same shape: Thorsten Ball, Writing An Interpreter In Go. Shorter language (no classes), and Ball makes Pratt parsing the centerpiece -- excellent treatment.

For 90 lines of Lisp: Norvig's lispy. Read it after the bigger book to appreciate the contrast.

For 10-20 hours of "do it again, your way": Make a Lisp (mal). 16 incremental steps, each a Git tag. Implement Lisp in your favorite language.


7. Implementation milestones

(Lox language, following Nystrom. Translate to your language of choice.)

Milestone 1: Lexer

Turn source text into a stream of tokens.

enum TokenType {
LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE,
COMMA, DOT, MINUS, PLUS, SEMICOLON, SLASH, STAR,
BANG, BANG_EQUAL, EQUAL, EQUAL_EQUAL,
GREATER, GREATER_EQUAL, LESS, LESS_EQUAL,
IDENTIFIER, STRING, NUMBER,
AND, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,
PRINT, RETURN, SUPER, THIS, TRUE, VAR, WHILE,
EOF
}

class Token {
final TokenType type;
final String lexeme;
final Object literal;
final int line;
}

class Scanner {
void scanToken() {
char c = advance();
switch (c) {
case '(': addToken(LEFT_PAREN); break;
case ')': addToken(RIGHT_PAREN); break;
case '!': addToken(match('=') ? BANG_EQUAL : BANG); break;
// ...
default:
if (isDigit(c)) number();
else if (isAlpha(c)) identifier();
else error("Unexpected character.");
}
}
}

Evidence: Tokenize var x = 1 + 2; into the expected 7 tokens.

Milestone 2: Parser (recursive descent)

Build an AST from tokens. Use recursive descent with one function per grammar rule.

expression -> equality ;
equality -> comparison ( ( "!=" | "==" ) comparison )* ;
comparison -> term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
term -> factor ( ( "-" | "+" ) factor )* ;
factor -> unary ( ( "/" | "*" ) unary )* ;
unary -> ( "!" | "-" ) unary | primary ;
primary -> NUMBER | STRING | "true" | "false" | "nil" | "(" expression ")" ;
private Expr expression() { return equality(); }
private Expr equality() {
Expr expr = comparison();
while (match(BANG_EQUAL, EQUAL_EQUAL)) {
Token op = previous();
Expr right = comparison();
expr = new Binary(expr, op, right);
}
return expr;
}

Evidence: Parse 1 + 2 * 3 and (1 + 2) * 3 correctly -- operator precedence works.

Milestone 3: Tree-walking evaluator

Visit the AST. Compute results.

class Interpreter implements Expr.Visitor<Object> {
public Object visitBinaryExpr(Binary expr) {
Object left = evaluate(expr.left);
Object right = evaluate(expr.right);
switch (expr.operator.type) {
case MINUS: return (double)left - (double)right;
case PLUS:
if (left instanceof Double) return (double)left + (double)right;
if (left instanceof String) return (String)left + (String)right;
throw new RuntimeError(...);
case SLASH: return (double)left / (double)right;
case STAR: return (double)left * (double)right;
// ...
}
}
}

Evidence: Evaluate (3 + 4) * (5 - 1) -> 28.

Milestone 4: Statements, variables, assignment

Add statements (print, expression statement, variable declaration). Add an environment for variables.

class Environment {
private final Map<String, Object> values = new HashMap<>();
private final Environment enclosing;

Object get(Token name) {
if (values.containsKey(name.lexeme)) return values.get(name.lexeme);
if (enclosing != null) return enclosing.get(name);
throw new RuntimeError(name, "Undefined variable '" + name.lexeme + "'.");
}
}

Evidence: Run var x = 5; print x; x = x + 1; print x; -> 5\n6\n.

Milestone 5: Control flow

if/else, while, for (desugared to while). Logical and/or with short-circuit evaluation.

Evidence: Run a FizzBuzz program.

Milestone 6: Functions and closures

Functions are first-class. They capture their enclosing environment, becoming closures.

class LoxFunction implements LoxCallable {
private final Stmt.Function declaration;
private final Environment closure;

public Object call(Interpreter interpreter, List<Object> arguments) {
Environment env = new Environment(closure);
for (int i = 0; i < declaration.params.size(); i++)
env.define(declaration.params.get(i).lexeme, arguments.get(i));
try {
interpreter.executeBlock(declaration.body, env);
} catch (Return r) {
return r.value;
}
return null;
}
}

Evidence: Run the classic counter-closure:

fun makeCounter() {
var count = 0;
fun increment() { count = count + 1; return count; }
return increment;
}
var c = makeCounter();
print c(); print c(); print c(); // 1 2 3

Milestone 7 (optional): Classes

Inheritance, methods, this. Nystrom covers this; Ball does not. Skip if you'd rather move on to the Compiler tutorial.


8. Tests & evidence

TestHow
LexerEach token type matched correctly; line numbers preserved on errors.
Operator precedence1 + 2 * 3 = 7, not 9.
Associativity1 - 2 - 3 = -4 (left), not 2.
Variable scopeVariables defined in a block do not leak.
Closure capturemakeCounter test (above).
Recursionfun fib(n) { ... } computes fib(10) = 55.
Error reportingSyntax errors show line number and an explanation, not just "SyntaxError".
Runtime errorsDivide by zero, undefined variable, calling non-callable.
PerformanceThe infamous Fibonacci benchmark. Tree-walking will be 10-100x slower than a real VM. Document this -- it motivates the compiler.

9. Common pitfalls

  • Treating the parser as one big function. Each grammar rule should be one function. Otherwise you fight precedence forever.
  • Mixing parsing and evaluation. Parser produces AST; evaluator visits AST. Don't fold them.
  • Closures without scope chains. A closure captures the environment, not the value. Walking back through environments is what makes the counter example work.
  • String concatenation conflated with addition. Decide your semantics. Either you support both with type checks, or only one of them.
  • Forgetting return. Use exceptions or sentinel values to unwind out of function bodies. Don't return from inside the visitor pattern naively.
  • Re-evaluating expressions you've already evaluated. Especially in or/and -- these need short-circuit semantics.
  • No environment chain. Without nested environments, blocks share variables. Nesting is mandatory.

10. Extensions

  • Static analysis pass. Resolver for variable bindings. Nystrom covers this in Chapter 11.
  • Lambdas / arrow functions. Anonymous function syntax. Trivial extension once functions work.
  • Pattern matching. Bigger language change. Look at OCaml or Rust pattern syntax.
  • Tail-call optimization. Required for recursive functional style.
  • Macros. Lisp-style. mal covers macros in step 8.
  • Run a real benchmark. Compare your interpreter to Python on a small numeric workload.

The natural next step is the Compiler tutorial -- Nystrom's Part III takes the same Lox language and compiles it to bytecode for 10-100x speedup.


11. Module integration

ModuleWhat the interpreter deepens
Sem 4 Module 1 -- C/Go/Rust fundamentalsLargest single project in the systems phase; forces good code organization.
Sem 4 Module 5 -- Abstraction & interpretationThe whole module exists for this project.
Sem 7 Module 2 -- Architecture & modularityLexer/parser/AST/evaluator are textbook modular boundaries.
Compiler tutorialSame source language, replace evaluator with bytecode VM.
Web Browser EngineHTML and CSS need parsers. Same Pratt-parsing technique.
Search EngineQuery parsing reuses recursive descent.

12. Portfolio framing

What to publish:

  • Source organized as src/lexer/, src/parser/, src/eval/, src/runtime/.
  • A tests/ folder with .lox files for each language feature.
  • A REPL -- ./interpreter with no args drops you into an interactive prompt.
  • A README with a "show, don't tell" demo: a Fibonacci function, a closure example, an inheritance example.

Reviewer entry points:

  • src/lexer/scanner.c -- tokenizer.
  • src/parser/parser.c -- grammar rule per function.
  • src/eval/interpreter.c -- the visitor pattern.
  • tests/closure_test.lox -- the closure example shows the design works.
  • README: include a note on tree-walking performance and a pointer to your compiler project (if you do one).

13. Local source backbone

Use Build Your Own Programming Language (build-your-own/programming-language-jeffery) to widen the interpreter project from "tree walker" into a complete language-front-end study.

Local chunksUse them forAdd to this project
004-008Why build a language, requirements, and language-design choicesStart with a language design memo: purpose, audience, data types, syntax style, and non-goals.
009-023Tokens, lexical attributes, regex rules, scanner construction, and scanner testingExpand Milestone 1 with a token-spec document and scanner error tests.
024-032Context-free grammars, parser conflicts, syntax-error recoveryAdd grammar-first parser work: BNF, invalid-program tests, and better parse errors.
033-043Syntax trees, parse trees versus ASTs, tree printing, DOT visualizationAdd AST visualization as required evidence.
044-060Declarations, scopes, symbol tables, base types, arrays, calls, returns, instance accessAdd a semantic-analysis layer before evaluation. Even a dynamic interpreter benefits from explicit scope and binding checks.
067-073Syntax highlighting and IDE supportOptional extension: VS Code grammar and a tiny language-server diagnostic command.

Extra checkpoints from the book chunks

  1. Language brief: define the smallest useful language and name at least three features intentionally excluded.
  2. Scanner contract: for every token class, include one valid example and one invalid near-miss.
  3. Parser recovery: show that one syntax error does not hide every later error in the file.
  4. Scope audit: include tests for shadowing, unbound names, duplicate declarations, and closure capture.

14. Deep project spec

Project contract

Build a small interpreted language with lexical scoping, expressions, statements, variables, functions, closures, and a deterministic error model. The language does not need classes, modules, or a standard library on the first pass. It does need a written grammar, AST shape, resolver or symbol-table story, and REPL or file runner.

Source-backed reading map

Source IDUse forRequired output
build-your-own/programming-language-jefferyscanner, parser, grammar, AST, semantic analysis, runtime architecturegrammar note, AST dump, resolver tests
build-your-own/programming-language-affanlexer automata, DFA/NFA thinking, tokenization edge casestokenizer fixture suite and invalid-token cases

Milestone map

MilestoneDeliverableTestsFailure case
Lexertoken stream with source spansgolden token snapshotsunknown character, unterminated string
ParserAST for expressions and statementsAST snapshots for precedence and groupingmalformed expression recovery
Evaluatorarithmetic, booleans, variablesinterpreter output fixturestype error with line number
Scope resolverlexical scope and closuresshadowing and closure fixturesundefined variable before runtime
Functionscall frames and return valuesrecursion and higher-order function testsarity mismatch
REPL/file runnerusable CLIcommand-level smoke testssyntax error does not crash process

Test matrix

Test typeRequired examples
Goldentokens, AST, and program output fixtures
Semanticundefined names, duplicate declarations, invalid assignment
Runtimetype errors, arity errors, recursion depth behavior
Differentialcompare compiled examples against a simple reference evaluator where possible
Propertyparser never panics on random token streams

Design notes required

  • grammar.md: grammar, precedence table, and rejected syntax.
  • runtime.md: value representation, environment chain, function/closure model.
  • errors.md: syntax, semantic, and runtime error categories.

Portfolio evidence

Publish the grammar, three small programs, token/AST dumps for one program, resolver tests, and a short trace showing one function call entering and leaving scope.


Source

This tutorial draws from the BYO-X catalog "Programming Language" section. Crafting Interpreters is the canonical reference; Writing An Interpreter In Go is the alternative for Go fans.