Skip to main content

Writing a Small Lisp / Scheme Interpreter

What This Concept Is

This is Concept 10 made concrete. You write, by hand, a tiny working interpreter for a Lisp-like subset. The subset is small on purpose:

  • atoms (numbers, symbols, strings)
  • special forms: quote, if, define, set!, lambda, begin
  • procedure application
  • a handful of primitives: +, -, *, /, =, <, car, cdr, cons, null?, list, print

That is enough to run recursive definitions, higher-order procedures, closures, and even a toy metacircular eval. Norvig's lis.py fits the core evaluator in roughly 30 lines of Python. SICP builds the same interpreter in Scheme across §4.1.1-§4.1.4. Nystrom's Crafting Interpreters builds a larger version in Java and C as a full teaching text.

Why It Matters Here

Until you have written one, every concept in this module is a story. After you have written one:

  • you can point at the exact line where a closure is created
  • you can feel why if must be a special form
  • you can debug a production eval-heavy system (template engines, query DSLs) with the right mental model
  • you can read Chapter 5 of SICP without flinching

It also fails in instructive ways. The common failures are lessons in themselves: infinite loops in eval, wrong environment on a closure, forgetting to evaluate arguments, forgetting a base case.

Concrete Example

A minimal evaluator in Python, 30-50 lines, adapted from Norvig's lispy:

Symbol = str
Env = dict

def standard_env():
import math, operator as op
env = {
'+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv,
'=': op.eq, '<': op.lt, '>': op.gt,
'car': lambda x: x[0], 'cdr': lambda x: x[1:],
'cons': lambda a, b: [a] + list(b),
'null?': lambda x: x == [],
'list': lambda *a: list(a),
'print': print,
}
return env

class Procedure:
def __init__(self, params, body, env):
self.params, self.body, self.env = params, body, env
def __call__(self, *args):
local = dict(zip(self.params, args))
return eval_(self.body, ChainMap(local, self.env))

from collections import ChainMap

def eval_(x, env):
if isinstance(x, Symbol): # variable reference
return env[x]
if not isinstance(x, list): # number literal
return x
op, *args = x
if op == 'quote': return args[0]
if op == 'if':
test, conseq, alt = args
return eval_(conseq if eval_(test, env) else alt, env)
if op == 'define':
name, value_exp = args
env[name] = eval_(value_exp, env)
return None
if op == 'set!':
name, value_exp = args
env[name] = eval_(value_exp, env) # simplified scope
return None
if op == 'lambda':
params, body = args
return Procedure(params, body, env)
if op == 'begin':
result = None
for e in args: result = eval_(e, env)
return result
proc = eval_(op, env) # application
vals = [eval_(a, env) for a in args]
return proc(*vals)

With a tiny reader (tokenize -> parse) -- another 15 lines -- this evaluator runs:

(define fact
(lambda (n) (if (= n 0) 1 (* n (fact (- n 1))))))
(fact 5) ; => 120

and higher-order code:

(define make-adder (lambda (n) (lambda (x) (+ x n))))
(define add5 (make-adder 5))
(add5 10) ; => 15

Common Confusion / Misconception

  1. "Parse and eval are the same step." They are not. The reader turns text into nested lists; the evaluator works on those lists. Keep the two cleanly separated -- Norvig's tutorial is explicit about this and it pays off.
  2. Environment is a dict. For a toy it is fine. But scope rules (closures; lexical) require that each call extend the defining environment with a new frame. ChainMap or a custom Env(parent) class implements this clearly; a single flat dict gives dynamic scope by accident.
  3. Forgetting that lambda captures the environment. A common bug: store body and params but not env. The result silently becomes dynamic-scoped and all closure examples break.
  4. Evaluating if's both branches. The quiz will punish this one. if must eval_(test) first and then recurse on exactly one branch.
  5. Skipping quote. Once added, you can test the interpreter by writing (quote (1 2 3)) and making sure you get a list back without evaluating it as a call.

How To Use It

Follow this exact staging (same sequence Crafting Interpreters and lispy both take):

  1. Tokenizer + parser. Text -> list structure only. Test on literals and nested expressions.
  2. Evaluator for atoms and arithmetic. Numbers, symbols, + - * /.
  3. if and define. The first control flow and environment mutation.
  4. lambda and application. This is the moment closures arrive. Test make-adder.
  5. Lexical scope correctness. Write a test that would fail under dynamic scope (a closure over a captured variable shadowed in the caller). Make sure it passes.
  6. Tail-call handling (ambitious stretch): convert the evaluator loop to avoid recursion on tail calls (trampoline or while loop around if). Without it you'll blow Python's stack on deep recursion.

Check Yourself

  1. Why do you write the tokenizer and parser before the evaluator?
  2. What single field in your Procedure representation enforces lexical scope?
  3. What test would expose an evaluator that accidentally uses dynamic scope?

Mini Drill or Application

Implement the full 30-50 line evaluator above. Then make it pass these programs:

  • (define square (lambda (x) (* x x))) (square 7)
  • (define make-counter (lambda () (define n 0) (lambda () (set! n (+ n 1)) n))) (define c (make-counter)) (c) (c) (c) -- hint: requires nested define to work in a begin.
  • (define compose (lambda (f g) (lambda (x) (f (g x))))) ((compose car cdr) (list 1 2 3))

Log every bug you hit. At least two of them will almost certainly be scoping bugs -- this is valuable.

Read This Only If Stuck