Skip to main content

The Substitution Model vs the Environment Model

What This Concept Is

We need a way to predict what a program will do. SICP introduces two:

  • Substitution model (§1.1.5). To evaluate (f 7), substitute 7 for the formal parameter everywhere in the body of f, then keep reducing. This works perfectly for a pure functional language: no mutation, no environments, just term rewriting.
  • Environment model (§3.2). To evaluate (f 7), create a new frame that binds the parameter to 7, extend the enclosing environment with that frame, and evaluate the body in the extended environment. Variable lookup walks the chain of frames.

They agree on the value every pure program produces. They stop agreeing the moment mutation appears (set!, Python =, count += 1): substitution has no story for "a variable whose binding can change," but the environment model does -- a frame entry simply gets a new value.

Why It Matters Here

This is the single highest-leverage concept of the module for everything downstream:

  • Closures (Concept 05) are "environment model" plus "procedures as values." A closure is a pair of code and environment pointer.
  • Tail calls (Concept 09) are meaningful only in the environment model; they are about not growing the frame stack.
  • The interpreter (Concepts 10-11) is literally a procedure that walks source expressions while maintaining an explicit environment.
  • Mutation (Concept 13) breaks substitution, which is why you need environments even to explain count += 1.

If you cannot trace a small program in both models and spot where they disagree, the rest of this cluster is floating.

Concrete Example

Substitution, purely functional:

(define (square x) (* x x))
(define (sum-of-squares x y) (+ (square x) (square y)))
(sum-of-squares 3 4)
; -> (+ (square 3) (square 4))
; -> (+ (* 3 3) (* 4 4))
; -> (+ 9 16)
; -> 25

Each step replaces a call by its body, then replaces formal parameters by the argument values. No frames appear.

Environment model, same program:

GLOBAL: { square:<code,GLOBAL>, sum-of-squares:<code,GLOBAL> }

call (sum-of-squares 3 4)
new frame F1: { x:3, y:4 }, parent = GLOBAL
eval body in F1
call (square 3)
new frame F2: { x:3 }, parent = GLOBAL
eval (* x x) in F2 -> 9
call (square 4)
new frame F3: { x:4 }, parent = GLOBAL
eval (* x x) in F3 -> 16
eval (+ 9 16) -> 25

Both give 25. Now introduce mutation -- the models diverge:

(define (make-counter)
(let ((n 0))
(lambda () (set! n (+ n 1)) n)))
(define c (make-counter))
(c) (c) ; -> 1 then 2

Substitution cannot represent c: there is no value you can substitute for c that produces different results on successive calls. The environment model explains it easily: c is a closure pointing at a frame whose n is being mutated.

Common Confusion / Misconception

  1. "Substitution is wrong." No -- substitution is correct, but partial. Every pure expression evaluated in substitution evaluates the same way in the environment model. Substitution only fails to cover mutation and shared state.
  2. Confusing frames with stack frames. Environment frames are any activation record created when evaluating a procedure body; they may outlive the C-style stack frame if a closure captures them. Stack and environment are different axes.
  3. "Lexical scope means variables live in the stack." No -- they live in frames chained by definition order, not call order. The chain is built at definition time; the C stack is built at call time.

How To Use It

  1. Trace small programs by hand in both models until they feel routine.
  2. When teaching or debugging, start with substitution; switch to environments the moment mutation or a captured free variable appears.
  3. When designing, prefer writing code that can be explained in substitution -- it almost always means fewer surprising interactions.
  4. When reasoning about a closure bug, draw the frame diagram: what frames exist, who points at whom, what is in each frame.

Check Yourself

  1. Give a one-line program whose output the substitution model cannot predict.
  2. Trace (sum-of-squares (+ 1 2) (* 3 4)) in the substitution model, step by step.
  3. In the environment model, what exactly is a "procedure value"?

Mini Drill or Application

Take this program and trace it in both models:

(define (make-adder n) (lambda (x) (+ x n)))
(define a5 (make-adder 5))
(define a7 (make-adder 7))
(+ (a5 10) (a7 10))
  • In the substitution model, reduce until you cannot.
  • In the environment model, draw the frames (GLOBAL, one frame per make-adder call, one per inner lambda call). Identify which frames are alive at the moment the final + runs.

One of the quiz questions for this module is drawn directly from exercises like this.

Read This Only If Stuck