Module 5: Abstraction & Interpretation

Primary text: Structure and Interpretation of Computer Programs (Abelson, Sussman, Sussman) Contrasts: The C Programming Language (K&R), Computer Organization and Design (Patterson & Hennessy), CODE (Petzold) External selective reading: Peter Norvig's (How to Write a (Lisp) Interpreter (in Python)) (lis.py), Bob Nystrom's Crafting Interpreters

This guide is the primary teacher. You do not need to read SICP cover-to-cover to complete this module. You do need to build an internalized model of how abstraction, evaluation, and interpretation relate -- the model Scheme and Lisp have given the rest of the industry.

Scope of This Module

This is the module where everything in Semester 4 stops being about C-level machine detail and starts being about the linguistic choices that engineers have to make. C gave us pointers and memory; SICP gives us evaluation, environments, and the programs-as-data idea.

What it covers in depth:

procedural abstraction: black-box procedures as the most basic engineering move
data abstraction, abstraction barriers, and the design of abstract data types
layering abstractions to keep large systems readable over years
first-class procedures, higher-order functions, closures, lexical scope
the substitution model and the environment model as two ways to predict behavior
applicative-order vs normal-order evaluation and where laziness helps
tail calls and the Scheme claim that iteration is disguised recursion
eval and apply, writing a tiny Lisp interpreter, and the staging move toward compilation
assignment, the cost of mutation, streams as a pure alternative, and concurrency

What it deliberately does not try to finish here:

full compiler construction and optimization passes
concurrent-system design (returns in Semester 5/6)
type systems, effects, and advanced programming-language theory
SICM's physics or the full logic-programming chapter of SICP

This is a language-and-interpretation module. You should finish it able to read any dispatcher-shaped piece of code (SQL planners, template engines, shell interpreters, DSL evaluators) and see it as a small eval / apply.

Before You Start (Warm-Up)

Answer these closed-book. No looking up.

In one sentence, what is the difference between a procedural abstraction and a data abstraction?
Why do Scheme and Python have closures, but C does not (directly)?
Give a one-line program whose output the substitution model cannot predict.
What is a tail call, and why does Scheme make it a language requirement?
What do eval and apply do, and why is the separation valuable?

Diagnostic Interpretation

4-5 solid answers -- you are ready for the full path.

2-3 solid answers -- continue, but expect extra time in Clusters 3 and 4 (evaluation models, interpreter).

0-1 solid answers -- revisit S4M1 (C fundamentals), S4M4 (systems-level programming), and S3's design clusters first. This module assumes you can already read C and Python comfortably.

What This Module Is For

This module shows up every time you are asked questions like:

why is this DSL so slow, and what would make it fast?
why does my callback see the wrong value of i?
why does this recursive function overflow in Python but not in Scheme?
what does my ORM do when I call .filter(...).map(...)?
what's actually in a closure, and why do I keep leaking memory in this cache?
how do I think about a shared mutable map across threads?

It builds the judgment needed for:

designing and evaluating small DSLs and rule engines
reading the source of a Python, Racket, or JavaScript runtime
making informed language choices for a new service
understanding why certain styles (functional, reactive, actor-based) exist

You are learning to think about languages, not to memorize Scheme syntax.

Concept Map

How To Use This Module

Work in order. Each cluster depends on the ones before it; the interpreter in Cluster 4 silently uses every idea from Clusters 1-3.

Cluster 1: Abstraction as the Engineering Move

Order	Concept	Type	Focus
1	Procedural Abstraction: Procedures as Black Boxes	PRIMARY	Naming computation and hiding the body
2	Data Abstraction and Abstract Data Types	PRIMARY	Constructors, selectors, abstraction barriers
3	Layering Abstractions to Tame Complexity	PRIMARY	Stratified design, multiple representations

Cluster mastery check: Can you describe, for some existing codebase, at least three abstraction barriers and one place where a barrier has leaked?

Cluster 2: Higher-Order Procedures and Closures

Order	Concept	Type	Focus
4	First-Class Procedures and Higher-Order Functions	PRIMARY	Procedures as values, arguments, return values
5	Closures, Lexical Scope, and Captured Environments	PRIMARY	Code + captured frame, lexical scope
6	Function Composition, `map`, `filter`, `reduce` as Abstractions	SUPPORTING	Pipelines over sequences

Cluster mastery check: Can you explain and reproduce the Python loop-variable closure bug, and fix it?

Cluster 3: Evaluation Models

Order	Concept	Type	Focus
7	The Substitution Model vs the Environment Model	PRIMARY	Two ways to predict behavior; where they disagree
8	Applicative-Order vs Normal-Order Evaluation	SUPPORTING	When arguments evaluate, and why `if` is special
9	Tail Calls and Iteration as Disguised Recursion	PRIMARY	Recursive procedure vs iterative process; TCO

Cluster mastery check: Can you trace a closure-using program through both substitution and the environment model, and point to where substitution breaks?

Cluster 4: Metacircular and Interpreter Design

Order	Concept	Type	Focus
10	What an Interpreter Is: `eval` and `apply`	PRIMARY	The two-procedure skeleton of any evaluator
11	Writing a Small Lisp / Scheme Interpreter	PRIMARY	Build one, by hand, in 30-50 lines
12	From Interpretation to Compilation: Staging, IR	SUPPORTING	Separating analysis from execution; bytecode sketch

Cluster mastery check: Can you write an evaluator that runs (define fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))) (fact 5) correctly, and explain which line is responsible for lexical scope?

Cluster 5: State, Mutation, and Language Design

Order	Concept	Type	Focus
13	Introducing Assignment: the Cost of Mutation	PRIMARY	What `set!` breaks and what it buys
14	Streams, Laziness, and Infinite Data Structures	SUPPORTING	Pure alternative to mutable time
15	Concurrency and Language-Level Abstractions over Mutation	SUPPORTING	Locks, atomics, actors, immutability

Cluster mastery check: Can you describe a workload where immutable data + an event loop is strictly better than shared mutable state + locks, and one where the reverse is true?

Then work through the practice pages:

Order	Practice path	Focus
1	Abstraction and HOF Lab	Build a small numerical library in Scheme/Python using HOFs and closures
2	Evaluation Models Workshop	Trace programs in both substitution and environment models
3	Interpreter Construction Clinic	Grow the tiny Lisp interpreter with extra forms and primitives
4	Code Katas	`map`/`filter`/`reduce` in C and Scheme; `eval`/`apply` in Python or C; tail-call iteration

Use Module Quiz after concepts and practice. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.

Learning Objectives

By the end of this module you should be able to:

Explain and apply procedural and data abstraction in Scheme, Python, and C, and identify when a barrier has leaked.
Describe stratified design and name the layers of a real codebase you work in.
Use first-class procedures to express general methods; write, read, and debug closures.
Use map, filter, reduce, and compose fluently, and know when not to use them.
Trace programs by hand in both the substitution and environment models and identify where they agree and disagree.
Distinguish applicative-order and normal-order evaluation; explain the role of special forms; know what delay/force do.
Distinguish a recursive procedure from a recursive process, and write tail-recursive code that runs in constant space.
Implement a tiny Lisp interpreter with eval and apply in 30-50 lines of Python or Scheme, supporting if, lambda, define, set!, begin, and application.
Describe the spectrum from tree-walking interpreter to bytecode VM to compiler, and name at least two real systems at each rung.
Reason about mutation, streams as a pure alternative, and the main language-level abstractions over shared mutable state.

Outputs

one worked set of traces of the same program in substitution and environment models
one small numerical library in Scheme using sum, product, integral, and fixed-point, plus a Python port
one 30-50 line tree-walking Lisp interpreter that passes the tests in Practice 3
one analyzed (staged) version of the same interpreter with a measured speedup
one kata log implementing map, filter, reduce in both C and Scheme
one written comparison of four concurrency strategies (lock, per-item lock, CAS, actor) for a shared-inventory example
one mistake log with at least 8 real bugs such as closure over loop variable, if evaluated both branches, forgot to memoize delay, used the caller's env not the definer's, non-tail recursion killed my Python stack
one short memo on how the habits of this module carry into S5 OS/networking and S6 distributed systems

Completion Standard

You have completed Module 5 when all of these are true:

you can trace a closure-using program in both evaluation models
you can write and debug a tail-recursive Scheme loop and explain its space complexity
you can build, from a blank file, a Lisp evaluator that handles if, lambda, define, set!, and application with correct lexical scope
you can diagnose a subtle closure or scope bug in Python or JavaScript
you can describe, for a real system you use, where in the interpretation-compilation spectrum it sits
you can talk about mutation cost with precision instead of slogans
you can pick a reasonable concurrency abstraction for a small design exercise and defend the choice

If you can recite the SICP vocabulary but cannot implement Concept 11, the module is not complete.

Reading Policy

Concept pages are the main path.
SICP chunks are selective reinforcement, not a second syllabus. Most pages list 3-5 chunks in "Read This Only If Stuck."
External links (Norvig's lispy, Crafting Interpreters, MIT Press SICP) are small and focused.
K&R chunks are for contrast (what mainstream C gives you vs what SICP's Scheme does).
Optional deep dive means additional nuance, not required progression.

Suggested Weekly Flow

Day	Work
1	Concepts 1-3, small ADT drill
2	Concepts 4-6, `make-adder`, `compose`, `map`/`filter`/`reduce` in two languages
3	Concepts 7-9, hand traces of substitution and environment models; tail-recursion drill
4	Concept 10, read Norvig lispy; Concept 11, start the interpreter
5	Finish the interpreter, Concept 12; benchmark analyzed version
6	Concepts 13-15; design the inventory-service comparison
7	Practice 1-4 and the quiz

Reference

If you need exact links into the local SICP chunks, see Reference and Selective Reading.

Build Your Own X - elective

Use this progression: Interpreter for parsing and evaluation, Compiler for bytecode and VM design, Source Code to Machine Code for native lowering, and MLIR Compiler Pipeline only after the basic compiler is working. See Build Your Own X overview.

Rich Learning Pages

Scope of This Module​

Before You Start (Warm-Up)​

Diagnostic Interpretation​

What This Module Is For​

Concept Map​

How To Use This Module​

Cluster 1: Abstraction as the Engineering Move​

Cluster 2: Higher-Order Procedures and Closures​

Cluster 3: Evaluation Models​

Cluster 4: Metacircular and Interpreter Design​

Cluster 5: State, Mutation, and Language Design​

Learning Objectives​

Outputs​

Completion Standard​

Reading Policy​

Suggested Weekly Flow​

Reference​

Rich Learning Pages​