Module 5: Abstraction & Interpretation
Primary text: Structure and Interpretation of Computer Programs (Abelson, Sussman, Sussman) Contrasts: The C Programming Language (K&R), Computer Organization and Design (Patterson & Hennessy), CODE (Petzold) External selective reading: Peter Norvig's (How to Write a (Lisp) Interpreter (in Python)) (lis.py), Bob Nystrom's Crafting Interpreters
This guide is the primary teacher. You do not need to read SICP cover-to-cover to complete this module. You do need to build an internalized model of how abstraction, evaluation, and interpretation relate -- the model Scheme and Lisp have given the rest of the industry.
Scope of This Module
This is the module where everything in Semester 4 stops being about C-level machine detail and starts being about the linguistic choices that engineers have to make. C gave us pointers and memory; SICP gives us evaluation, environments, and the programs-as-data idea.
What it covers in depth:
- procedural abstraction: black-box procedures as the most basic engineering move
- data abstraction, abstraction barriers, and the design of abstract data types
- layering abstractions to keep large systems readable over years
- first-class procedures, higher-order functions, closures, lexical scope
- the substitution model and the environment model as two ways to predict behavior
- applicative-order vs normal-order evaluation and where laziness helps
- tail calls and the Scheme claim that iteration is disguised recursion
evalandapply, writing a tiny Lisp interpreter, and the staging move toward compilation- assignment, the cost of mutation, streams as a pure alternative, and concurrency
What it deliberately does not try to finish here:
- full compiler construction and optimization passes
- concurrent-system design (returns in Semester 5/6)
- type systems, effects, and advanced programming-language theory
- SICM's physics or the full logic-programming chapter of SICP
This is a language-and-interpretation module. You should finish it able to read any dispatcher-shaped piece of code (SQL planners, template engines, shell interpreters, DSL evaluators) and see it as a small eval / apply.
Before You Start (Warm-Up)
Answer these closed-book. No looking up.
- In one sentence, what is the difference between a procedural abstraction and a data abstraction?
- Why do Scheme and Python have closures, but C does not (directly)?
- Give a one-line program whose output the substitution model cannot predict.
- What is a tail call, and why does Scheme make it a language requirement?
- What do
evalandapplydo, and why is the separation valuable?
Diagnostic Interpretation
4-5 solid answers -- you are ready for the full path.
2-3 solid answers -- continue, but expect extra time in Clusters 3 and 4 (evaluation models, interpreter).
0-1 solid answers -- revisit S4M1 (C fundamentals), S4M4 (systems-level programming), and S3's design clusters first. This module assumes you can already read C and Python comfortably.
What This Module Is For
This module shows up every time you are asked questions like:
- why is this DSL so slow, and what would make it fast?
- why does my callback see the wrong value of
i? - why does this recursive function overflow in Python but not in Scheme?
- what does my ORM do when I call
.filter(...).map(...)? - what's actually in a closure, and why do I keep leaking memory in this cache?
- how do I think about a shared mutable map across threads?
It builds the judgment needed for:
- designing and evaluating small DSLs and rule engines
- reading the source of a Python, Racket, or JavaScript runtime
- making informed language choices for a new service
- understanding why certain styles (functional, reactive, actor-based) exist
You are learning to think about languages, not to memorize Scheme syntax.
Concept Map
How To Use This Module
Work in order. Each cluster depends on the ones before it; the interpreter in Cluster 4 silently uses every idea from Clusters 1-3.
Cluster 1: Abstraction as the Engineering Move
| Order | Concept | Type | Focus |
|---|---|---|---|
| 1 | Procedural Abstraction: Procedures as Black Boxes | PRIMARY | Naming computation and hiding the body |
| 2 | Data Abstraction and Abstract Data Types | PRIMARY | Constructors, selectors, abstraction barriers |
| 3 | Layering Abstractions to Tame Complexity | PRIMARY | Stratified design, multiple representations |
Cluster mastery check: Can you describe, for some existing codebase, at least three abstraction barriers and one place where a barrier has leaked?
Cluster 2: Higher-Order Procedures and Closures
| Order | Concept | Type | Focus |
|---|---|---|---|
| 4 | First-Class Procedures and Higher-Order Functions | PRIMARY | Procedures as values, arguments, return values |
| 5 | Closures, Lexical Scope, and Captured Environments | PRIMARY | Code + captured frame, lexical scope |
| 6 | Function Composition, map, filter, reduce as Abstractions | SUPPORTING | Pipelines over sequences |
Cluster mastery check: Can you explain and reproduce the Python loop-variable closure bug, and fix it?
Cluster 3: Evaluation Models
| Order | Concept | Type | Focus |
|---|---|---|---|
| 7 | The Substitution Model vs the Environment Model | PRIMARY | Two ways to predict behavior; where they disagree |
| 8 | Applicative-Order vs Normal-Order Evaluation | SUPPORTING | When arguments evaluate, and why if is special |
| 9 | Tail Calls and Iteration as Disguised Recursion | PRIMARY | Recursive procedure vs iterative process; TCO |
Cluster mastery check: Can you trace a closure-using program through both substitution and the environment model, and point to where substitution breaks?
Cluster 4: Metacircular and Interpreter Design
| Order | Concept | Type | Focus |
|---|---|---|---|
| 10 | What an Interpreter Is: eval and apply | PRIMARY | The two-procedure skeleton of any evaluator |
| 11 | Writing a Small Lisp / Scheme Interpreter | PRIMARY | Build one, by hand, in 30-50 lines |
| 12 | From Interpretation to Compilation: Staging, IR | SUPPORTING | Separating analysis from execution; bytecode sketch |
Cluster mastery check: Can you write an evaluator that runs (define fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))) (fact 5) correctly, and explain which line is responsible for lexical scope?
Cluster 5: State, Mutation, and Language Design
| Order | Concept | Type | Focus |
|---|---|---|---|
| 13 | Introducing Assignment: the Cost of Mutation | PRIMARY | What set! breaks and what it buys |
| 14 | Streams, Laziness, and Infinite Data Structures | SUPPORTING | Pure alternative to mutable time |
| 15 | Concurrency and Language-Level Abstractions over Mutation | SUPPORTING | Locks, atomics, actors, immutability |
Cluster mastery check: Can you describe a workload where immutable data + an event loop is strictly better than shared mutable state + locks, and one where the reverse is true?
Then work through the practice pages:
| Order | Practice path | Focus |
|---|---|---|
| 1 | Abstraction and HOF Lab | Build a small numerical library in Scheme/Python using HOFs and closures |
| 2 | Evaluation Models Workshop | Trace programs in both substitution and environment models |
| 3 | Interpreter Construction Clinic | Grow the tiny Lisp interpreter with extra forms and primitives |
| 4 | Code Katas | map/filter/reduce in C and Scheme; eval/apply in Python or C; tail-call iteration |
Use Module Quiz after concepts and practice. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.
Learning Objectives
By the end of this module you should be able to:
- Explain and apply procedural and data abstraction in Scheme, Python, and C, and identify when a barrier has leaked.
- Describe stratified design and name the layers of a real codebase you work in.
- Use first-class procedures to express general methods; write, read, and debug closures.
- Use
map,filter,reduce, andcomposefluently, and know when not to use them. - Trace programs by hand in both the substitution and environment models and identify where they agree and disagree.
- Distinguish applicative-order and normal-order evaluation; explain the role of special forms; know what
delay/forcedo. - Distinguish a recursive procedure from a recursive process, and write tail-recursive code that runs in constant space.
- Implement a tiny Lisp interpreter with
evalandapplyin 30-50 lines of Python or Scheme, supportingif,lambda,define,set!,begin, and application. - Describe the spectrum from tree-walking interpreter to bytecode VM to compiler, and name at least two real systems at each rung.
- Reason about mutation, streams as a pure alternative, and the main language-level abstractions over shared mutable state.
Outputs
- one worked set of traces of the same program in substitution and environment models
- one small numerical library in Scheme using
sum,product,integral, andfixed-point, plus a Python port - one 30-50 line tree-walking Lisp interpreter that passes the tests in Practice 3
- one analyzed (staged) version of the same interpreter with a measured speedup
- one kata log implementing
map,filter,reducein both C and Scheme - one written comparison of four concurrency strategies (lock, per-item lock, CAS, actor) for a shared-inventory example
- one mistake log with at least 8 real bugs such as
closure over loop variable,if evaluated both branches,forgot to memoize delay,used the caller's env not the definer's,non-tail recursion killed my Python stack - one short memo on how the habits of this module carry into S5 OS/networking and S6 distributed systems
Completion Standard
You have completed Module 5 when all of these are true:
- you can trace a closure-using program in both evaluation models
- you can write and debug a tail-recursive Scheme loop and explain its space complexity
- you can build, from a blank file, a Lisp evaluator that handles
if,lambda,define,set!, and application with correct lexical scope - you can diagnose a subtle closure or scope bug in Python or JavaScript
- you can describe, for a real system you use, where in the interpretation-compilation spectrum it sits
- you can talk about mutation cost with precision instead of slogans
- you can pick a reasonable concurrency abstraction for a small design exercise and defend the choice
If you can recite the SICP vocabulary but cannot implement Concept 11, the module is not complete.
Reading Policy
- Concept pages are the main path.
- SICP chunks are selective reinforcement, not a second syllabus. Most pages list 3-5 chunks in "Read This Only If Stuck."
- External links (Norvig's lispy, Crafting Interpreters, MIT Press SICP) are small and focused.
- K&R chunks are for contrast (what mainstream C gives you vs what SICP's Scheme does).
Optional deep divemeans additional nuance, not required progression.
Suggested Weekly Flow
| Day | Work |
|---|---|
| 1 | Concepts 1-3, small ADT drill |
| 2 | Concepts 4-6, make-adder, compose, map/filter/reduce in two languages |
| 3 | Concepts 7-9, hand traces of substitution and environment models; tail-recursion drill |
| 4 | Concept 10, read Norvig lispy; Concept 11, start the interpreter |
| 5 | Finish the interpreter, Concept 12; benchmark analyzed version |
| 6 | Concepts 13-15; design the inventory-service comparison |
| 7 | Practice 1-4 and the quiz |
Reference
If you need exact links into the local SICP chunks, see Reference and Selective Reading.
Use this progression: Interpreter for parsing and evaluation, Compiler for bytecode and VM design, Source Code to Machine Code for native lowering, and MLIR Compiler Pipeline only after the basic compiler is working. See Build Your Own X overview.
Rich Learning Pages
Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread