Skip to main content

Module 5: Abstraction & Interpretation

Primary text: Structure and Interpretation of Computer Programs (Abelson, Sussman, Sussman) Contrasts: The C Programming Language (K&R), Computer Organization and Design (Patterson & Hennessy), CODE (Petzold) External selective reading: Peter Norvig's (How to Write a (Lisp) Interpreter (in Python)) (lis.py), Bob Nystrom's Crafting Interpreters

This guide is the primary teacher. You do not need to read SICP cover-to-cover to complete this module. You do need to build an internalized model of how abstraction, evaluation, and interpretation relate -- the model Scheme and Lisp have given the rest of the industry.


Scope of This Module

This is the module where everything in Semester 4 stops being about C-level machine detail and starts being about the linguistic choices that engineers have to make. C gave us pointers and memory; SICP gives us evaluation, environments, and the programs-as-data idea.

What it covers in depth:

  • procedural abstraction: black-box procedures as the most basic engineering move
  • data abstraction, abstraction barriers, and the design of abstract data types
  • layering abstractions to keep large systems readable over years
  • first-class procedures, higher-order functions, closures, lexical scope
  • the substitution model and the environment model as two ways to predict behavior
  • applicative-order vs normal-order evaluation and where laziness helps
  • tail calls and the Scheme claim that iteration is disguised recursion
  • eval and apply, writing a tiny Lisp interpreter, and the staging move toward compilation
  • assignment, the cost of mutation, streams as a pure alternative, and concurrency

What it deliberately does not try to finish here:

  • full compiler construction and optimization passes
  • concurrent-system design (returns in Semester 5/6)
  • type systems, effects, and advanced programming-language theory
  • SICM's physics or the full logic-programming chapter of SICP

This is a language-and-interpretation module. You should finish it able to read any dispatcher-shaped piece of code (SQL planners, template engines, shell interpreters, DSL evaluators) and see it as a small eval / apply.


Before You Start (Warm-Up)

Answer these closed-book. No looking up.

  1. In one sentence, what is the difference between a procedural abstraction and a data abstraction?
  2. Why do Scheme and Python have closures, but C does not (directly)?
  3. Give a one-line program whose output the substitution model cannot predict.
  4. What is a tail call, and why does Scheme make it a language requirement?
  5. What do eval and apply do, and why is the separation valuable?

Diagnostic Interpretation

4-5 solid answers -- you are ready for the full path.

2-3 solid answers -- continue, but expect extra time in Clusters 3 and 4 (evaluation models, interpreter).

0-1 solid answers -- revisit S4M1 (C fundamentals), S4M4 (systems-level programming), and S3's design clusters first. This module assumes you can already read C and Python comfortably.


What This Module Is For

This module shows up every time you are asked questions like:

  • why is this DSL so slow, and what would make it fast?
  • why does my callback see the wrong value of i?
  • why does this recursive function overflow in Python but not in Scheme?
  • what does my ORM do when I call .filter(...).map(...)?
  • what's actually in a closure, and why do I keep leaking memory in this cache?
  • how do I think about a shared mutable map across threads?

It builds the judgment needed for:

  • designing and evaluating small DSLs and rule engines
  • reading the source of a Python, Racket, or JavaScript runtime
  • making informed language choices for a new service
  • understanding why certain styles (functional, reactive, actor-based) exist

You are learning to think about languages, not to memorize Scheme syntax.


Concept Map


How To Use This Module

Work in order. Each cluster depends on the ones before it; the interpreter in Cluster 4 silently uses every idea from Clusters 1-3.

Cluster 1: Abstraction as the Engineering Move

OrderConceptTypeFocus
1Procedural Abstraction: Procedures as Black BoxesPRIMARYNaming computation and hiding the body
2Data Abstraction and Abstract Data TypesPRIMARYConstructors, selectors, abstraction barriers
3Layering Abstractions to Tame ComplexityPRIMARYStratified design, multiple representations

Cluster mastery check: Can you describe, for some existing codebase, at least three abstraction barriers and one place where a barrier has leaked?

Cluster 2: Higher-Order Procedures and Closures

OrderConceptTypeFocus
4First-Class Procedures and Higher-Order FunctionsPRIMARYProcedures as values, arguments, return values
5Closures, Lexical Scope, and Captured EnvironmentsPRIMARYCode + captured frame, lexical scope
6Function Composition, map, filter, reduce as AbstractionsSUPPORTINGPipelines over sequences

Cluster mastery check: Can you explain and reproduce the Python loop-variable closure bug, and fix it?

Cluster 3: Evaluation Models

OrderConceptTypeFocus
7The Substitution Model vs the Environment ModelPRIMARYTwo ways to predict behavior; where they disagree
8Applicative-Order vs Normal-Order EvaluationSUPPORTINGWhen arguments evaluate, and why if is special
9Tail Calls and Iteration as Disguised RecursionPRIMARYRecursive procedure vs iterative process; TCO

Cluster mastery check: Can you trace a closure-using program through both substitution and the environment model, and point to where substitution breaks?

Cluster 4: Metacircular and Interpreter Design

OrderConceptTypeFocus
10What an Interpreter Is: eval and applyPRIMARYThe two-procedure skeleton of any evaluator
11Writing a Small Lisp / Scheme InterpreterPRIMARYBuild one, by hand, in 30-50 lines
12From Interpretation to Compilation: Staging, IRSUPPORTINGSeparating analysis from execution; bytecode sketch

Cluster mastery check: Can you write an evaluator that runs (define fact (lambda (n) (if (= n 0) 1 (* n (fact (- n 1)))))) (fact 5) correctly, and explain which line is responsible for lexical scope?

Cluster 5: State, Mutation, and Language Design

OrderConceptTypeFocus
13Introducing Assignment: the Cost of MutationPRIMARYWhat set! breaks and what it buys
14Streams, Laziness, and Infinite Data StructuresSUPPORTINGPure alternative to mutable time
15Concurrency and Language-Level Abstractions over MutationSUPPORTINGLocks, atomics, actors, immutability

Cluster mastery check: Can you describe a workload where immutable data + an event loop is strictly better than shared mutable state + locks, and one where the reverse is true?

Then work through the practice pages:

OrderPractice pathFocus
1Abstraction and HOF LabBuild a small numerical library in Scheme/Python using HOFs and closures
2Evaluation Models WorkshopTrace programs in both substitution and environment models
3Interpreter Construction ClinicGrow the tiny Lisp interpreter with extra forms and primitives
4Code Katasmap/filter/reduce in C and Scheme; eval/apply in Python or C; tail-call iteration

Use Module Quiz after concepts and practice. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.


Learning Objectives

By the end of this module you should be able to:

  1. Explain and apply procedural and data abstraction in Scheme, Python, and C, and identify when a barrier has leaked.
  2. Describe stratified design and name the layers of a real codebase you work in.
  3. Use first-class procedures to express general methods; write, read, and debug closures.
  4. Use map, filter, reduce, and compose fluently, and know when not to use them.
  5. Trace programs by hand in both the substitution and environment models and identify where they agree and disagree.
  6. Distinguish applicative-order and normal-order evaluation; explain the role of special forms; know what delay/force do.
  7. Distinguish a recursive procedure from a recursive process, and write tail-recursive code that runs in constant space.
  8. Implement a tiny Lisp interpreter with eval and apply in 30-50 lines of Python or Scheme, supporting if, lambda, define, set!, begin, and application.
  9. Describe the spectrum from tree-walking interpreter to bytecode VM to compiler, and name at least two real systems at each rung.
  10. Reason about mutation, streams as a pure alternative, and the main language-level abstractions over shared mutable state.

Outputs

  • one worked set of traces of the same program in substitution and environment models
  • one small numerical library in Scheme using sum, product, integral, and fixed-point, plus a Python port
  • one 30-50 line tree-walking Lisp interpreter that passes the tests in Practice 3
  • one analyzed (staged) version of the same interpreter with a measured speedup
  • one kata log implementing map, filter, reduce in both C and Scheme
  • one written comparison of four concurrency strategies (lock, per-item lock, CAS, actor) for a shared-inventory example
  • one mistake log with at least 8 real bugs such as closure over loop variable, if evaluated both branches, forgot to memoize delay, used the caller's env not the definer's, non-tail recursion killed my Python stack
  • one short memo on how the habits of this module carry into S5 OS/networking and S6 distributed systems

Completion Standard

You have completed Module 5 when all of these are true:

  • you can trace a closure-using program in both evaluation models
  • you can write and debug a tail-recursive Scheme loop and explain its space complexity
  • you can build, from a blank file, a Lisp evaluator that handles if, lambda, define, set!, and application with correct lexical scope
  • you can diagnose a subtle closure or scope bug in Python or JavaScript
  • you can describe, for a real system you use, where in the interpretation-compilation spectrum it sits
  • you can talk about mutation cost with precision instead of slogans
  • you can pick a reasonable concurrency abstraction for a small design exercise and defend the choice

If you can recite the SICP vocabulary but cannot implement Concept 11, the module is not complete.


Reading Policy

  • Concept pages are the main path.
  • SICP chunks are selective reinforcement, not a second syllabus. Most pages list 3-5 chunks in "Read This Only If Stuck."
  • External links (Norvig's lispy, Crafting Interpreters, MIT Press SICP) are small and focused.
  • K&R chunks are for contrast (what mainstream C gives you vs what SICP's Scheme does).
  • Optional deep dive means additional nuance, not required progression.

Suggested Weekly Flow

DayWork
1Concepts 1-3, small ADT drill
2Concepts 4-6, make-adder, compose, map/filter/reduce in two languages
3Concepts 7-9, hand traces of substitution and environment models; tail-recursion drill
4Concept 10, read Norvig lispy; Concept 11, start the interpreter
5Finish the interpreter, Concept 12; benchmark analyzed version
6Concepts 13-15; design the inventory-service comparison
7Practice 1-4 and the quiz

Reference

If you need exact links into the local SICP chunks, see Reference and Selective Reading.


Build Your Own X - elective

Use this progression: Interpreter for parsing and evaluation, Compiler for bytecode and VM design, Source Code to Machine Code for native lowering, and MLIR Compiler Pipeline only after the basic compiler is working. See Build Your Own X overview.

Rich Learning Pages

Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread