Skip to main content

Looking Back Is Where the Compounding Happens

What This Concept Is

Looking back is the fourth and most-skipped Polya phase. It is the deliberate review of a finished solution to extract maximum value from it: verify it, generalize it, simplify it, learn from it.

Four concrete questions structure the phase:

  1. Can I verify the result? Does it satisfy the original conditions? Does it hold on a tiny example and on a boundary?
  2. Can I verify the argument? Is every step sound? Any leaps? What unstated assumption did I rely on?
  3. Can I derive the result differently? A second proof often reveals structure the first one hid.
  4. Can I use the method elsewhere? What class of problems does this technique solve?

Looking back is adversarial: the disposition is to attack your own answer. Re-reading is companionable ("I wrote this, it seems fine"); looking back is suspicious ("where could this be wrong?"). The two activities produce very different artifacts.

The phase is not a luxury. It is the compounding mechanism of all other phases. A solved problem you never reflect on teaches only itself. A solved problem you look back on teaches a category -- and the next time you see the category, you spend minutes, not hours, recognising it.

In Polya's own language, phase four is where heuristic (teachable pattern) is extracted from solution (specific answer). Without it, you keep solving first instances forever; with it, the second and third instances cost a fraction as much.

Why It Matters Here

Most of your long-term growth as a problem solver comes from looking back, not from solving. The difference between "I've done 200 problems" and "I've done 200 problems and reviewed them" is roughly the difference between topic fluency and topic mastery.

In CS:

  • the first solution is rarely the best
  • reviewing generates reusable templates (DP state shapes, invariant styles, data-structure idioms)
  • simplifying the solution makes it teachable, testable, and safer
  • generalising creates library functions, not one-off scripts
  • identifying the method separates "I memorised this" from "I own this"

Skipping this phase is the single biggest cause of "I've solved a hundred problems but I still feel like a beginner." The solved problems did not accumulate into a repertoire because the extraction step was missing.

Forward pipeline: Semester 2 (algorithm design) expects you to name the paradigm you used; that name comes from phase four. Semester 3 (refactoring and design patterns) is looking-back at scale -- patterns are canonicalised reflections. Semester 7 (architecture reviews and post-mortems) is phase four institutionalised: the post-mortem is the organisation's looking-back artifact. Semester 10 (capstone retrospectives) compounds weekly reflections into the final synthesis.

Concrete Examples

Example 1 -- a graph theorem

You just proved: A graph with $n$ vertices and $n$ edges must contain a cycle.

Verify. Tiny case $n = 3$: a triangle has 3 edges and a cycle. A path on 3 vertices has only 2 edges. Boundary case $n = 1$: vacuous (need at least 2 vertices to have an edge). Both check.

Argument. Did the proof require the graph to be connected? Re-read: no, the forest-edge bound applies to any simple graph. Did it require simple edges (no multi-edges)? Yes -- a multigraph with two edges between $u$ and $v$ is already a cycle. Surface the assumption.

Alternative derivation. A forest on $n$ vertices has at most $n - 1$ edges. With $n$ edges we exceed the forest bound, hence the graph is not a forest, hence it contains a cycle. This is shorter and cleaner than the DFS-based argument.

Generalise. Any graph with more than $n - 1$ edges contains a cycle. More sharply: a graph with $n + k$ edges ($k \geq 1$) contains at least $k + 1$ independent cycle-edges. The method extends.

Now the solution is reusable, tighter, and paired with a reusable lemma (the forest-edge bound).

Example 2 -- a probability identity

You just proved: For independent events $A$ and $B$, $P(A \cup B) = P(A) + P(B) - P(A) P(B)$.

Verify. Boundary: if $P(B) = 0$, RHS is $P(A)$. If $A = B$ (degenerate independence, only if $P(A) \in {0, 1}$), RHS equals $P(A)$ too. Check.

Argument. The proof used inclusion-exclusion: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, then substituted $P(A \cap B) = P(A) P(B)$ by independence. The only step that used independence is the substitution. If $A$ and $B$ are not independent, the substitution fails -- so the identity does too.

Alternative derivation. Complement: $P(A \cup B) = 1 - P(A^c \cap B^c) = 1 - P(A^c) P(B^c) = 1 - (1 - P(A))(1 - P(B)) = P(A) + P(B) - P(A) P(B)$. This derivation exposes a useful reusable lemma: for independent events, the complement of the union factors.

Generalise / transfer. The same derivation gives $P(\bigcup_i A_i) = 1 - \prod_i (1 - P(A_i))$ for pairwise-independent... wait -- does that extend to pairwise? Re-check: no, the complement argument requires mutual independence of all $A_i$. This is exactly the sort of boundary that looking-back surfaces.

Common Confusion / Misconceptions

Re-reading is not looking back. Re-reading confirms what you wrote; looking back attacks it. The disposition is adversarial: try to break your own answer.

Stopping at verify. The other three questions (argument, alternative, transfer) are where the learning lives. Most students stop after "result is right" and skip the compounding steps.

Luxury framing. Treating looking back as a luxury "if I have time." The transfer note you skip today is the pattern you will not recognise six months from now.

Over-generalisation. Turning every specific result into a universal claim. Not every solution generalises; some rely on a specific numeric coincidence. Looking back should honestly mark which parts generalise and which do not.

How To Use It

Add a looking-back block to every non-trivial solved problem:

VERIFICATION:
- tiny case check: ...
- boundary check: ...

ARGUMENT AUDIT:
- assumption I used but did not state: ...
- weakest step: ...

ALTERNATIVE:
- a second approach would be: ...
- what does it reveal that the first did not?

TRANSFER:
- this technique applies to: ...
- a problem in another domain with the same shape: ...

Over time, accumulated transfer notes become your personal pattern library. Read them back before starting a new problem in the same family; the recognition move is what separates a five-minute solve from a fifty-minute one.

Transfer / Where This Shows Up Later

  • Semester 2 (algorithm design). The "algorithmic paradigm" label (greedy, D&C, DP, flow) is always a looking-back artifact.
  • Semester 3 (design patterns, refactoring). Patterns are canonical looking-back artifacts: "here is the reusable skeleton extracted from thirty similar solutions."
  • Semester 4 (debugging). Post-incident analysis is phase four at system scale -- "what did we learn that would prevent the next one?"
  • Semester 5 (protocol design). An RFC's "Security Considerations" and "Implementation Experience" sections are looking-back institutionalised.
  • Semester 7 (architecture reviews). ADR "Consequences" and post-decision reviews are phase-four discipline at organisational scale.
  • Semester 10 (capstone). Weekly retrospectives compound insights; without them the capstone becomes fifteen weeks of first-instance problem solving.

Check Yourself

  1. Why is re-reading a solution not the same as looking back at it? What changes between the two?
  2. What new information does an alternative derivation give you that the first derivation did not?
  3. Give an example of a solved problem where the method is more valuable than the answer.
  4. Why does honest "this does not generalise" count as a phase-four success? What would over-generalising cost you?
  5. If you could keep only one of the four looking-back questions, which gives the most long-term compounding, and why?

Mini Drill or Application

Drill A. Pick one solved problem from Module 2 (graphs) or Module 3 (probability). Write a full looking-back block, all four sections. The drill is complete when you have at least one transfer note: a new problem class your method applies to.

Drill B. Take a recent $O(n^2)$ solution you wrote. Look back specifically for efficiency: "can I derive this differently?" Often the looking-back alternative is an $O(n \log n)$ or $O(n)$ algorithm. Write both and compare.

Drill C (unseen). Take a classic: There are infinitely many primes. Write a looking-back block. Then write a second looking-back block focused exclusively on the alternative question: produce three different proofs (Euclid, Euler's divergence of $\sum 1/p$, Furstenberg's topological proof) and note which one makes the generalisation "infinitely many primes of the form $4k + 3$" easiest.

Read This Only If Stuck