Skip to main content

Problem-Solving Under Deadlines: Triage Under Partial Information

What This Concept Is

Most of Module 5 assumes you have enough time to do the problem right: understand fully, plan carefully, execute deliberately, look back. Real work does not always grant that time. You are often expected to:

  • produce a defensible answer under a deadline
  • decide without complete information
  • choose the "second-best" plan because the best plan will not finish in time
  • know what to abandon when you cannot do everything

Triage is the discipline of allocating limited effort to the sub-problem that pays off fastest, and explicitly accepting that other sub-problems will not be solved. Shipping something defensible is the act of delivering a partial solution that states its own assumptions and limits, so a reviewer can trust what was decided and identify what is still open.

This is a supporting concept because it is not a new heuristic; it is the selection policy that decides which of the prior concepts to deploy when the clock is a binding constraint. It sits on top of concepts 1-12: understanding, planning, execution, and the heuristics still matter, but their depth is tuned by the deadline.

The discipline has three interlocking sub-skills: (1) scoping -- deciding which parts of the problem you will actually attempt; (2) sequencing -- ordering attempts by information-per-unit-time; (3) writing up -- producing a report that survives handoff, so a decision stands even if you run out of time.

Why It Matters Here

In school, a missed deadline is a lower grade. In practice, a missed deadline is usually worse than a partial answer -- it means a decision is made without you.

This concept matters because:

  • interview settings are deadline-bounded problem solving under partial information
  • on-call incident response is triage: you cannot read every log; you must decide which to read
  • the capstone in Semester 10 will surface real constraints where "optimal" is unreachable within the time budget
  • engineering work in any job is a steady stream of time-boxed investigations

The habits trained in concept 2 (devising a plan) and concept 11 (debugging reasoning) do not disappear under a deadline -- they get compressed. Triage decides which of them to compress further and which to protect.

Forward pipeline: Semester 2 (algorithmic approximations when exact is infeasible in time), Semester 3 (refactor vs. patch decisions under ship pressure), Semester 4 (incident response), Semester 5 (backpressure and shedding as systems-level triage), Semester 7 (architectural decisions with incomplete data -- an ADR with "reversibility" cost baked in), Semester 10 (capstone scope triage: cutting features to hit a demo date).

Concrete Examples

Example 1 -- the 30-minute incident

Scenario. A teammate escalates: "Payments dashboard is showing wrong totals for the last hour. The CFO wants an update in 30 minutes."

You have three candidate sub-problems, not equally cheap:

  1. Read the dashboard query to see whether the aggregation is correct (5 minutes, moderate information).
  2. Diff the last deploy to see whether any code path touching totals changed (10 minutes, high information if the answer is "yes").
  3. Compare raw payment events to the aggregated totals (20 minutes, definitive but expensive).

Triage protocol:

  • Which sub-problem is fastest to check? (1).
  • Which sub-problem is definitive? (3).
  • Which sub-problem is most likely to produce a lead? (2), because incidents correlate with recent changes.

A novice picks (3) because it is definitive. In 30 minutes, this is the wrong call: you spend the budget on one path and have no fallback if the path is inconclusive.

A practiced solver pipelines: start (3) running in the background (it needs 20 minutes anyway), then do (1) (5 min) and (2) (10 min) interactively. At minute 15 you already have two leads; at minute 25 you have the definitive check.

The update at minute 30 is a defensible partial answer: "Query is correct. Recent deploy touched the rollup job; suspect commit linked. Raw comparison confirms the discrepancy matches the commit's change. I am rolling back; remediation in 10 minutes."

That is triage: parallelising where possible, ordering by information-per-minute, and scoping the report to state what is known and what is not.

Example 2 -- the take-home exam

Scenario. A take-home exam with six problems and four hours. You estimate:

ProblemConfidenceEstimated time
1high20 min
2high30 min
3medium45 min
4low60 min
5medium45 min
6unknown?

Total estimate: $20 + 30 + 45 + 60 + 45 + ?$ = at least 3h20m before problem 6, leaving $\le$ 40 min for it. A novice works straight through top-to-bottom, gets stuck on 4, sinks 90 minutes, and ships 3.5 problems.

Triage:

  1. Skim all six first. Spend 10 minutes reading every problem before writing anything. This reveals which "low-confidence" problems are actually easy and which are hard.
  2. Schedule by expected-points-per-minute. Do the high-confidence problems first -- they lock in points and warm up your thinking. Order: 1, 2, then the cheaper of ${3, 5}$, then the other.
  3. Timebox the hard problem. Give problem 4 a strict 45-minute slot. If you are not 50% of the way at minute 20, move on; come back if time remains.
  4. Reserve the last 20 minutes for partial-credit write-ups on unfinished problems: state the approach, the known-true sub-result, and what you would do with more time. Graders routinely award partial credit to well-structured partial solutions.

With triage, you ship 5 full solutions and 1 defensible partial. Without, you ship 3.5 solutions of mixed completeness.

Common Confusion / Misconceptions

Perfectionism under time pressure. Treating a 30-minute problem like a homework problem. Refusing to deliver a partial answer because it is "not done." Under a deadline, "done well enough with explicit caveats" beats "perfect but late."

Irreversibility blindness. Not distinguishing reversible from irreversible decisions. A reversible decision (which of two debugging paths to try first) can be made quickly; if you are wrong, you back out. An irreversible decision (deploying a destructive migration, sending an external announcement, rolling back a schema change) must not be compressed. Always ask: "if I am wrong, how expensive is the recovery?" This is the same one-way / two-way door distinction that appears at architecture scale in Semester 7.

Anchoring on the first plan. Under pressure, the first plan feels safest because it was already decided. But a worse first plan defended under stress costs more than a better second plan adopted after three minutes of reflection. Timebox your first plan; if it is not producing, switch.

Skipping the write-up. Delivering a partial result verbally, without the assumptions and open items written down. Without the writeup, the next person cannot resume your work; you have traded speed for lost institutional knowledge.

How To Use It

Triage protocol:

  1. Name the deadline explicitly: "decision at 2:30 pm." Not "soon."
  2. List the sub-problems that could produce the decision.
  3. For each, estimate:
    • cost (minutes of focused work)
    • information value (what does solving it tell you that you do not already know?)
    • reversibility (if I act on the answer and I am wrong, how expensive is recovery?)
  4. Order by information per minute, excluding any sub-problem whose required cost exceeds the remaining budget.
  5. Run irreversible decisions last, with the most information.
  6. Reserve the last 10-15% of the budget for writeup: assumptions made, items still open, what would change the conclusion.
  7. Deliver early, not just on time. If the answer is defensible at minute 20 of 30, consider delivering at minute 20 and using the remaining 10 minutes for verification or a follow-up note.

Writeup template for a deadline-bounded answer:

  • What I was asked.
  • What I decided.
  • What evidence supports the decision (1-3 bullets).
  • What I did not check, and why.
  • What would change my conclusion if it came in.

This structure lets a stakeholder trust the decision without having to re-derive it, and lets the next shift pick up the open items.

Transfer / Where This Shows Up Later

  • Semester 2 (algorithms). Choosing a polynomial-time approximation over an exponential-time exact solution when the input is too large is algorithmic triage.
  • Semester 3 (refactoring). "Should I refactor this safely across a month, or patch it in a day to unblock the release?" is a direct triage question.
  • Semester 4 (systems & incidents). Incident response is this concept; the runbook is a triage protocol.
  • Semester 5 (networks/distributed systems). Load shedding, backpressure, circuit breakers: systems-level triage under stress.
  • Semester 7 (architecture). Risk-driven architecture review (see S7 M5) explicitly ranks review effort by risk × cost: triage applied to architectural concerns. Reversibility of decisions is a first-class input.
  • Semester 10 (capstone). Scope triage under deadlines is the capstone's most common failure mode. Students who cannot cut features to hit a date ship nothing; students who can cut cleanly ship a defensible demo.

Check Yourself

  1. When a sub-problem is definitive but slow, what is the correct way to use it inside a short time budget?
  2. Why is distinguishing reversible from irreversible decisions the first triage move, not the last?
  3. What goes into the final 10-15% of the time budget, and why is reserving it non-negotiable?
  4. Give an example where the second-best plan is correctly preferred over the best plan, purely on deadline grounds.
  5. For the 30-minute incident example, suppose the raw-comparison sub-problem (3) would take 40 minutes instead of 20. How does the triage change?

Mini Drill or Application

Drill A. Pick a short real-world or contrived scenario (e.g. "a unit test is failing intermittently, and you have one hour before the release cut"). Write, in the order produced:

  1. the sub-problems you would consider, at least four
  2. for each, the estimated cost, information value, and reversibility
  3. the triage ordering you would use, with a one-sentence justification
  4. the one-paragraph defensible writeup you would deliver if the deadline arrived with only two of the sub-problems checked

If two people do this drill on the same scenario, compare the triage orderings. The differences will expose your instincts about information-per-minute.

Drill B. Apply the take-home-exam schedule in Example 2 to a real upcoming deadline of yours (a project, an exam, a mock interview). Write the schedule before you start the work. Afterwards, compare planned versus actual and note where your estimates were wrong.

Drill C (unseen). You have 10 minutes to decide whether to roll back a deploy. The rollback is reversible but visible to customers; continuing is irreversible (data migration). You have two signals: an error-rate spike (clear, just appeared) and a user-report trickle (ambiguous, only three reports so far). Apply the triage protocol. What do you ship at minute 10?

Read This Only If Stuck