Module 3: Probability & Statistics: Case Studies

These cases focus on uncertainty, measurement, and avoiding conclusions that the data does not support.

Case Study 1: A/B Test With Too Little Data

Scenario: A signup page variant gets 12 conversions out of 100 visitors while the old page gets 10 out of 100. The team wants to ship immediately.

Source anchor: CMU notes on probability for randomized algorithms.

Module concepts:

random variation
sample size
confidence
decision risk

Wrong Approach

Declare the variant better because the observed conversion rate is higher.

Better Approach

Estimate uncertainty, define the minimum effect worth detecting, and avoid overreading small samples. Decide whether to collect more data or make a low-risk product decision.
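One way to put a number on the uncertainty in this scenario is a normal-approximation interval for the difference between the two conversion rates. The sketch below is illustrative only: the function name compare_conversion_rates is hypothetical, the 1.96 multiplier assumes a 95% interval, and the normal approximation is itself rough at 100 visitors per arm.

```python
import math

def compare_conversion_rates(conv_old, n_old, conv_new, n_new, z_crit=1.96):
    """Summarize the difference between two observed conversion rates.

    Returns the observed difference, an approximate interval for it,
    a pooled z statistic, and a two-sided p-value (normal approximation).
    """
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    diff = p_new - p_old

    # Unpooled standard error, used for the interval on the difference.
    se = math.sqrt(p_old * (1 - p_old) / n_old + p_new * (1 - p_new) / n_new)
    low, high = diff - z_crit * se, diff + z_crit * se

    # Pooled standard error, used to test "no real difference".
    pooled = (conv_old + conv_new) / (n_old + n_new)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = diff / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

    return diff, (low, high), z, p_value

# Counts from the scenario: old page 10/100, variant 12/100.
diff, (low, high), z, p_value = compare_conversion_rates(10, 100, 12, 100)
print(f"observed lift: {diff:+.3f}")
print(f"approx. 95% interval: [{low:+.3f}, {high:+.3f}]")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these counts the interval runs from roughly -6.7 to +10.7 percentage points, so the data is compatible with the variant being worse than, equal to, or better than the old page.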

Tradeoff Table

Choice	Gain	Cost
Ship immediately	Fast iteration	High false-positive risk
Collect more data	Better evidence	Slower decision
Use prior/product judgment	Practical	Less statistically rigorous

Failure Mode

The team ships noise and later sees conversion return to baseline.

Required Artifact

Write an experiment decision memo with observed rates, uncertainty concern, practical significance, and next action.

Project / Capstone Connection

Use this memo structure later whenever a project claim depends on measured improvement rather than intuition.

Case Study 2: Expected Value in Retry Costs

Scenario: A client retries failed requests up to three times. Engineers measure only whether the user eventually succeeds and ignore the extra backend load that retries create.

Source anchor: CMU notes on probability for randomized algorithms.

Module concepts:

expected value
independent trials
tail behavior
cost modeling

Wrong Approach

Assume retries are free because each individual retry is fast.

Better Approach

Model expected attempts per request using failure probability, then estimate load under normal and degraded conditions. Add caps, jitter, and observability around retry storms.
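The expected-attempts model can be sketched in a few lines. This version assumes that "up to three retries" means one initial attempt plus at most three retries (four attempts in total) and that each attempt fails independently with the same probability. Both are simplifying assumptions, and the function names are hypothetical.

```python
def expected_attempts(p_fail, max_retries=3):
    """Expected number of attempts per request with a capped retry count.

    Attempt k+1 only happens if the first k attempts all failed,
    which has probability p_fail ** k under the independence assumption.
    """
    max_attempts = 1 + max_retries
    return sum(p_fail ** k for k in range(max_attempts))

def final_failure_probability(p_fail, max_retries=3):
    """Probability that every allowed attempt fails."""
    return p_fail ** (1 + max_retries)

for p in (0.01, 0.10, 0.50, 1.00):
    print(
        f"p_fail={p:.2f}  "
        f"expected attempts={expected_attempts(p):.4f}  "
        f"still failing={final_failure_probability(p):.6f}"
    )
```

At a 50% failure rate each request costs 1.875 attempts on average, and in a full outage every request costs all four. The independence assumption is also weakest during an outage, when failures tend to be correlated, so the degraded-condition numbers are better read as a floor than a forecast.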

Tradeoff Table

Choice	Gain	Cost
Aggressive retries	Better chance of success	Amplifies outages
Capped retries	Limits load	Some requests fail sooner
Backoff with jitter	Reduces synchronization	More logic

Failure Mode

During an outage, retries multiply traffic and make recovery slower.

Required Artifact

Calculate expected attempts for failure probabilities 1%, 10%, and 50%, and write a retry policy note.

Project / Capstone Connection

Bring this expected-value note into later reliability work so retry logic is justified against system load, not only success odds.

Case Study 3: Misleading Average Latency

Scenario: A service reports average latency of 120 ms, but some users experience multi-second delays. The dashboard hides tail latency.

Source anchor: CMU notes on probability for randomized algorithms.

Module concepts:

mean vs distribution
percentiles
sampling
outliers

Wrong Approach

Track only the average and call the service healthy.

Better Approach

Inspect the distribution: median, p90, p95, p99, and outlier causes. Match the metric to user experience and alert on tail behavior where appropriate.
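To see how a 120 ms average can coexist with multi-second delays, the sketch below builds a small made-up sample and summarizes it with nearest-rank percentiles. The sample values and the helper name percentile_nearest_rank are hypothetical, chosen only so that the mean matches the scenario. Production systems typically compute percentiles from histograms or streaming sketches rather than sorted raw samples.

```python
import statistics

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct% of n
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: mostly fast, a few very slow.
latencies_ms = [40] * 90 + [500] * 8 + [2200] * 2

mean_ms = statistics.mean(latencies_ms)
summary = {f"p{pct}": percentile_nearest_rank(latencies_ms, pct) for pct in (50, 90, 95, 99)}

print(f"mean = {mean_ms} ms")
for name, value in summary.items():
    print(f"{name} = {value} ms")
```

In this sample the mean is 120 ms, while the median is 40 ms, p95 is 500 ms, and p99 is 2200 ms. The average describes almost no real request: most are much faster and a few are far slower.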

Tradeoff Table

Choice	Gain	Cost
Mean only	Simple number	Hides tails
Percentiles	Better user signal	Requires more data care
Full histogram	Rich diagnosis	More storage and analysis

Failure Mode

The service looks healthy while high-value users hit slow paths.

Required Artifact

Produce a latency summary with mean, median, p95, p99, and one hypothesis for tail behavior.

Project / Capstone Connection

Reuse this latency summary format in observability dashboards and incident reviews during later systems and production semesters.

Case Study 4: Base Rate Neglect In Fraud Alerts

Scenario: A fraud detector flags 95% of fraudulent transactions, but only 0.2% of all transactions are actually fraudulent. The team assumes a flagged transaction is almost certainly fraudulent.

Source anchor: Khan Academy on expected value and probability, a practical reference for reasoning from probabilities instead of intuition alone.

Module concepts:

base rate
conditional probability
false positives
decision thresholds

Wrong Approach

Judge the alert system only by its detection rate.

Better Approach

Combine the detector's recall and false-positive rate with the base rate of fraud. Ask how many flagged transactions are true fraud, how many are false positives, and how much review load the operations team can absorb.
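The base-rate arithmetic can be made concrete with expected counts. The scenario gives the detection rate (95%) and the prevalence (0.2%) but not the false-positive rate, so the 1% used below is a hypothetical value for illustration. The function name confusion_counts is also an assumption.

```python
def confusion_counts(total, prevalence, recall, false_positive_rate):
    """Expected confusion-matrix counts and precision for a binary detector."""
    fraud = total * prevalence
    legit = total - fraud

    true_pos = fraud * recall                # fraud that gets flagged
    false_neg = fraud - true_pos             # fraud that slips through
    false_pos = legit * false_positive_rate  # good transactions flagged anyway
    true_neg = legit - false_pos

    # Precision: of everything flagged, what share is really fraud?
    precision = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
        "precision": precision,
    }

# 0.2% prevalence and 95% recall come from the scenario;
# the 1% false-positive rate is a hypothetical assumption.
result = confusion_counts(
    total=100_000, prevalence=0.002, recall=0.95, false_positive_rate=0.01
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions about 190 of the roughly 1,188 flagged transactions are real fraud, so a flag is correct only about 16% of the time even though the detector catches 95% of fraud.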

Tradeoff Table

Choice	Gain	Cost
Aggressive flagging	Catches more fraud	More manual review noise
Conservative threshold	Fewer false positives	Misses some fraud
Base-rate analysis	Realistic operating picture	Requires better probability literacy

Failure Mode

The team overwhelms reviewers and frustrates good customers because it ignored how rare fraud is in the full population.

Required Artifact

Write a confusion-matrix note for 100,000 transactions with stated fraud prevalence, detector recall, and false-positive rate.

Project / Capstone Connection

Use this note later when evaluating alerts, classifiers, monitoring thresholds, or any project metric that depends on rare events.

Case Study 5: Median Salary Looks Fine, But The Distribution Is Skewed

Scenario: A bootcamp reports a median graduate salary that looks healthy, but the underlying outcomes vary widely across regions and experience levels, and include a small number of unusually high offers.

Source anchor: NIST Engineering Statistics Handbook: Percentiles, a good reference for thinking beyond a single summary number when describing a distribution.

Module concepts:

median
percentiles
skewed distributions
summary choice

Wrong Approach

Assume one central statistic is enough to describe the whole outcome picture.

Better Approach

Report a small distribution summary: median plus p25 and p75, or another percentile range appropriate to the decision. Explain what kind of variation the summary hides and who might be affected.
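A minimal sketch of such a summary, using the standard library's statistics.quantiles on a made-up set of salaries (in thousands). The data is hypothetical and exists only to show how the quartiles, mean, and maximum diverge when a few offers are unusually high.

```python
import statistics

# Hypothetical graduate salaries in thousands; two unusually high offers.
salaries_k = [48, 52, 55, 58, 60, 62, 65, 70, 75, 180, 240]

# quantiles(n=4) returns the three quartile cut points: p25, median, p75.
p25, median, p75 = statistics.quantiles(salaries_k, n=4)
mean = statistics.mean(salaries_k)

print(f"median = {median:.0f}k")
print(f"p25-p75 range = {p25:.0f}k to {p75:.0f}k")
print(f"mean = {mean:.1f}k, max = {max(salaries_k)}k")
```

Here the median is 62k and the middle half of graduates falls between 55k and 75k, while the mean sits above the 75th percentile because two offers pull it upward. Reporting the range alongside the median makes that skew visible.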

Tradeoff Table

Choice	Gain	Cost
Single summary number	Easy communication	Hides spread and skew
Percentile range	Better outcome picture	More explanation needed
Full histogram	Richest detail	Heavier to present

Failure Mode

Stakeholders make decisions from a clean-looking headline statistic while ignoring the real spread of outcomes.

Required Artifact

Write a one-page metric summary that includes median, percentile range, and a short note on what the distribution shape implies.

Project / Capstone Connection

Reuse this summary style later for latency, compensation, experiment, or survey metrics whenever a single mean or median would hide too much.

Source Map

Source	Use it for
CMU probability notes	Expected value, randomized reasoning, and uncertainty vocabulary for experiments, retries, and decision risk.
Khan Academy expected value	Accessible reinforcement for probability-based decisions when intuition alone is likely to mislead.
NIST Engineering Statistics Handbook: Percentiles	Percentile-based summaries and discussions of spread when averages or medians alone hide what users actually experience.
