Counting Models, Equally Likely Outcomes, and Event Algebra

What This Concept Is

When every outcome in a finite sample space is equally likely, [ P(A) = \frac{|A|}{|S|}. ] That formula is powerful, but only after the equally-likely structure has been justified. Cluster 1 is about getting the space right; this concept is about working inside a well-chosen space.

Once the model is fixed, event algebra lets you assemble complicated events from simple ones using the set operations you learned in Module 1:

union ( A \cup B ) -- "A or B (or both)"
intersection ( A \cap B ) -- "A and B"
complement ( A^c = S \setminus A ) -- "not A"
difference ( A \setminus B ) -- "A but not B"
symmetric difference ( A \triangle B ) -- "exactly one of A, B"

These are set operations first and probability operations second. The bridge is that ( P ) respects unions of disjoint sets and that ( P(A^c) = 1 - P(A) ).

Three counting primitives from Module 2 show up constantly here:

Permutations: ( n! / (n-k)! ) ordered selections of ( k ) items from ( n ).
Combinations: ( \binom{n}{k} = n!/(k!(n-k)!) ) unordered selections.
Tuples / product rule: ( n^k ) ways to fill ( k ) slots each with ( n ) options independently.

Each primitive corresponds to a distinct sample-space shape: permutations for ordered selections without replacement, combinations for unordered selections, tuples for independent (with-replacement) draws. Pick the one that matches the process, not the one that matches the arithmetic you remember.

The reason this concept exists as its own page is that counting and probability are taught together in almost every textbook, and that coupling hides a trap: a counting formula only becomes a probability when the random process samples each counted configuration uniformly. Importing ( \binom{n}{k} ) into a probability question without checking that assumption is the #1 source of wrong answers in introductory probability.

The naive-probability formula ( |A|/|S| ) is deceptively simple and hides three independent prerequisites:

( S ) is finite.
The weighting on ( S ) is uniform under the generating process.
Your count of ( |A| ) and ( |S| ) is in the same representation.

Violate any one, and the ratio is not a probability. Most real-world errors come from violation 2 (uniformity) or violation 3 (mixed representations), not from counting arithmetic.

Why It Matters Here

Module 2 taught you to count structured sets. This module reuses that skill, but each count now becomes a probability only when the stochastic process supports equal weighting. In CS this matters constantly: hash collisions, birthday paradox bounds, random permutation analysis, fair card shuffles, poker-style combinatorial probability. In all of these, the equally-likely assumption is either true by construction (the hash is drawn uniformly from a hash family) or false in an interesting way (the shuffle is biased, the deck was not well-shuffled).

Getting the event algebra right also matters downstream. In Semester 2, the probability that quicksort picks a "bad" pivot in at least one recursion level is computed by the complement of "every level picks a median-ish pivot." In Semester 8, the probability that a request hits any one of several slow backends is a union, often bounded by a union bound rather than computed exactly. These are event-algebra moves.

A useful mental test before reaching for a counting formula: "under this random process, is each counted object visited with the same probability?" If yes, the naive ratio works. If no, you are either counting the wrong thing or using the wrong representation. This single check catches most textbook-style errors at their source.

Concrete Examples

Example 1 -- Exactly two aces in a 5-card hand. A 5-card hand is dealt uniformly from a standard 52-card deck. Model the outcomes as unordered 5-card hands; then all ( \binom{52}{5} ) hands are equally likely. Favorable hands choose 2 of the 4 aces and 3 of the 48 non-aces: [ P(\text{exactly two aces}) = \frac{\binom{4}{2}\binom{48}{3}}{\binom{52}{5}} = \frac{6 \cdot 17296}{2598960} \approx 0.0399. ] The ratio is valid because the numerator and denominator are both counts of unordered hands. If you mixed ordered sequences in the numerator and unordered hands in the denominator, the ratio would be meaningless even though each factor is arithmetically correct.

Example 2 -- Birthday collision in a 23-person room. Assume 365 equally-likely birthdays, independent across people. The easier event is the complement: all 23 birthdays are distinct. The sample space is the set of length-23 tuples over 365 days, of size ( 365^{23} ). The count of tuples with no repeats is ( 365 \cdot 364 \cdots 343 ). So [ P(\text{no shared birthday}) = \frac{365 \cdot 364 \cdots (365 - 22)}{365^{23}} \approx 0.493, ] and ( P(\text{shared birthday}) \approx 0.507 ). The complement move is what made this tractable; a direct count of "at least one shared birthday" would require inclusion-exclusion and is much uglier. This example reappears verbatim in Semester 2 under hash-collision bounds and in Semester 6 under UUID-collision risk at scale.

Example 3 -- Distinct hash values under ( n ) insertions. Insert ( n ) items into a hash table with ( m ) buckets, each hash uniform and independent. The probability that no two items collide is [ \prod_{i=0}^{n-1} \frac{m - i}{m} \approx \exp!\left(-\frac{n(n-1)}{2m}\right), ] using ( 1 - x \approx e^{-x} ) for small ( x ). Setting this equal to ( 1/2 ) gives the rule of thumb ( n \approx \sqrt{m \ln 2} \approx 0.83 \sqrt{m} ). For ( m = 2^{64} ), collisions become 50/50 around ( n \approx 5 \cdot 10^9 ) -- an essential design number for hash-based identity systems.

Common Confusion / Misconceptions

"Ordered vs unordered is just a choice; both give the same answer." True for the final probability, but only if you are consistent. Mixing ordered in the numerator and unordered in the denominator is a silent bug. Pick one representation, stick with it for both counts.

"If I see ( \binom{n}{k} ), it must be a probability." No. ( \binom{n}{k} ) counts subsets. It becomes a probability only after dividing by the total number of equally-likely outcomes in the same representation.

"I can always just count and divide." Only when the process is uniform on that representation. Card hands from a well-shuffled deck are uniform. The sequence of players' hole cards dealt around a table is also uniform, but on a different space. Pick the one that matches the question.

"Equally likely is obvious by symmetry." Symmetry must come from the generating process, not from the shape of the outcomes. A biased coin yields a perfectly symmetric sample space ( {H, T} ) but non-uniform weights.

"Multiplying counts is always valid." The product rule ( |A \times B| = |A| \cdot |B| ) is valid only when the choices are genuinely independent -- the options for slot 2 do not depend on what was put in slot 1. Draws without replacement violate this; the second-slot count depends on the first.

"Two formulas yielding the same number must both be right." They can agree by coincidence in the specific numbers chosen, and disagree for other values. Always check a second, easy-to-compute case (e.g., ( k = 0 ) or ( k = n )) to distinguish.

"Complement is rare in practice." On the contrary -- nearly every "at least one" question is handled by complement. The phrase is so common that "see 'at least one' -> reach for complement" is a near-automatic reflex.

How To Use It

Use this sequence every time a problem mentions "random":

Define the outcomes. Sequences, sets, or summary counts?
Decide whether those outcomes are equally likely under the process described.
Describe the event as a subset of that space.
Count ( |A| ) and ( |S| ) in the same representation.
If the event is easier by complement or by a disjoint case split, do that first. "At least one" -> complement. "A or B" -> inclusion-exclusion. "Exactly k of n" -> binomial-style count.
Simplify using basic combinatorics from Module 2.
Sanity-check. Is ( P(A) \in [0,1] )? Does the trivial case (( k = 0 ) or ( k = n )) give a sensible value?
Cross-check with a second representation. If you computed a probability using unordered hands, recompute via ordered sequences and confirm the answers agree. Disagreement almost always reveals a subtle counting bug.
When the numbers get big, switch to logs or approximations. Stirling's formula ( n! \approx \sqrt{2\pi n}(n/e)^n ) is the standard tool; for the birthday-style product ( \prod (1 - k/m) ), the approximation ( \exp(-\sum k/m) ) is usually accurate to within a fraction of a percent.

Transfer / Where This Shows Up Later

Semester 2 (Algorithms). Expected comparisons in randomized quicksort are computed by indicator-variable sums over unordered pairs, not ordered pairs. Picking the right representation is a first-order decision.
Semester 2 (Hashing). The probability of a collision in a hash table with ( n ) items and ( m ) slots is ( 1 - \binom{m}{n} n! / m^n ) for the birthday-style model; derivation is exactly Example 2.
Semester 5 (Load balancing). "Probability that some server sees more than ( c \log n / \log \log n ) requests under random routing of ( n ) requests to ( n ) servers" is a classic counting-plus-union-bound argument.
Semester 6 (ID generation). Collision risk for ( k ) random IDs drawn from a universe of size ( N ) is the birthday formula; UUIDv4's ( N = 2^{122} ) makes the collision risk negligible even at planet scale.
Semester 6 (Consistent hashing). Load imbalance under consistent hashing is analyzed via bins-and-balls counting; the spread of "how loaded is the most-loaded shard" is a direct counting-model statement.
Semester 8 (Multi-shard queries). The probability that a user query avoids every one of several slow shards decomposes via event algebra across shards.
Semester 9 (Observability). When you report "the probability that a specific trace was sampled by any of our ( k ) samplers" in head-based vs tail-based sampling, the answer is a union of non-disjoint events, and inclusion-exclusion or the union bound is the right tool.

Check Yourself

Why must the numerator and denominator of a naive probability be counted in the same representation?
When does ( |A \cup B| = |A| + |B| ) hold? What is the penalty for forgetting the intersection term?
Why is "uniformly chosen 5-card hand" different from "uniformly chosen rank pattern"?
You have a population of 1,000,000 user IDs and you draw 50 uniformly at random without replacement. Why can the sampling-with-replacement (binomial-style) approximation be acceptable here but not for drawing 500,000?
Give an example where two sample-space representations (e.g., ordered vs unordered) yield the same probability, and one where the representations are incompatible.
In the ( k )-item hash-collision calculation, explain why the approximation ( \exp(-k(k-1)/(2m)) ) is excellent for ( k \ll \sqrt{m} ) but breaks down as ( k ) approaches ( m ).
Why does the birthday rule of thumb give ( \sqrt{m} )-scale collisions rather than ( m )-scale ones? What does this mean for the bit-length of random IDs you use in production?

Mini Drill or Application

A useful diagnostic before you start: for every problem that mentions "random," speak aloud one sentence of the form "each [outcome shape] is equally likely." If you cannot, stop and re-read the problem -- the sample space you are picking may not actually be uniform under the described process.

For each question, decide whether the sample space should be modeled by ordered sequences or unordered selections, then compute:

3-letter password from distinct letters (26 options) -- probability it contains the letter 'A'.
3-person committee from 10 students -- probability a specific student is included. (Both the unordered count ( \binom{9}{2}/\binom{10}{3} = 3/10 ) and the direct probability argument "one of 10, three slots" give the same answer; this is a useful consistency check.)
Two cards drawn without replacement -- probability both are red.
Poker: probability of a flush (five cards, same suit) from a 52-card deck.
Random ID generation: what is the probability of a duplicate among 1 million randomly generated 64-bit IDs? (Use the birthday approximation; answer is on the order of ( 10^{-8} ), comfortably safe.)
Load balancer: 100 independent requests hashed to 8 buckets. Probability that bucket 1 gets zero requests is ( (7/8)^{100} \approx 1.5 \cdot 10^{-6} ).

Simulation drill. In Python, simulate 50,000 rooms of 23 people with independent uniform birthdays. Estimate the probability of at least one shared birthday and compare to the analytic answer (( \approx 0.507 )). Then repeat for room sizes ( n = 10, 20, 30, 50, 70, 100 ); plot the empirical curve. You should see the hockey-stick shape that makes hash collisions dangerous faster than intuition expects.

Engineering scenario -- UUID sizing. You are picking an ID space for a new event-bus. Partners tell you the volume is "a few billion events per year." Using the birthday rule of thumb, 64 bits (( m = 2^{64} \approx 1.8 \cdot 10^{19} )) gives a 50/50 collision point at ( n \approx 5 \cdot 10^9 ), uncomfortably close to the stated volume. Moving to 128 bits (UUIDv4's ( m = 2^{122} ) after removing reserved bits) pushes the 50/50 point to ( n \approx 2 \cdot 10^{18} ), comfortably safe for any planet-scale system. This is Concept 2 applied in a design review: pick the bit-length so that the birthday-collision curve stays far below any plausible volume, with headroom for a decade of growth.

Read This Only If Stuck

Introduction to Probability: How to count (Part 1)
Introduction to Probability: How to count (Part 2)
Introduction to Probability: How to count (Part 3)
Introduction to Probability: Story proofs
MCS: The Birthday Principle
MCS: Set Theory and Probability
Wikipedia: Inclusion-exclusion principle -- reference for the general union formula used when events overlap.
Wikipedia: Birthday problem -- full derivation, asymptotics, and engineering applications to hashing and cryptographic birthday attacks.
Blitzstein & Hwang, Introduction to Probability -- companion site -- free lecture videos and problem sets that mirror the Chapter 1 counting material referenced above.

What This Concept Is​

Why It Matters Here​

Concrete Examples​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​