Module 2: Sorting, Searching & Fundamental Structures: Case Studies
These cases focus on choosing the right structure for the access pattern, not memorizing isolated operations.
Case Study 1: Stable Sort for Customer Support Queues
Scenario: A support dashboard sorts tickets by priority, then by creation time, then by account tier. The first implementation performs separate sorts but loses creation-time order for tickets with equal priority.
Source anchor: Python Sorting HOWTO.
Module concepts:
- stable sorting
- comparison keys
- multi-key ordering
- correctness tests
Wrong Approach
Write a custom comparator with tangled conditional logic and no examples for equal-key behavior.
Better Approach
Use stable sorting or a composite key deliberately. Specify the ordering contract in examples: higher priority first, earlier creation within same priority, paid tier only as the final tie-breaker.
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| Custom comparator | Maximum control | Easy to violate transitivity |
| Composite key | Clear and fast | Requires careful key direction |
| Multiple stable passes | Easy to reason about | More passes over data |
Failure Mode
Equal-priority tickets reorder unpredictably, making support agents skip older tickets.
Required Artifact
Write an ordering contract with five ticket examples, expected sorted output, and tests for equal keys.
Project / Capstone Connection
Use this contract style for every sorted output in the algorithms project.
Case Study 2: Binary Search and the Off-by-One Incident
Scenario: A pricing service uses binary search to find the active pricing tier for a quantity. Customers buying exactly 100 units are charged the wrong tier because the boundary condition is inconsistent.
Source anchor: The module's binary-search invariant lesson.
Module concepts:
- binary search invariant
- boundary tests
- sorted precondition
- lower-bound and upper-bound semantics
Wrong Approach
Patch the comparison for 100 without naming what the search returns.
Better Approach
Define the invariant and return contract: first tier with minQuantity > quantity, last tier with minQuantity <= quantity, or insertion index. Test empty list, first tier, exact boundary, between tiers, and above last tier.
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| Ad hoc binary search | Small code | Boundary bugs |
| Library bisect/search helper | Tested primitive | Must map semantics correctly |
| Linear scan | Simple behavior | Too slow for large lists |
Failure Mode
The algorithm works for interior values but fails exactly at tier boundaries.
Required Artifact
Produce a binary-search invariant table with lo, hi, excluded region, candidate region, and expected result for boundary inputs.
Project / Capstone Connection
Attach this table to any project binary search or partition-point implementation.
Case Study 3: Queue, Stack, or Deque in a Rate Limiter
Scenario: An API gateway tracks recent requests per user for a sliding-window rate limit. The initial code stores timestamps in a list and removes expired entries from the front on every request.
Source anchor: Python Wiki time complexity.
Module concepts:
- queue operations
- amortized cost
- list vs deque
- access pattern analysis
Wrong Approach
Use a list because appending is fast and the data volume is "small enough."
Better Approach
Match the structure to the operation pattern: append new timestamp at the back, pop expired timestamps from the front, read length. A deque fits this queue behavior; a list shifts elements on front removal.
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| List | Familiar and indexable | Front removals are costly |
| Deque | Efficient front/back operations | Random access is weaker |
| Ring buffer | Predictable memory | More implementation work |
Failure Mode
Traffic spikes create many front removals and the gateway spends time shifting arrays.
Required Artifact
Create an operation-frequency table and justify the chosen data structure with expected operation costs.
Project / Capstone Connection
Use this table before selecting structures for queues, stacks, caches, and sliding windows.
Case Study 4: Hash Table Lookup With Bad Keys
Scenario: A deduplication service stores processed event IDs in a set. Later, events are deduplicated by a mutable object containing id, source, and receivedAt; duplicate detection becomes unreliable.
Source anchor: Python Wiki time complexity, used for hash-table operation expectations.
Module concepts:
- hash table lookup
- equality and hashing
- immutable keys
- average vs worst case
Wrong Approach
Assume set membership is correct because it is normally fast.
Better Approach
Define the identity key explicitly and make it immutable: for example, (source, id). Treat hashing as a contract between equality, immutability, and lookup. Add tests for duplicate IDs, different sources, and late-arriving events.
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| Store full mutable event | Convenient | Membership can become wrong |
| Store immutable key tuple | Reliable lookup | Must define identity carefully |
| Database unique index | Strong persistence guarantee | Adds write-path cost |
Failure Mode
An object changes after insertion, so the set can no longer find the entry reliably.
Required Artifact
Write a key-design note with equality fields, excluded fields, mutability rule, and duplicate-detection tests.
Project / Capstone Connection
Use this note for hash maps, sets, memoization keys, and caches in the project.
Source Map
| Source | Use it for |
|---|---|
| Python Sorting HOWTO | Stable sort and key-function behavior. |
| Python Wiki time complexity | Operation-cost expectations for built-in structures. |