Module 5: Advanced Data Structures & Algorithmic Tradeoffs: Case Studies

Advanced structures earn their place when a workload needs their guarantees. These cases force the choice through operations, invariants, and failure modes.

Case Study 1: Priority Queue for Job Scheduling

Scenario: A worker service runs background jobs by priority and retry time. The first implementation scans the full list of pending jobs every second to find the next job.

Source anchor: Python heapq documentation.

Module concepts:

heaps
priority queues
comparator keys
lazy deletion

Wrong Approach

Keep a list because scanning is straightforward and the queue is usually small.

Better Approach

Use a heap keyed by (runAt, priority, sequenceNumber) so fetching the next runnable job is efficient and stable for ties. Handle cancellation with lazy deletion or an index map, and test tie-breaking explicitly.

Tradeoff Table

Choice	Gain	Cost
Linear scan	Simple code	Cost grows with pending jobs
Heap	Efficient next-job selection	Update/delete needs care
Indexed priority queue	Better updates	More implementation complexity

Failure Mode

At peak traffic, every worker repeatedly scans thousands of delayed jobs.

Required Artifact

Create an operation table for insert, peek, pop, cancel, reprioritize, and explain how the chosen structure handles each one.

Project / Capstone Connection

Use this table before introducing any advanced structure in the project.

Case Study 2: Balanced Tree for Range Queries

Scenario: An analytics feature must answer "show all orders between two timestamps" and also insert late-arriving orders. A hash map by order ID makes exact lookup fast but does not support ordered range scans.

Source anchor: MIT OpenCourseWare balanced binary search trees lecture.

Module concepts:

balanced binary search trees
ordered maps
range queries
invariant maintenance

Wrong Approach

Store everything in a hash map and sort all timestamps whenever a range query arrives.

Better Approach

Use an ordered structure such as a balanced tree or database index when range queries are core operations. Maintain ordering on insert, then answer ranges by seeking to the lower bound and scanning until the upper bound.

Tradeoff Table

Choice	Gain	Cost
Hash map	Fast exact lookup	Poor ordered scans
Sort on demand	Easy to add	Repeats sorting work
Balanced tree/index	Efficient ordered operations	More complex invariants

Failure Mode

Range dashboards slow down as data grows because every request performs a fresh full sort.

Required Artifact

Write an access-pattern matrix comparing exact lookup, insert, delete, range scan, and ordered iteration.

Project / Capstone Connection

Use this matrix for choosing between arrays, maps, heaps, trees, and indexes.

Case Study 3: Database Index as a Data Structure Choice

Scenario: A product search endpoint filters by category, price, and created_at. Developers add application-side filtering after loading thousands of rows because the database query seems simpler.

Source anchor: SQLite query planner documentation.

Module concepts:

indexes as data structures
selectivity
composite index order
query-plan inspection

Wrong Approach

Assume database performance is outside algorithm analysis.

Better Approach

Treat the index as a search structure maintained by the database. Inspect the query plan, choose indexes based on equality filters, range filters, and ordering needs, then benchmark representative queries.

Tradeoff Table

Choice	Gain	Cost
App-side filtering	Simple query	Moves work and data unnecessarily
Single-column indexes	Easy to add	May not match compound query
Composite index	Fits workload	Extra write and storage cost

Failure Mode

The endpoint works in tests but times out when a category contains millions of products.

Required Artifact

Produce a query-plan note with workload, proposed index, expected seek/scan behavior, and before/after measurements.

Project / Capstone Connection

Use this note when explaining real-world uses of trees, hash indexes, and ordered search.

Case Study 4: Trie for Autocomplete

Scenario: A command palette autocompletes command names. The first version scans every command for every keystroke. It works with 200 commands but becomes sluggish with thousands of plugin commands.

Source anchor: The module's tries and prefix-search concepts.

Module concepts:

trie
prefix search
memory tradeoffs
ranking layer separation

Wrong Approach

Optimize the full scan with micro-tweaks before changing the access pattern.

Better Approach

Use a trie or prefix index for candidate retrieval, then apply ranking separately. Keep payloads small at trie nodes, define update behavior for plugin install/uninstall, and benchmark memory as well as latency.

Tradeoff Table

Choice	Gain	Cost
Full scan	Simple and flexible	Work grows with all commands
Trie	Fast prefix lookup	Higher memory and update complexity
Search index	Rich ranking	More infrastructure

Failure Mode

Autocomplete latency grows linearly with total commands instead of matching prefix selectivity.

Required Artifact

Create a prefix workload table with command count, average prefix length, memory estimate, lookup latency, and update requirements.

Project / Capstone Connection

Use this workload table for any prefix, dictionary, or autocomplete feature.

Source Map

Source	Use it for
Python heapq documentation	Heap and priority-queue behavior.
MIT OpenCourseWare balanced binary search trees lecture	Balanced tree invariants and ordered operation reasoning.
SQLite query planner documentation	Database indexes as practical search structures.

Case Study 1: Priority Queue for Job Scheduling​

Wrong Approach​

Better Approach​

Tradeoff Table​

Failure Mode​

Required Artifact​

Project / Capstone Connection​

Case Study 2: Balanced Tree for Range Queries​

Wrong Approach​

Better Approach​

Tradeoff Table​

Failure Mode​

Required Artifact​

Project / Capstone Connection​

Case Study 3: Database Index as a Data Structure Choice​

Wrong Approach​

Better Approach​

Tradeoff Table​

Failure Mode​

Required Artifact​

Project / Capstone Connection​

Case Study 4: Trie for Autocomplete​

Wrong Approach​

Better Approach​

Tradeoff Table​

Failure Mode​

Required Artifact​

Project / Capstone Connection​

Source Map​

Case Study 1: Priority Queue for Job Scheduling

Wrong Approach

Better Approach

Tradeoff Table

Failure Mode

Required Artifact

Project / Capstone Connection

Case Study 2: Balanced Tree for Range Queries

Wrong Approach

Better Approach

Tradeoff Table

Failure Mode

Required Artifact

Project / Capstone Connection

Case Study 3: Database Index as a Data Structure Choice

Wrong Approach

Better Approach

Tradeoff Table

Failure Mode

Required Artifact

Project / Capstone Connection

Case Study 4: Trie for Autocomplete

Wrong Approach

Better Approach

Tradeoff Table

Failure Mode

Required Artifact

Project / Capstone Connection

Source Map