Graph Representation and Traversal Lab
Retrieval Prompts
- State the formal definition of a graph
G = (V, E), and name the four model dimensions you must commit to before choosing an algorithm. - Write the time and space cost of adjacency list and adjacency matrix for graphs with
|V|vertices and|E|edges. - State BFS and DFS from memory: which data structure holds the frontier, which output does each produce?
- Name the four DFS edge types on a directed graph and the criterion that picks each.
- State one algorithm to compute strongly connected components and say, in one sentence, why reversing edges is part of it.
- Give two equivalent characterizations of "this directed graph is a DAG."
Compare and Distinguish
Separate these pairs clearly in your own words:
- adjacency list versus adjacency matrix (when is each preferable, and why)
- BFS versus DFS (what each tells you that the other cannot)
- connected components (undirected) versus strongly connected components (directed)
- weakly connected versus strongly connected (directed graphs only)
- Kahn's algorithm versus DFS-based topological sort
- tree edges versus back edges in DFS (what each implies about the graph)
Common Mistake Check
For each statement, identify the error:
- "I'll use BFS on a weighted graph because I want the shortest path."
- "Two vertices are in the same SCC if one can reach the other."
- "A directed graph is acyclic iff its undirected version is acyclic."
- "Adjacency matrix is faster because memory lookups are
O(1)." - "DFS always explores children left-to-right; that is part of the algorithm."
- "Kahn's algorithm outputs the unique topological order of a DAG."
Mini Application
Do all four tasks for each scenario below:
- state the graph model explicitly (
V, E, directed?, weighted?, simple?) - choose a representation and justify with a density estimate
- pick BFS, DFS, Kosaraju, or Kahn, and justify by target question
- state the expected running time
Scenarios:
- In a monorepo with
10^4source files and10^5include relationships, produce a build order. - In a 2D grid of
1000 x 1000cells where each cell may be empty or blocked, count the number of connected empty regions. - In a directed graph of
10^6Twitter-style "follow" edges, find the mutually-following communities. - In the friendship graph of
10^4people, compute the shortest number of handshakes between two named users.
Implementation Drill
Pick one of the scenarios above and implement it end to end:
- a minimal
Graphclass with the representation you chose - a BFS or DFS that produces both the distance array and the parent array
- a test that verifies your implementation against a brute-force baseline on
|V| <= 10
Write 10-15 lines of analysis:
- what the observed time looked like versus the predicted
O(|V| + |E|) - where your implementation allocated memory and whether that scaled as predicted
- any specific bug you made and how you caught it
Evidence Check
This practice page is complete only if you can:
- draw a small directed graph on paper, run DFS from a chosen start vertex, and correctly classify every edge as tree/back/forward/cross
- state one case where adjacency matrix is the right choice, with numbers that justify it
- write pseudocode for Kahn's algorithm from memory and explain how it detects cycles
- reduce at least one non-graph-looking problem from the scenarios list into an explicit graph instance
Integrated Traversal Katas
Implement all three tasks with both adjacency-list and adjacency-matrix representations when the input size is small enough to compare.
- Reachability: return whether
targetis reachable fromstartusing BFS. - Connected components: label each vertex in an undirected graph by component id.
- Path reconstruction: return the actual shortest unweighted path from
starttotarget.
Evidence check:
- use a queue for BFS and a stack or recursion for DFS
- include a graph with a cycle, a disconnected graph, and a single-vertex graph
- write one paragraph explaining why BFS gives shortest unweighted paths and DFS does not promise that