Skip to main content

Graph Representation and Traversal Lab

Retrieval Prompts

  1. State the formal definition of a graph G = (V, E), and name the four model dimensions you must commit to before choosing an algorithm.
  2. Write the time and space cost of adjacency list and adjacency matrix for graphs with |V| vertices and |E| edges.
  3. State BFS and DFS from memory: which data structure holds the frontier, which output does each produce?
  4. Name the four DFS edge types on a directed graph and the criterion that picks each.
  5. State one algorithm to compute strongly connected components and say, in one sentence, why reversing edges is part of it.
  6. Give two equivalent characterizations of "this directed graph is a DAG."

Compare and Distinguish

Separate these pairs clearly in your own words:

  • adjacency list versus adjacency matrix (when is each preferable, and why)
  • BFS versus DFS (what each tells you that the other cannot)
  • connected components (undirected) versus strongly connected components (directed)
  • weakly connected versus strongly connected (directed graphs only)
  • Kahn's algorithm versus DFS-based topological sort
  • tree edges versus back edges in DFS (what each implies about the graph)

Common Mistake Check

For each statement, identify the error:

  1. "I'll use BFS on a weighted graph because I want the shortest path."
  2. "Two vertices are in the same SCC if one can reach the other."
  3. "A directed graph is acyclic iff its undirected version is acyclic."
  4. "Adjacency matrix is faster because memory lookups are O(1)."
  5. "DFS always explores children left-to-right; that is part of the algorithm."
  6. "Kahn's algorithm outputs the unique topological order of a DAG."

Mini Application

Do all four tasks for each scenario below:

  1. state the graph model explicitly (V, E, directed?, weighted?, simple?)
  2. choose a representation and justify with a density estimate
  3. pick BFS, DFS, Kosaraju, or Kahn, and justify by target question
  4. state the expected running time

Scenarios:

  1. In a monorepo with 10^4 source files and 10^5 include relationships, produce a build order.
  2. In a 2D grid of 1000 x 1000 cells where each cell may be empty or blocked, count the number of connected empty regions.
  3. In a directed graph of 10^6 Twitter-style "follow" edges, find the mutually-following communities.
  4. In the friendship graph of 10^4 people, compute the shortest number of handshakes between two named users.

Implementation Drill

Pick one of the scenarios above and implement it end to end:

  • a minimal Graph class with the representation you chose
  • a BFS or DFS that produces both the distance array and the parent array
  • a test that verifies your implementation against a brute-force baseline on |V| <= 10

Write 10-15 lines of analysis:

  • what the observed time looked like versus the predicted O(|V| + |E|)
  • where your implementation allocated memory and whether that scaled as predicted
  • any specific bug you made and how you caught it

Evidence Check

This practice page is complete only if you can:

  • draw a small directed graph on paper, run DFS from a chosen start vertex, and correctly classify every edge as tree/back/forward/cross
  • state one case where adjacency matrix is the right choice, with numbers that justify it
  • write pseudocode for Kahn's algorithm from memory and explain how it detects cycles
  • reduce at least one non-graph-looking problem from the scenarios list into an explicit graph instance

Integrated Traversal Katas

Implement all three tasks with both adjacency-list and adjacency-matrix representations when the input size is small enough to compare.

  1. Reachability: return whether target is reachable from start using BFS.
  2. Connected components: label each vertex in an undirected graph by component id.
  3. Path reconstruction: return the actual shortest unweighted path from start to target.

Evidence check:

  • use a queue for BFS and a stack or recursion for DFS
  • include a graph with a cycle, a disconnected graph, and a single-vertex graph
  • write one paragraph explaining why BFS gives shortest unweighted paths and DFS does not promise that