DFS and the Edge Taxonomy
What This Concept Is
Depth-first search explores a graph by going as deep as possible before backtracking. For each vertex it records two timestamps:
d[u]- the discovery time, whenuis first pushed on the DFS stackf[u]- the finish time, whenu's recursion returns
These timestamps produce a parenthesis structure: for any two vertices u and v, the intervals [d[u], f[u]] and [d[v], f[v]] are either nested or disjoint - never crossing. That single structural fact is the engine behind topological sort, SCC decomposition, bridge and articulation-point detection, and dominator trees.
On a directed graph, DFS classifies each edge (u, v) into exactly one of:
- Tree edge -
vwas undiscovered when we explored(u, v). - Back edge -
vis an ancestor ofuin the DFS tree. Indicates a cycle. - Forward edge -
vis a non-child descendant ofu. - Cross edge -
vlies in a separate subtree finished earlier (f[v] < d[u]).
On an undirected graph there are only tree and back edges, because "forward" and "cross" would require a direction the graph does not provide.
The distinction between DFS and BFS is structural, not cosmetic. DFS visits one branch exhaustively before moving on; BFS visits shallowly before deeply. DFS produces finish times and a parenthesis structure; BFS produces levels and shortest-path distances. Knowing which you need is step zero.
Why It Matters Here
Timestamps and edge types are the core of many graph algorithms:
- Cycle detection. A directed graph has a cycle iff DFS finds a back edge.
- Topological sort. On a DAG, vertices in decreasing
f[u]order form a valid topological order. - Strongly connected components. Kosaraju and Tarjan both depend on finish times and the low-link structure DFS produces.
- Bridges and articulation points. Computed from DFS tree plus the
lowfunction over back edges. - Biconnected components, 2-edge-connected components. All DFS-derived.
- Dominator tree (used in SSA form). Built on DFS preorder.
DFS is not a prettier traversal - it is a source of structural information BFS does not provide.
Concrete Example
Pseudocode with global timer:
DFS(G):
time := 0
for each u in V: color[u] := WHITE
for each u in V:
if color[u] = WHITE: DFS-VISIT(u)
DFS-VISIT(u):
color[u] := GRAY
time := time + 1
d[u] := time
for each v in adj[u]:
if color[v] = WHITE:
parent[v] := u
DFS-VISIT(v)
color[u] := BLACK
time := time + 1
f[u] := time
On the directed graph
starting DFS at a:
| step | event | d | f | classification |
|---|---|---|---|---|
| 1 | discover a | d[a]=1 | ||
| 2 | traverse a -> b | d[b]=2 | tree | |
| 3 | traverse b -> c | d[c]=3 | tree | |
| 4 | look at c -> a | back (a is GRAY ancestor) | ||
| 5 | finish c | f[c]=4 | ||
| 6 | finish b | f[b]=5 | ||
| 7 | traverse a -> d | d[d]=6 | tree | |
| 8 | finish d | f[d]=7 | ||
| 9 | finish a | f[a]=8 |
The back edge c -> a exposes the cycle a -> b -> c -> a. No forward or cross edges here.
Common Confusion / Misconceptions
- "DFS edge types are a property of the graph." They are DFS-tree-dependent. Two different DFS starts can classify the same edge differently (forward vs cross). What is invariant is the existence of some back edge iff the directed graph has a cycle.
- "Recursion is mandatory." Iterative DFS with an explicit stack works fine; the only care is pushing children in reverse if you want to match recursive order.
- "DFS discovers shortest paths." No, it does not. DFS can return a long path to a vertex that BFS would reach in one edge. Use DFS for structural info, BFS for distances.
- "Grayness is just bookkeeping." The GRAY color (vertex in progress) is exactly what distinguishes a back edge from a forward or cross edge. Three colors, not two, is the correct discipline.
How To Use It
Default DFS recipe:
- color every vertex WHITE
- for every WHITE vertex, call
DFS-VISIT - during
DFS-VISIT, color it GRAY, recordd[u], recurse into unvisited neighbors, then color BLACK and recordf[u] - classify edges on the fly using color: WHITE target = tree, GRAY = back, BLACK with
d[u] < d[v]= forward, else cross
Use DFS when you need any of: cycle detection, topological sort, SCCs, articulation points, bridges, biconnectivity, or the tree-edge structure itself.
Two implementation notes:
- For deep graphs (
|V| > 10^5) increase your stack size or convert to iterative DFS. Default stack in Python is 1000 frames. - Store
low[u] = min(d[u], d[v] for back edges) over descendantsalongside the DFS; it supports Tarjan's SCC, bridges, and articulation points with no extra pass.
Transfer / Where This Shows Up Later
DFS-derived information shows up almost everywhere structured analysis is needed.
- S3 (systems). Module load order (topological sort via DFS), dynamic-library dependency cycles, garbage-collection mark phase are all DFS in the runtime.
- S4 (compilers). CFG analyses: dominator tree via DFS; loop detection via back edges; irreducibility detection; SSA construction uses the DFS spanning tree.
- S5 (OS). Deadlock detection on the wait-for graph is DFS cycle detection; resource allocation graphs use DFS.
- S6 (databases). DFS over dependency graphs for cascade deletes; query-plan validation uses DFS.
- S7 (DDD). DFS of context maps finds cyclic dependencies fitness functions should flag.
- S8 (distributed). Leader election via DFS spanning trees; flooding-DFS for membership view construction.
Back-link to S1 M2 where recursion is introduced as a proof technique and to S2 M2 where the stack data structure is formalized.
Check Yourself
- Why does the color scheme (WHITE/GRAY/BLACK) give you the edge classification for free?
- What is the parenthesis theorem saying, in one sentence?
- In an undirected graph, why are there only tree and back edges?
- State one corollary of "a directed graph has a cycle iff DFS finds a back edge."
- Why do two different DFS runs from different starting vertices produce different classifications of the same edges, yet agree on whether the graph has a cycle?
- How do you modify DFS to compute
low[u]and detect bridges in one pass?
Mini Drill or Application
On the directed graph
1 -> 2, 1 -> 4
2 -> 3
3 -> 1, 3 -> 5
4 -> 3
5 -> 6
6 -> 4
run DFS starting at vertex 1, record d and f, and classify every edge. Identify any cycles exposed by back edges.
Then implement iterative DFS using an explicit stack and compare its behavior with recursive DFS - the order of neighbor exploration should match if you push in reverse.
Implementation drill. On top of your Graph class, write:
has_cycle(g)using back-edge detectiondfs_times(g)returning(d, f)arraysclassify_edges(g)returning a map(u, v) -> {tree, back, forward, cross}find_bridges(g)for undirected graphs usinglow[u]
Verify each on a hand-drawn 6-vertex graph before running on random inputs.
Read This Only If Stuck
- CLRS 20.3: Depth-First Search
- CLRS 20.3: Depth-First Search (Part 2)
- CLRS 20.3: Depth-First Search (Part 3)
- ADM 5.8: Depth-First Search
- ADM 5.9: Applications of DFS
- ADM 5.10: DFS on Directed Graphs
- Sedgewick: Elementary Graph Algorithms (Part 3)
- Competitive Programming 4.2.7: Edge property check via DFS spanning tree
- Competitive Programming 4.2.8: Articulation points and bridges
- cp-algorithms: Depth-First Search family