BFS and Unweighted Shortest Paths

What This Concept Is

Breadth-first search explores a graph from a source vertex s in layers:

layer 0 is {s}
layer k+1 is the set of unvisited neighbors of vertices in layer k

A FIFO queue maintains the frontier. Each vertex is dequeued once; each outgoing edge is examined once. Total work is O(|V| + |E|).

BFS is not just a traversal. On an unweighted graph, the layer containing v is exactly the shortest path distance d(s, v) measured in number of edges. The proof is a two-line induction: when a vertex is first enqueued, its dist is optimal, because any hypothetical shorter path would have enqueued a predecessor earlier and thus enqueued v earlier.

Distinguish BFS sharply from DFS:

BFS uses a FIFO queue, explores layer by layer, and gives shortest-path distances on unweighted graphs. Good for reachability and for distances in hops.
DFS uses a LIFO stack (or recursion), explores as deep as possible first, and gives finish-time and edge-classification information. Good for cycle detection, topological sort, SCCs, bridges.

Swapping the queue for a stack breaks the shortest-path guarantee. BFS and DFS are the same code with one line changed, and they solve utterly different problems.

Why It Matters Here

BFS is the baseline shortest-path algorithm. Before learning Dijkstra or Bellman-Ford, you must be able to say: "this is an unweighted problem, so BFS answers it."

Applications that are really BFS in disguise:

shortest number of hops in a friendship graph (Erdos number, degrees of Kevin Bacon)
shortest number of moves in a puzzle (Rubik's cube, sliding puzzle, word ladder)
reachability from a single source
bipartite-ness test (by coloring layers alternately)
connected components (by running BFS from each unvisited vertex)
0-1 BFS: when weights are {0, 1}, a deque-based BFS still runs in O(|V| + |E|)
Multi-source BFS: push every source into the queue at distance 0; you get distance to the nearest source in O(|V| + |E|)

Concrete Example

Pseudocode, with dist initialized to +infinity except dist[s] = 0:

BFS(G, s):
    Q := queue containing s
    dist[s] := 0
    while Q is not empty:
        u := Q.dequeue()
        for each v in adj[u]:
            if dist[v] = +infinity:
                dist[v] := dist[u] + 1
                parent[v] := u
                Q.enqueue(v)

On the directed graph

BFS from s produces the following trace:

Step	Dequeue	Queue after	Discoveries	`dist` snapshot
init	-	`[s]`	`s:0`	`s=0`
1	`s`	`[a, b]`	`a:1, b:1`	`s=0, a=1, b=1`
2	`a`	`[b, c]`	`c:2` (parent `a`)	`..., c=2`
3	`b`	`[c, d]`	`d:2` (parent `b`)	`..., d=2`
4	`c`	`[d]`	`d` already set - skip	unchanged
5	`d`	`[]`	none	terminal

Final parent[d] = b, so the shortest s -> d path is s -> b -> d.

Common Confusion / Misconceptions

"BFS works on weighted graphs." No. On the graph s -(1)-> a -(1)-> t and s -(10)-> t, BFS might return s -> t at "layer 1" with cost 10, while the weighted shortest path is s -> a -> t with cost 2. BFS counts edges, not weights. For {0, 1} weights, use 0-1 BFS with a deque.
"The queue must hold full paths." No. Store only vertices; reconstruct paths from parent[] at the end.
"BFS equals iterative DFS." They share structure but differ in the one data structure choice. FIFO gives shortest paths; LIFO does not.
"Re-enqueueing a discovered vertex is fine." It is wasteful and, worse, breaks the "first discovery = shortest" invariant if you update dist on the re-enqueue. Guard with the dist[v] = +infinity check before enqueuing.

How To Use It

Default BFS recipe:

initialize dist[v] = +infinity for all v, except dist[s] = 0
push s into a FIFO queue
pop a vertex, relax its unvisited neighbors by setting dist[v] := dist[u] + 1, push them
continue until the queue is empty

Use BFS when:

the graph is unweighted, or
the graph has 0/1 weights (then use a deque: push 0-weight neighbors to front, 1-weight to back), or
you need multi-source shortest distances (push all sources at once), or
you want the BFS tree as a by-product (parents for path reconstruction).

Transfer / Where This Shows Up Later

BFS is embedded in downstream infrastructure far more than its simplicity suggests.

S3 (systems). File-system find with depth limits is BFS; package-manager "what is the shortest dependency chain to X" is BFS on a DAG.
S4 (compilers). Reaching-definitions and liveness analyses use worklist BFS; dominator computations use BFS-like iterative fixed-point.
S5 (OS). Page-cache eviction simulations and process tree walks use BFS.
S6 (databases). BFS over foreign-key graphs answers "which tables are within N hops of this one?"; replication lag propagation simulations use multi-source BFS.
S7 (architecture, DDD). BFS over context maps tells you the blast radius of a change in N hops.
S8 (distributed). Gossip and flooding protocols simulate BFS; the BFS tree is the spanning tree used for broadcast.

Back-link to S1 M2 for the definitions of path length and to S2 M2 for the queue data structure.

Check Yourself

Why does BFS need a FIFO queue instead of a stack, and what happens if you swap?
Prove in one sentence why the first time BFS reaches v, the current dist[v] is optimal.
What is the space cost of BFS in the worst case, and when does it approach O(|V|)?
How do you detect bipartiteness with BFS in O(|V| + |E|)?
How does 0-1 BFS differ from regular BFS, and why does it still achieve O(|V| + |E|)?
Give a case where multi-source BFS is cleaner than |S| separate single-source BFS runs.

Mini Drill or Application

On the directed graph

s -> a, s -> b
a -> c, b -> c
c -> d
b -> d

trace BFS from s: report the discovery order, the layer of each vertex, and parent[d].

Then implement BFS from scratch and use it to:

count connected components in an undirected graph
check whether a graph is bipartite
solve the word-ladder problem on a small dictionary (vertex = word, edge = differs by one letter)
on an unweighted grid with obstacles, find the shortest path from top-left to bottom-right

Time your implementation on a random sparse graph with 10^5 vertices and confirm linear scaling. Then add 0-1 BFS as a deque variant and verify it on a graph with mixed {0, 1} weights.

What This Concept Is​

Why It Matters Here​

Concrete Example​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​