Dijkstra's Algorithm

What This Concept Is

Dijkstra solves single-source shortest paths on a graph with nonnegative edge weights. It maintains a tentative distance array dist[] and a set S of vertices whose distances are already final, and grows S one vertex at a time:

for each v in V: dist[v] := +infinity
dist[s] := 0
Q := priority queue of (dist[v], v) for all v
while Q is not empty:
    u := Q.extract-min()
    for each edge (u, v, w) in adj[u]:
        if dist[u] + w < dist[v]:
            dist[v] := dist[u] + w
            parent[v] := u
            Q.decrease-key(v, dist[v])

The central operation is relaxation: for edge (u, v, w), if going through u improves dist[v], update it.

Dijkstra's structural shape is "greedy BFS with a priority queue" - pick the closest unfinalized vertex, finalize it, relax its out-edges. Contrast with BFS (FIFO, unit weights), Bellman-Ford (relax every edge |V|-1 times, tolerates negative weights), and Prim (same frontier shape but key[v] stores "best edge to tree" instead of "best distance from source").

Why It Matters Here

Dijkstra is the default shortest-path algorithm whenever weights are nonnegative. Its running time depends on the priority queue:

Binary heap. O((|V| + |E|) log |V|) - the usual production choice.
Fibonacci heap. O(|E| + |V| log |V|) - theoretically optimal, rarely implemented.
Array-based PQ. O(|V|^2) - best on dense graphs where |E| ~ |V|^2.
Indexed binary heap / decrease-key heap. Matches binary heap asymptotically but with lower memory overhead than lazy deletion.

On sparse graphs, the binary-heap bound beats Bellman-Ford's O(|V||E|) by a large factor. That is why Dijkstra owns routing, navigation, and shortest-path-in-weighted-graph interview questions.

Concrete Example

Graph:

Trace:

step	extract	`dist[s]`	`dist[a]`	`dist[b]`	`dist[t]`	notes
init	-	`0`	`inf`	`inf`	`inf`
1	`s`	`0`	`4`	`1`	`inf`	relax `s->a`, `s->b`
2	`b`	`0`	`3`	`1`	`6`	relax `b->a` (4->3), `b->t`
3	`a`	`0`	`3`	`1`	`6`	`a->t` tied (3+3=6), no change
4	`t`	`0`	`3`	`1`	`6`	done

Final dist[t] = 6 via path s -> b -> a -> t. The shortest-path tree is s -> b -> a -> t, and parent[t] = a, parent[a] = b, parent[b] = s.

Common Confusion / Misconceptions

"Dijkstra handles negative edges if there is no negative cycle." No. The invariant when u is extracted, dist[u] is final relies on nonnegativity. A single negative edge can cause a vertex already extracted to be reachable at a lower cost later, and Dijkstra will not revisit it.
"decrease-key is free." In a standard binary heap, decrease-key requires locating the entry first (linear scan) or maintaining an index alongside the heap. The "lazy deletion" trick - push a new (dist, v) entry and skip stale pops - keeps the log factor at the cost of extra memory.
"A is a different algorithm."* A* is Dijkstra with a consistent heuristic baked into the key. Dijkstra is A* with heuristic h = 0.
"Visited = finalized." Be careful with the term "visited." In the lazy-deletion variant, a vertex can be pushed multiple times; only the first pop is the finalization.
"Priority queue = heap." Any heap-like structure works, and for dense graphs an unordered array is actually faster per-operation. Pick based on density.

How To Use It

Use the binary-heap variant as your default:

Represent Q as a binary heap of (distance, vertex) pairs.
On relaxation, push the new pair rather than decrease-key; ignore stale pops.
Track parent[v] alongside dist[v] to reconstruct the shortest path.
Early-exit when t is extracted if you only need the single pair s -> t.

Implementation skeleton (lazy deletion):

heap := [(0, s)]; dist[s] := 0
while heap not empty:
    (d, u) := heap.extract-min()
    if d > dist[u]: continue          # stale
    for each (u, v, w) in adj[u]:
        if dist[u] + w < dist[v]:
            dist[v] := dist[u] + w
            parent[v] := u
            heap.push((dist[v], v))

Use Dijkstra when and only when:

all edge weights are >= 0, and
you need single-source (or single-pair) shortest paths, and
you need additive path cost.

Transfer / Where This Shows Up Later

S3 (systems). OS network stack routing, traceroute-like utilities, and shortest-path routing in overlays.
S4 (compilers). Optimum decomposition of a DAG along a cost metric uses Dijkstra or its longest-path dual.
S5 (OS). Priority-queue-driven scheduling is Dijkstra's extract-min applied to task priorities with dynamic key updates.
S6 (databases). Graph databases expose dijkstraShortestPath queries; query optimizers use Dijkstra over plan graphs with estimated costs as weights.
S7 (DDD, architecture). Failure-propagation analysis on a service graph uses Dijkstra with expected-latency weights; A* is Dijkstra on a topology with a heuristic.
S8 (distributed). OSPF's link-state routing is Dijkstra run by every router on a synchronized view of the network.

Back-link to M4 greedy algorithms (Dijkstra is the textbook illustration of the greedy-choice proof) and to the "shortest path variants" concept (09) for when Dijkstra is the wrong choice.

Check Yourself

State the invariant that makes Dijkstra correct and show where nonnegativity is used.
Why does a binary heap give O((|V| + |E|) log |V|) and a Fibonacci heap O(|E| + |V| log |V|)?
What is the lazy-deletion trick for a binary-heap Dijkstra?
For a dense graph with |E| = Theta(|V|^2), which PQ beats the heap variant?
Why does the shortest-path tree have |V|-1 edges (or fewer if not all vertices reachable)?
Explain why A* reduces to Dijkstra when the heuristic is zero, and why consistency of the heuristic matters.

Mini Drill or Application

On the graph

-(7)-> 2
-(9)-> 3
-(14)-> 6
-(10)-> 3
-(15)-> 4
-(11)-> 4
-(2)-> 6
-(6)-> 5
-(9)-> 5

run Dijkstra from vertex 1. Record the order of vertex extractions and final dist[].

Implementation drill.

Implement binary-heap Dijkstra with lazy deletion in your language of choice.
Run it on a grid graph of 1000 x 1000 with random [1, 10] weights; time it.
Add path reconstruction via parent[].
Extend to A* with an admissible Euclidean-distance heuristic on the grid; compare node-expansion counts.
Re-run on a graph with one negative edge; observe that Dijkstra silently returns the wrong answer (this is the teaching moment of why Bellman-Ford exists).

What This Concept Is​

Why It Matters Here​

Concrete Example​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​