Adjacency List vs Adjacency Matrix
What This Concept Is
A graph G = (V, E) can be stored in two standard ways, plus a few hybrids you should recognize:
- Adjacency list. For each vertex
u, keep a container (array, linked list, or hash set) of its neighbors. Total size isTheta(|V| + |E|). - Adjacency matrix. A
|V| x |V|arrayAwithA[u][v] = 1(or the edge weight) if(u, v)is an edge, else0(or+infinity). Total size isTheta(|V|^2). - Edge list. A plain array of
(u, v, w)triples. Great for Kruskal and Bellman-Ford; terrible for neighbor lookup. - CSR (compressed sparse row). Two flat arrays -
offsets[|V| + 1]andtargets[|E|]- giving cache-friendly sequential access. Used in high-performance graph libraries (Boost.Graph, NetworkX's SciPy backend, GAP).
They are informationally equivalent. They are not operationally equivalent, and the choice silently fixes the running time of every algorithm on top.
Why It Matters Here
The representation choice silently determines the running time of everything that follows.
- BFS and DFS visit each vertex and each edge once. With a list, they run in
O(|V| + |E|). With a matrix, they run inO(|V|^2)because finding neighbors ofuforces you to scan a full row. - Dijkstra with a binary heap is
O((|V| + |E|) log |V|)on a list, but a matrix version costs at leastO(|V|^2)from the neighbor scan alone. - Checking "is
(u, v)an edge?" isO(1)on a matrix, butO(deg(u))on a list. - Floyd-Warshall is defined in terms of a matrix - the inner
dist[i][k] + dist[k][j]access is what makes itO(|V|^3)with a tight constant.
So the rule is not "lists are better." The rule is: match the representation to the density of the graph and the operations the algorithm needs.
Concrete Example
A social graph on |V| = 10^6 users where each user has about 200 friends.
|E|is about10^8.- Adjacency list:
Theta(|V| + |E|) ~= 10^8edge cells. Feasible. - Adjacency matrix:
Theta(|V|^2) = 10^{12}cells. Infeasible, regardless of algorithm.
By contrast, a 500-vertex dense graph where every pair is connected has |E| ~ 125{,}000. The matrix is 250{,}000 cells - smaller than the list representation with its per-entry overhead. Here the matrix is the right choice, and algorithms like Floyd-Warshall are written as if the matrix is the only input.
A pocket example to trace:
| Representation | Storage for this 4-node graph | has_edge(1, 3) cost | neighbors(1) cost |
|---|---|---|---|
| Adjacency list | 5 edge entries plus 4 vertex headers | O(deg(1)) = 2 | O(deg(1)) = 2 |
| Adjacency matrix | 16 cells (4 x 4) | O(1) | `O(4) = O( |
| Edge list | 5 triples | `O( | E |
Tiny graphs make the constants obvious, but the |V|-vs-|V|^2 scaling is where the real choice is made.
Common Confusion / Misconceptions
- "Many edges = dense graph." Density ratio
|E| / |V|^2, not raw edge count, decides.10^8edges on10^6vertices is still extremely sparse (ratio10^{-4}). - "Matrices are wasteful." Not if the graph is actually dense; the matrix row scan is cache-friendly and the code is short. Floyd-Warshall tightens to a few CPU cycles per iteration on a matrix.
- "Hash-set adjacency is always best." Hash sets give
O(1)has_edge, but they cost more memory and cache misses than a sorted list. Use them only whenhas_edgeis the inner loop. - "I have to pick one." Many production systems maintain both: a CSR for traversal, plus a bitset or hash for edge-existence. The cost of duplication is often less than the cost of a wrong asymptotic.
How To Use It
Pick a representation by asking three questions:
- What is the density
|E| / |V|^2? If it is near1, prefer matrix. If it is small, prefer list. - Which operation dominates? Edge-existence queries favor matrices or hash sets; neighbor enumeration favors lists.
- What are the memory constraints?
|V|^2must actually fit.
As a default in this module: use adjacency lists unless the algorithm (Floyd-Warshall, transitive closure, matrix-multiplication shortest paths) explicitly demands a matrix.
Engineering checklist when you implement:
- give each vertex an integer id
0..|V|-1; hash maps for string keys belong in a separate layer - store weights inside the adjacency entry (
struct { int to; int w; }) not in a parallel array - if the graph is undirected, insert each edge twice; budget
2|E|memory accordingly - for directed graphs, often useful to store the reverse adjacency as well (needed for Kosaraju, transposed BFS)
Transfer / Where This Shows Up Later
Representation choice drives performance in every downstream system that stores a graph.
- S3 (systems). CPU cache behavior makes CSR dominant in production BFS/DFS. The linker's module-dependency graph is held as a CSR.
- S4 (compilers). Control flow graphs use adjacency lists (small, sparse, few predecessors). Interference graphs for register allocation use adjacency matrices when
|V|is small (a single function), lists otherwise. - S5 (OS). Scheduler's wait-for graph is dense-ish in small processes; bitmasks (a compressed matrix) are the idiomatic representation.
- S6 (databases). Query planners keep plan DAGs as adjacency lists; join dependency graphs are matrix-like when the planner uses dynamic programming bitmasks.
- S7 (DDD). Context maps and service dependency graphs drive fitness functions whose implementations scan adjacency lists.
- S8 (distributed). Service meshes and replication topologies expose neighbor queries; NetworkX / Boost.Graph internals use CSR.
Revisit S1 M2 if the |V| + |E| vs |V|^2 comparison still feels abstract - the space-complexity reasoning is the same one you met for set representations.
Check Yourself
- Why is BFS on an adjacency matrix
O(|V|^2)instead ofO(|V| + |E|)? - When would an adjacency set (hash set per vertex) beat both a list and a matrix?
- For a weighted graph, how do each of the two representations store weights?
- Why does
|E| = O(|V|^2)for simple graphs, and why does that matter for asymptotic comparisons? - What does CSR buy you over a list-of-lists that a plain
vector<vector<int>>does not?
Mini Drill or Application
For each graph, recommend a representation and justify with a density estimate:
- US road network:
|V| ~= 2 x 10^7, average degree~ 3. - Flight network between major airports:
|V| ~= 10^3, average degree~ 50. - Random Erdos-Renyi
G(100, 0.5): expected|E| ~= 2500. - Webcrawl of top-level domain:
|V| ~= 10^9,|E| ~= 10^{11}. - Petersen graph:
|V| = 10,|E| = 15.
Then implement a minimal Graph class in your language of choice that supports add_edge, neighbors(u), and has_edge(u, v), once with a list and once with a matrix, and benchmark neighbor iteration on a random sparse graph of |V| = 10^5, average degree 10. Measure cache misses if your profiler supports it.
Read This Only If Stuck
- CLRS 20.1: Representations of Graphs
- CLRS 20.1: Representations of Graphs (Part 2)
- CLRS VI: Graph Algorithms (introduction)
- ADM 5.1: Flavors of Graphs
- Sedgewick: Elementary Graph Algorithms (Part 1)
- Competitive Programming 4.1: Overview and motivation
- cp-algorithms: Breadth-First Search (representation note)
- Sedgewick & Wayne, algs4 booksite: Graphs