Iterator: Decoupling Traversal from Container
What This Concept Is
Iterator lets a client walk through the elements of a collection without knowing how the collection is organized. The container exposes some way to produce an iterator; the iterator exposes two or three operations (hasNext, next, optionally reset).
Three roles:
- Aggregate (collection) -- the data structure. Exposes
createIterator()or is itself iterable. - Iterator -- carries traversal state (current position) and produces elements one at a time.
- Client -- uses iterators as interchangeable cursors.
The point is not "add a for loop" -- every language has that. The point is that the client code is the same whether the underlying structure is an array, a linked list, a tree, or a paginated API.
Why It Matters Here
Most collections you implement or wrap have internal representation you do not want clients to see. Leaking it means clients assume indexing or ordering and break when you change the data structure.
Iterator is also the pattern behind language-level iterables, generators, streams, and LINQ. Understanding it at the object level lets you see those features as instances of the same idea.
Concrete Example
A tree that supports two traversal orders via distinct iterators.
from typing import Iterator, Callable
class TreeNode:
def __init__(self, value, children=None):
self.value = value
self.children = children or []
def depth_first(root: TreeNode) -> Iterator[int]:
stack = [root]
while stack:
node = stack.pop()
yield node.value
stack.extend(reversed(node.children))
def breadth_first(root: TreeNode) -> Iterator[int]:
from collections import deque
q = deque([root])
while q:
node = q.popleft()
yield node.value
q.extend(node.children)
class Tree:
def __init__(self, root): self.root = root
def traverse(self, order: Callable = depth_first) -> Iterator[int]:
return order(self.root)
root = TreeNode(1, [TreeNode(2, [TreeNode(4), TreeNode(5)]), TreeNode(3)])
tree = Tree(root)
print(list(tree.traverse(depth_first))) # [1, 2, 4, 5, 3]
print(list(tree.traverse(breadth_first))) # [1, 2, 3, 4, 5]
Two generators, one collection, zero client changes when adding a third traversal order. Python generators are iterators -- the pattern, with less boilerplate.
A classical class-based iterator in Java looks like this:
interface TreeIterator { boolean hasNext(); int next(); }
// depthFirst and breadthFirst are two implementations.
Same roles; more code; same shape.
Common Confusion / Misconception
- Exposing the cursor. If the iterator is really just the integer index of an underlying array, clients will reach past it. Keep iterator state inside the iterator.
- Single-use iterators. Most iterators are exhausted by one walk. Document this. If you need re-iteration, return a fresh iterator, not the consumed one.
- Concurrent modification. Walking a collection while another thread (or the same thread) mutates it usually corrupts the traversal. Either snapshot on iterator creation or fail fast (Java's
ConcurrentModificationException). - Forcing a class when a generator exists. Python/JavaScript/Kotlin generators are idiomatic. Use them. The class form is for languages without first-class generators or when the iterator needs rich lifecycle.
- Iterator as a collection.
list(iter)materializes it. For very large or infinite sources, that is the bug that kills the machine.
How To Use It
- Decide the minimum interface your clients need. Usually "advance" and "is there more".
- Put traversal state inside the iterator, not the collection.
- Keep the collection responsible for producing iterators; do not split that factory across files.
- For multiple orders, produce different iterator types (or generator functions). Do not add
ifsin a single iterator. - Document whether the iterator snapshots or reflects live changes.
- If traversal is expensive or paginated, make the iterator lazy -- fetch on demand.
Check Yourself
- What does a client gain when traversal is an iterator rather than a loop over
list.get(i)? - Why should iterator state live in the iterator, not the collection?
- When is a generator preferable to a class iterator?
- What should happen if the collection changes while an iterator is walking it?
Mini Drill or Application
Implement a paginated API iterator:
- Given a function
fetchPage(n)returning{items, hasNext}, produce an iterator that walks all items across pages. - Fetch lazily: do not fetch page
n+1until the client has consumed all of pagen. - Write a consumer that terminates after reading the first 10 items and shows that no later pages were fetched.
Stop when the client loop never mentions page numbers.