Loops and Summations Turn Code Into a Sum
What This Concept Is
Iterative complexity analysis is a translation exercise. You turn the code into a summation whose index is the loop variable and whose summand is the cost of one iteration, then you evaluate the sum and take the $\Theta$-class.
Three summations carry most of it:
- $\sum_{i=1}^{n} 1 = n = \Theta(n)$
- $\sum_{i=1}^{n} i = \frac{n(n+1)}{2} = \Theta(n^2)$
- $\sum_{i=0}^{k-1} 2^i = 2^k - 1 = \Theta(2^k)$ (geometric, doubling)
- $\sum_{i=1}^{n} \frac{1}{i} = H_n = \Theta(\log n)$ (harmonic - shows up in quicksort, hashing probes, disjoint-set analysis)
Nested loops multiply: the outer loop contributes the outer sum, the inner loop contributes the inner sum, and dependent bounds (inner loop's range depends on the outer index) mean the inner sum ranges over the current outer index and you get a triangular, not rectangular, count.
The total cost of any straight-line loop nest is:
$$T(n) = \sum_{\text{outer}} \left( \text{work before inner} + \sum_{\text{inner}} (\text{body cost}) + \text{work after inner} \right).$$
If the body's cost itself depends on $n$ (because it calls a function, allocates, or touches a growing data structure), substitute that expression inside the sum, never outside.
Why It Matters Here
Most "obvious" complexity mistakes happen here. Analysts either:
- assume all loops are independent when they are not (squaring instead of triangulating),
- forget that work inside the loop may itself be $\Theta(n)$ rather than $\Theta(1)$ (hidden string concatenation,
list.copy(),x in list), - or quote $n^2$ for a doubly-nested loop without checking the inner bound (an inner loop that runs $\log i$ times gives $\Theta(n \log n)$, not $\Theta(n^2)$).
Getting to $\Theta$ instead of a conservative $O$ requires seeing the sum exactly. Nearly every tight bound in CLRS chapters 2-4 is the result of evaluating an explicit summation.
Concrete Example
Example 1 (triangular). Consider:
total = 0
for i in range(n):
for j in range(i):
total += 1
The inner loop runs $i$ times when the outer index is $i$. Total work:
$$\sum_{i=0}^{n-1} i = \frac{(n-1)n}{2} = \Theta(n^2).$$
Not exactly $n^2$. Not $n \cdot n$. The exact bound is $n(n-1)/2$; the asymptotic shape is $\Theta(n^2)$. This is the canonical "pair of indices" cost for comparison-based algorithms and is why selection sort, insertion sort worst case, and naive closest-pair are all $\Theta(n^2)$.
Example 2 (logarithmic).
i = 1
while i < n:
do_constant_work()
i *= 2
The loop index doubles each iteration, so the number of iterations $k$ satisfies $2^k \ge n$, i.e. $k = \lceil \log_2 n \rceil = \Theta(\log n)$. Multiplicative growth in the index produces logarithmic loop counts. The same template gives the analysis of binary search, tournament trees, and skip-list levels.
Example 3 (harmonic). Counting the work of a loop that does $n/i$ work at step $i$:
$$\sum_{i=1}^{n} \frac{n}{i} = n \sum_{i=1}^{n} \frac{1}{i} = n \cdot H_n = \Theta(n \log n).$$
This is the cost of the sieve-of-Eratosthenes inner loop, of union-find without compression, and of the classic "for each $i$, iterate over multiples of $i$" pattern.
Common Confusion / Misconceptions
- "Two nested loops equal $n^2$." Not always. If the inner loop runs a fixed number of times independent of $n$, the whole thing is $\Theta(n)$. If the inner loop depends on the outer index, the correct bound comes from a triangular sum, not a square.
- "Constant work per iteration." A
s = s + "x"inside a loop on a Python string is $\Theta(\text{len}(s))$ per iteration because strings are immutable. An $n$-step loop silently becomes $\Theta(n^2)$. Same trap:x in some_listinside a loop,list.remove(x),arr.insert(0, ...). - "Breaking early changes nothing." A loop with an early
breakhas best-case cost $O(1)$ and worst-case still matches the full sum. State which case you are analyzing. - "The summation is a proof." A summation is a restatement. You still need to justify the summand (cost per iteration) and the bounds (iteration count). Both are where bugs hide.
- "Pull $n$ out of the sum." Only when it is truly a constant with respect to the summation index. $\sum_{i=1}^{n} n/i = n H_n$ is fine because $n$ is constant over $i$. $\sum_{i=1}^{n} (n-i)$ is not $n \sum \cdots$ because the $i$ inside ties the two together; expand and use linearity instead.
- "Loop plus recursion is just a loop." If the body recurses, the per-iteration cost is itself the solution of a recurrence (concept 3). Substitute that solution into the summation; do not hand-wave.
How To Use It
When analyzing a loop:
- Identify the loop index and its range.
- Write $\sum_{\text{index} = \text{lower}}^{\text{upper}} \text{(cost per iteration)}$ as a single expression.
- If the cost per iteration is $\Theta(1)$, the sum is just the number of iterations.
- If the cost depends on $n$ or on the index, substitute it and evaluate. Pull constants out; do not drop factors that depend on $n$.
- Use the closed forms $\sum i = \Theta(n^2)$, $\sum i^2 = \Theta(n^3)$, $\sum 1/i = \Theta(\log n)$, $\sum r^i$ (geometric) = $\Theta(r^n)$ for $r > 1$.
- For multiplicative loop indices ($i \mathrel{*}= 2$ or $i /!/= 3$), the iteration count is $\Theta(\log_b n)$.
- Compose nested loops by substituting the inner sum into the body of the outer. Evaluate innermost first.
Always distinguish the count of iterations from the work per iteration. Multiply, do not guess. When the analysis disagrees with a runtime experiment by more than a constant factor, one of the substitutions is wrong.
Transfer / Where This Shows Up Later
- S1 M2 (counting) gave you the combinatorial identities ($\sum i$, $\binom{n}{2}$) that are now loop cost tallies.
- S1 M5 (problem solving) taught "size the candidate space"; loop analysis is the formal version.
- S4 (systems) analyzes cache line transfers by summing over inner loops with block size $B$; tiled matrix multiplication is a nested-sum argument with a new inner dimension.
- S5 (OS) uses the same harmonic and geometric sums for amortized scheduling cost and for I/O backoff.
- S6 (databases) sums row scans (full outer sum) and B-tree probes (geometric sum of $\log_B n$ levels) to estimate query cost.
- S7 (architecture) expresses fitness functions such as "request fan-out stays $O(1)$ per RPC" - a statement about inner loop cost in an RPC handler.
- S8 (scaling) costs background jobs like compaction, rebalancing, and index rebuilds by summing the per-unit work over the input.
Concept 3 (recurrences) is the recursive counterpart of this page. Concept 4 (amortized analysis) is how you salvage a bound when a single iteration can be expensive but the total across iterations is cheap.
Cross-module anchors: S2 M2 uses $\sum \log i \sim n \log n$ for heap build and radix sort; S2 M3 counts edge relaxations via $\sum_{v} \deg(v) = 2m$ for BFS/DFS; S2 M4 sizes DP runtime as (states) $\times$ (transitions per state) - a two-level sum; S2 M5 analyzes union-find's inverse-Ackermann factor as a telescoping harmonic-flavored sum.
Check Yourself
- Evaluate $\sum_{i=1}^{n} i$ in closed form and state its $\Theta$-class.
- Why is a
while i < n: i *= 2loop $\Theta(\log n)$ rather than $\Theta(n)$? - In the triangle pattern above, why is the total work $\Theta(n^2)$ even though no single loop runs $n^2$ times?
- In a loop
for i in range(n): L.insert(0, i)with a Pythonlist, what is the total cost and why? - Given a loop that at step $i$ does $f(i) = i^2$ work, what is the closed-form total and its $\Theta$-class?
- A function processes each of $n$ items and, for each item, walks a list that grows by one per step. What is the total work, and why is the "amortized" framing of concept 4 not needed to see it?
- You see a
whileloop whose condition depends on a variable you cannot obviously relate to $n$. What questions must you ask to bound iteration count?
Mini Drill or Application
For each snippet give the tightest $\Theta$ bound and justify with a sum.
for i in range(n): for j in range(n): work()for i in range(n): for j in range(i, n): work()i = n; while i > 1: work(); i //= 2for i in range(n): for j in range(1, n): for k in range(j): work()- A loop that appends
icopies of'x'to a Pythonstrat each step, collecting all output. - Implementation drill. Write a generator that yields pairs $(i, j)$ for the triangle pattern (inner bound = outer index) for $n$, together with an assertion that the yielded count equals $n(n-1)/2$. Complexity derivation: each yield is $\Theta(1)$, total iterations $\sum_{i=0}^{n-1} i = \Theta(n^2)$, total time $\Theta(n^2)$, total space $\Theta(1)$ with a generator.
Read This Only If Stuck
- Skiena: Reasoning about efficiency
- Skiena: Logarithms and summations
- CLRS: Analyzing algorithms (insertion sort analysis)
- CLRS: Insertion sort
- CLRS: Bounding summations (appendix A.2)
- Competitive Programming: Algorithm analysis, sizing
- MIT 6.042J Mathematics for CS - sums and asymptotics notes (harmonic and geometric sums)
- CP Algorithms: Harmonic sum bounds (illustrates $\sum n/i$ in divisor sieves)
- Concrete Mathematics (Graham, Knuth, Patashnik) - chapter on sums (rigorous reference for summation identities)
- Jeff Erickson, Algorithms - sums and recurrences (free PDF; readable companion to CLRS appendix A)