The ALU and How Numbers Are Added and Multiplied
What This Concept Is
The arithmetic-logic unit (ALU) is the part of the CPU that computes on fixed-width integers. For every cycle it is driven by control signals it produces one of a small set of results:
- arithmetic: add, subtract, negate, compare, increment, decrement
- logical: AND, OR, XOR, NOT
- shifts: left, logical right, arithmetic right
- flags: zero (
Z), sign (S/N), carry (C), overflow (V/O)
Integers are stored in two's complement: the top bit is a sign, -x is ~x + 1, and the same binary adder handles signed and unsigned addition. Addition ripples carries through a chain of full-adders; multiplication is traditionally a sequence of shifts and adds, though modern ALUs use a multi-cycle hardware multiplier (often Booth or Wallace-tree) that produces a 64-bit product from two 32-bit operands in a handful of cycles.
Why It Matters Here
Nearly every instruction ultimately lands on the ALU. Understanding it lets you:
- predict the cost of
addvsmulvsdiv(one cycle, a few cycles, tens of cycles) - read flag-dependent branches (
jz,jo,js,jc) without guessing - notice when a compiler converts
x * 17into(x << 4) + xbecause shifts and adds are cheaper than multiplies - reason about integer overflow, signed vs unsigned comparison, and the subtle bugs that come from mixing them
Concrete Example
Two's complement addition on an 8-bit ALU:
01001101 (77)
+ 00110110 (54)
-----------
10000011 (-125 as signed, 131 as unsigned)
carry out = 0, overflow = 1 -> signed overflow occurred
The bits are the same; the interpretation changes. The ALU sets V=1 because two positives summed to a negative -- the signed result is wrong. For unsigned math the answer is fine and C=0.
Compare x == y via subtraction:
sub %rbx, %rax # rax -= rbx; flags set from the result
je equal # jump if Z==1
There is no separate equality circuit; subtraction is the comparison, and the Z flag carries the answer.
Multiplication via shift-and-add (how the hardware does it for a 4x4):
a = 1011 (11)
b = 0110 (6)
b bit 0 = 0 -> 0000
b bit 1 = 1 -> 10110 (a << 1)
b bit 2 = 1 -> 101100 (a << 2)
b bit 3 = 0 -> 0
sum = 1000010 = 66 ✔
Modern CPUs do all four partial products in parallel, then collapse them with a carry-save adder tree, so an integer multiply typically costs 3-5 cycles, not one.
Common Confusion / Misconception
"== is free but * is expensive." Equality still goes through the ALU (it is a subtract and a flag read), and on most modern cores it costs exactly the same as any other ALU op. Multiplication has higher latency than add but is not serial; the CPU can often issue an independent multiply every cycle. Integer division is the one that really hurts -- often 20-40 cycles and not pipelined.
How To Use It
- When tuning inner loops, prefer add/sub/shift/AND/OR/XOR. Replace multiplies by constants with shifts and adds when the compiler will not.
- Respect signed overflow in C:
intoverflow is undefined behaviour. Useunsignedwhen you truly want modular arithmetic. - For comparisons, remember that
<onintuses the signed flags (S,V) while<onunsigneduses the carry flag (C). That is why mixing signed and unsigned ina < bcan return the wrong answer. - For cost modelling, remember: add ~1 cycle, multiply ~3-5 cycles, divide ~20+ cycles, shifts by a constant ~1 cycle.
Check Yourself
- Why does the same adder work for both signed and unsigned addition?
- What do the
Z,N,C, andVflags represent, and which instruction sets them? - How does a
cmp a, bfollowed byjl targetactually decide whether to jump? - Why does integer division cost so much more than multiplication on a modern CPU?
Mini Drill or Application
For each expression, list the ALU operations and flag reads you would expect in compiled x86_64 code:
x == yx < ywhere both areintx < ywhere both areunsigneda * 9a / 7whereais a non-constantunsigned long
Check in Compiler Explorer at -O2. Note the strength-reduction tricks the compiler uses for constant multiplies and divides (magic numbers and multi-shifts).
Read This Only If Stuck
- Computer Organization and Design: 3.1 Introduction
- Computer Organization and Design: 3.2 Addition and Subtraction
- Computer Organization and Design: 3.3 Multiplication
- Computer Organization and Design: 3.4 Division
- Computer Organization and Design: 2.4 Signed and Unsigned Numbers
- Computer Organization and Design: 2.6 Logical Operations
- CODE: The Arithmetic Logic Unit