Integer Representations: Unsigned, Two's Complement, Overflow
What This Concept Is
A fixed-width integer type in C stores its value in exactly N bits, interpreted in one of two ways:
- Unsigned: the bit pattern is read as a plain base-2 number from 0 to 2^N - 1.
- Signed two's complement: the top bit has weight -2^(N-1) instead of +2^(N-1). Values range from -2^(N-1) to 2^(N-1) - 1.
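The two readings can be sketched in code. This is a minimal illustration, assuming widths from 1 to 32 bits; the names as_unsigned and as_signed are made up for this note and are not library functions. The signed value is computed from the bit weights directly rather than by casting.

```c
#include <stdint.h>

/* Read the low `bits` bits of `pattern` as a plain base-2 number.
   Assumes 1 <= bits <= 32 (an assumption of this sketch). */
uint32_t as_unsigned(uint32_t pattern, unsigned bits) {
    uint32_t mask = (bits == 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
    return pattern & mask;
}

/* Read the same bits as two's complement: the top bit weighs -2^(bits-1),
   every lower bit keeps its usual positive weight. */
int64_t as_signed(uint32_t pattern, unsigned bits) {
    uint32_t value = as_unsigned(pattern, bits);
    uint32_t top = UINT32_C(1) << (bits - 1);        /* weight of the top bit */
    int64_t low = (int64_t)(value & (top - 1));      /* sum of the lower bits */
    return (value & top) ? low - (int64_t)top : low;
}
```

For example, as_unsigned(0xFF, 8) gives 255 while as_signed(0xFF, 8) gives -1: same bits, different weight on the top bit.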
Two's complement has three features that make it the universal choice:
- addition and subtraction are identical for signed and unsigned at the bit level
- there is exactly one representation of zero
- negation is "invert all bits, then add 1"
Everything that looks like arithmetic magic (-1 == 0xFFFFFFFF, x & -x isolating the lowest set bit, the bit pattern of INT_MIN being its own negation) flows from those rules.
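The negation rule can be tried out on raw bits. The sketch below works in uint32_t so that every step is defined (unsigned arithmetic wraps); negate_bits and lowest_set_bit are hypothetical helper names, and lowest_set_bit is the unsigned form of the common x & -x idiom.

```c
#include <stdint.h>

/* Two's complement negation done by hand: invert all bits, then add 1.
   On uint32_t the addition wraps modulo 2^32, so this never overflows. */
uint32_t negate_bits(uint32_t x) {
    return ~x + UINT32_C(1);
}

/* Keep only the lowest set bit of x (0 if x is 0). x and its negation
   agree on that bit and differ on every bit above it. */
uint32_t lowest_set_bit(uint32_t x) {
    return x & negate_bits(x);
}
```

negate_bits(5) yields 0xFFFFFFFB, the pattern for -5, and negate_bits(0x80000000) yields 0x80000000 again, which is why INT_MIN has no positive counterpart. lowest_set_bit(12) yields 4.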
Why It Matters Here
Once this is wrong, nothing is safe:
- loops like for (unsigned i = n; i >= 0; i--) never terminate because i wraps
- int a = INT_MAX; int b = a + 1; is undefined behavior, not just "a big negative number"
- comparing signed and unsigned often silently converts the signed side to unsigned
- size_t subtraction can produce a huge positive number instead of a negative one
These are the bugs that survive review because "the code looks right."
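One possible way to write the loop, the comparison, and the subtraction so they behave as intended is sketched below. The function names are illustrative only, and the comparison helper assumes size_t is at least as wide as int, which holds on common platforms but is not guaranteed by the language.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum a[n-1] down to a[0] with an unsigned index. `i-- > 0` tests the old
   value before decrementing, so the loop stops after index 0; a condition
   of `i >= 0` would always be true for an unsigned i. */
uint64_t sum_reverse(const uint32_t *a, size_t n) {
    uint64_t total = 0;
    for (size_t i = n; i-- > 0; ) {
        total += a[i];
    }
    return total;
}

/* True if s < u mathematically. A negative s is below every size_t value,
   so decide that case before any conversion to unsigned happens.
   Assumption: size_t can hold every non-negative int. */
bool int_less_than_size(int s, size_t u) {
    if (s < 0) {
        return true;
    }
    return (size_t)s < u;
}

/* Distance between two sizes. Comparing first avoids computing a - b
   when b is larger, which would wrap to a huge positive value. */
size_t size_distance(size_t a, size_t b) {
    return (a >= b) ? a - b : b - a;
}
```

With these, int_less_than_size(-1, 1) is true even though the direct comparison of the same two values is false, and size_distance(3, 5) is 2 rather than a value near SIZE_MAX.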
Concrete Example
For int32_t (the same values apply to int when int is 32 bits wide):
- 0x00000001 is +1
- 0xFFFFFFFF is -1 (because the top bit weighs -2^31, and -2^31 + (2^31 - 1) = -1)
- 0x80000000 is INT_MIN = -2147483648
- 0x7FFFFFFF is INT_MAX = 2147483647
To negate 5 by the rule: 5 = 0x00000005. Invert: 0xFFFFFFFA. Add 1: 0xFFFFFFFB = -5. Check: 0xFFFFFFFB has every bit set except bit 2, so it weighs -2^31 + 2^30 + ... + 2^3 + 2^1 + 2^0 = -1 - 2^2 = -5.
Adding 1 to INT_MAX:
  0x7FFFFFFF
+ 0x00000001
= 0x80000000   which reads as INT_MIN
In C, this signed overflow is undefined behavior. The same bit-level addition on uint32_t would be defined: unsigned wraps modulo 2^32.
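Because the bit-level addition is the same for both readings, the sum can be formed in uint32_t and the sign bits inspected afterwards. The following is a sketch with hypothetical names (add_bits, signed_add_would_overflow): it reports whether the same two patterns, read as int32_t, would have overflowed, without ever performing a signed addition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add two 32-bit patterns with unsigned arithmetic, which is defined to
   wrap modulo 2^32. */
uint32_t add_bits(uint32_t a, uint32_t b) {
    return a + b;
}

/* Signed overflow happens exactly when both inputs have the same sign bit
   and the result's sign bit differs from it. (a ^ r) and (b ^ r) each have
   bit 31 set when that operand's sign differs from the result's sign. */
bool signed_add_would_overflow(uint32_t a, uint32_t b) {
    uint32_t r = add_bits(a, b);
    return (((a ^ r) & (b ^ r)) >> 31) != 0;
}
```

signed_add_would_overflow(0x7FFFFFFF, 1) is true (INT_MAX + 1), while signed_add_would_overflow(0xFFFFFFFF, 1) is false (-1 + 1 = 0 fits).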
Common Confusion / Misconception
"Overflow in C wraps around." Only for unsigned types. For signed types it is undefined behavior, and optimizers exploit that: if (x + 1 < x) can be replaced by if (0) on signed x.
"I can compare size_t with int safely." The usual arithmetic conversions convert the int operand to size_t (on the common platforms where size_t is at least as wide as int). (int)(-1) < (size_t)1 is false because -1 becomes the huge unsigned value SIZE_MAX.
How To Use It
For every integer operation you write:
- State the width and signedness.
- Ask whether the operation can overflow. If yes, pick the right type or check first.
- When mixing signed and unsigned, convert explicitly and comment why.
- Prefer int32_t, uint32_t, int64_t, size_t, ssize_t over plain int, unsigned. (ssize_t comes from POSIX, not ISO C.)
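The "check first" step for a signed addition can look like the sketch below. checked_add_i32 is a hypothetical helper, not a standard function; the point is that the range test runs before the addition, so no overflowing signed operation is ever evaluated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a + b in *out and return true if the sum fits in int32_t.
   Otherwise return false and leave *out untouched. The two subtractions
   in the test cannot overflow: INT32_MAX - b is only evaluated for b > 0,
   and INT32_MIN - b only for b < 0. */
bool checked_add_i32(int32_t a, int32_t b, int32_t *out) {
    if ((b > 0 && a > INT32_MAX - b) ||
        (b < 0 && a < INT32_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

A caller would write something like: int32_t r; if (!checked_add_i32(x, y, &r)) { /* handle the error */ }. GCC and Clang also offer __builtin_add_overflow, and C23 adds ckd_add in <stdckdint.h>; the pre-check is shown here because it needs neither.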
Check Yourself
- What is the bit pattern of -128 in int8_t? Of 127?
- Why does uint32_t x = 0; x -= 1; set x to 0xFFFFFFFF rather than crashing?
- What does -INT_MIN equal, and why is that troubling?
Mini Drill or Application
#include <stdio.h>
#include <limits.h>

int main(void) {
    unsigned char u = 200;
    u += 100;                        /* defined: 300 wraps to 44 for an 8-bit unsigned char */
    printf("u = %u\n", (unsigned)u); /* cast so the argument type matches %u */

    int s = INT_MAX;
    printf("s + 1 = %d\n", s + 1);   /* undefined behavior: do not rely on it */
    return 0;
}
Compile with gcc -Wall -Wextra -fsanitize=undefined -o ovf ovf.c and run. Note what the sanitizer reports for s + 1. The sanitizer makes this undefined behavior visible at run time; without it the program may print a plausible-looking number and hide the bug.