Structs, Unions, and typedef
What This Concept Is
C gives you three tools for building compound types:
- struct: an aggregate with named members, each occupying its own storage. The total size is at least the sum of the member sizes plus any padding the compiler inserts for alignment.
- union: several members sharing the same storage. At any moment only one member is "active." Reading a member other than the one last written reinterprets the stored bytes, so the result depends on the implementation.
- typedef: an alias for an existing type name. It does not create a new type; it gives the existing one a new spelling.
Access members with . on a struct/union value and -> on a pointer to one: p->x is (*p).x.
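As a small illustration of the two access forms, the sketch below uses two hypothetical helpers, point_translate and point_sum (neither name comes from the text above). One takes a pointer and the other takes a copy, so the operator changes accordingly.

```c
struct point { int x; int y; };

/* Receives a pointer, so members are reached with ->. */
void point_translate(struct point *p, int dx, int dy) {
    p->x += dx;       /* shorthand for (*p).x += dx */
    (*p).y += dy;     /* the long form does the same thing */
}

/* Receives a copy of the struct, so members are reached with . */
int point_sum(struct point p) {
    return p.x + p.y;
}
```

Because point_sum works on a copy, nothing it does to p is visible to the caller; point_translate modifies the caller's object through the pointer.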
Why It Matters Here
Every non-trivial C program introduces its own types. The question is not "should I use a struct?" but:
- what invariants does this type carry?
- what is its memory layout?
- does typedef make it easier or harder to read?
- would a union plus a tag ("tagged union") express intent more clearly than parallel fields?
Struct design habits set up the memory work in Module 2 and the architecture topics in Module 3 (alignment, cache lines).
Concrete Example
A point and a tagged value:
#include <stdio.h>

struct point { int x; int y; };   /* tag-name form */

typedef struct {                  /* typedef of anonymous struct */
    double lat;
    double lon;
} Location;

typedef struct {
    enum { TAG_INT, TAG_STR } tag;   /* says which union member is active */
    union {
        int i;
        const char *s;
    } u;
} Value;

void print_value(const Value *v) {
    switch (v->tag) {
    case TAG_INT: printf("int %d\n", v->u.i); break;
    case TAG_STR: printf("str %s\n", v->u.s); break;
    }
}

int main(void) {
    struct point p = {1, 2};
    Location here = {37.77, -122.42};
    Value a = { .tag = TAG_INT, .u.i = 42 };
    Value b = { .tag = TAG_STR, .u.s = "hi" };
    printf("%d %d / %f %f\n", p.x, p.y, here.lat, here.lon);
    print_value(&a);
    print_value(&b);
    return 0;
}
Designated initializers (.tag = TAG_INT) are C99.
Common Confusion / Misconception
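"struct point p and point p are the same." Only if a typedef named point exists for that struct, for example typedef struct point point;. In C (unlike C++), a tag alone is not a type name, so without the typedef you must keep the struct keyword: struct point p;. The reverse also holds: typedef struct { ... } point; declares no tag, so struct point does not name that type. Some codebases (Linux kernel) intentionally keep struct everywhere to make the aggregate visible at every use.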
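A minimal sketch of how tags and typedef names interact. The helper same_type is hypothetical and exists only to show that both spellings name one type.

```c
struct point { int x; int y; };   /* declares the tag 'point' only */
typedef struct point point;       /* now 'point' alone is also a type name */

/* No tag here: 'Location' works, but 'struct Location' would not
   refer to this type. */
typedef struct { double lat; double lon; } Location;

/* Hypothetical helper: both spellings name the same type,
   so assigning one to the other compiles without a cast. */
int same_type(void) {
    struct point a = {1, 2};
    point b = a;
    return b.x + b.y;
}
```

Deleting the typedef line makes the declaration of b a compile error in C, which is the point of the misconception.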
"sizeof(struct S) is the sum of its members." Usually bigger, because of padding for alignment. A struct { char c; int i; } on most platforms is 8 bytes, not 5, because int wants 4-byte alignment. Put large members first or pack deliberately if it matters.
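One way to see the padding is to print sizes and member offsets with sizeof and offsetof. The sketch below uses two hypothetical structs, loose and tight, that hold the same members in different orders. The exact numbers depend on the platform, so none are promised here.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

struct loose { char a; int i; char b; };   /* int sits between two chars */
struct tight { int i; char a; char b; };   /* largest member first */

/* Prints the size of each struct and the offset of each member. */
void print_layout(void) {
    printf("loose: size=%zu a@%zu i@%zu b@%zu\n",
           sizeof(struct loose),
           offsetof(struct loose, a),
           offsetof(struct loose, i),
           offsetof(struct loose, b));
    printf("tight: size=%zu i@%zu a@%zu b@%zu\n",
           sizeof(struct tight),
           offsetof(struct tight, i),
           offsetof(struct tight, a),
           offsetof(struct tight, b));
}
```

On a typical platform with a 4-byte, 4-byte-aligned int, loose is usually reported as 12 bytes and tight as 8, but treat those figures as an expectation to check rather than a guarantee.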
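"union lets me write one member and read another as a different type." That is called type punning. In C99 and later, reading a member other than the one last written reinterprets the stored bytes as the new type; the value you get depends on the implementation's object representations and may be a trap representation. C++ treats the same read as undefined behavior. The technique is common in low-level work, but it is still risky; check your compiler's rules.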
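For comparison, here is a sketch of the union-based pun next to a memcpy-based alternative. The memcpy version is not mentioned in the text above; it is shown because copying the bytes is well-defined in both C and C++. Both function names are hypothetical, and the code assumes float and uint32_t have the same size, which the static assertion (C11) checks.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Union pun: write one member, read the other.
   Accepted by C's reinterpretation rule; not valid C++. */
uint32_t bits_via_union(float f) {
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* memcpy pun: copy the object representation byte for byte. */
uint32_t bits_via_memcpy(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both functions should return the same bit pattern for the same input. The memcpy form is usually the safer default when the code may also be compiled as C++.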
How To Use It
- Give each struct a clear invariant (".size is always the number of live elements in .data"). Maintain the invariant in every function that takes the struct.
- Use typedef to hide opaque types (only a pointer escapes the header) or where a specific alias makes the domain clearer. Do not typedef types whose internals you want callers to see.
- Default to pass-by-pointer for non-trivial structs (>16 bytes roughly) to avoid copies.
- Use union only with an explicit tag, or for well-documented hardware-register overlays.
- Do not reach for #pragma pack casually; reordering members to pack naturally is portable and safer.
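The first three habits can be sketched together with a hypothetical fixed-capacity Stack type (the type and all stack_* names are invented for illustration). In a real project the first part would live in a header and the second in a .c file; they are shown in one listing here so the sketch compiles on its own.

```c
#include <stdlib.h>

/* --- what a header would expose: an opaque type and its functions --- */
typedef struct stack Stack;          /* incomplete type: callers hold only Stack* */
Stack *stack_new(size_t cap);        /* returns NULL on allocation failure */
int    stack_push(Stack *s, int v);  /* returns 0 on success, -1 if full */
size_t stack_size(const Stack *s);
void   stack_free(Stack *s);

/* --- what the .c file would keep private --- */
struct stack {
    int   *data;
    size_t size;   /* invariant: size <= cap, and data[0..size) are live */
    size_t cap;
};

Stack *stack_new(size_t cap) {
    Stack *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->data = malloc(cap * sizeof *s->data);
    if (s->data == NULL && cap != 0) { free(s); return NULL; }
    s->size = 0;
    s->cap = cap;
    return s;
}

int stack_push(Stack *s, int v) {
    if (s->size == s->cap) return -1;   /* refuse rather than break the invariant */
    s->data[s->size++] = v;
    return 0;
}

size_t stack_size(const Stack *s) { return s->size; }

void stack_free(Stack *s) {
    if (s != NULL) { free(s->data); free(s); }
}
```

Callers never see the members, so the only code that can break the invariant is the handful of functions defined next to the struct. Every function takes a pointer, so the struct is never copied.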
Check Yourself
- Why might sizeof(struct { char c; int i; }) be 8 rather than 5?
- What is the difference between struct node and typedef struct node Node;?
- When is a union the right design and when is it a smell?
Mini Drill or Application
- Define struct buffer { char *data; size_t len; size_t cap; };. Write buffer_init, buffer_append_char, buffer_free.
- Rewrite the same type as typedef struct buffer Buffer; and decide which reads better for your callers.
- Design a tagged union Json with NULL, BOOL, NUMBER, STRING variants. Write a print_json(const Json *j).
- Print sizeof for several structs with different member orderings and explain the sizes using alignment.