Writing a Custom Allocator: Free Lists and Arenas

What This Concept Is

malloc and free are C library functions, not syscalls. They manage memory that was obtained from the kernel in large chunks (via sbrk or mmap) and subdivide it among many small allocations. If you understand how to write one, you understand what malloc is hiding.

Three allocator shapes cover most real designs:

Bump allocator (arena). Hold a big buffer and a cursor; alloc(n) returns the cursor and advances it by n. free does nothing; you release the entire arena at once. Ideal for per-request or per-frame allocations.
Free-list allocator. Each freed block is linked into a list (often sorted or bucketed by size). alloc(n) scans the list for a fit (first-fit, best-fit) and splits the block; free returns the block and coalesces adjacent free blocks. K&R Chapter 8.7 presents a clean version.
Slab / size-class allocator. One free list per fixed size (8, 16, 32, ...). alloc and free are O(1) within a size class. Modern malloc implementations (jemalloc, tcmalloc, glibc ptmalloc) are size-class allocators with per-thread caches.

Why It Matters Here

You will rarely replace the system malloc, but you will often build arenas: for a parser's AST, a request handler, a game frame, a compiler pass. Arenas turn thousands of malloc/free pairs into one mmap and one munmap, cutting allocation cost to near-zero and eliminating whole categories of leaks.

Writing even a toy allocator also demystifies a pile of "mysterious" crashes: double-frees, use-after-free, heap-overflow-overwrites-free-list -- all become obvious once you can picture the free list.

Concrete Example

A minimal bump arena over an mmap'd region:

#include <sys/mman.h>
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    char   *base;
    size_t  cap;
    size_t  used;
} Arena;

static Arena arena_create(size_t cap) {
    void *p = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (Arena){ .base = p, .cap = cap, .used = 0 };
}

static void *arena_alloc(Arena *a, size_t n, size_t align) {
    uintptr_t cur  = (uintptr_t)a->base + a->used;
    uintptr_t aln  = (cur + (align - 1)) & ~(align - 1);
    size_t    need = (size_t)(aln - (uintptr_t)a->base) + n;
    if (need > a->cap) return NULL;
    a->used = need;
    return (void*)aln;
}

static void arena_reset(Arena *a)   { a->used = 0; }
static void arena_destroy(Arena *a) { munmap(a->base, a->cap); a->base = NULL; }

int main(void) {
    Arena a = arena_create(1 << 20);
    int *xs = arena_alloc(&a, 1000 * sizeof(int), _Alignof(int));
    for (int i = 0; i < 1000; i++) xs[i] = i;
    printf("sum=%d used=%zu\n", xs[999], a.used);
    arena_destroy(&a);
}

No free per object. The whole mental model is: allocate freely, release all at once.

For a free-list allocator, the canonical worked example is K&R 8.7. Each free block has a header with a size and a next pointer; alloc scans the circular list for a block large enough and splits it; free inserts the block in sorted order and coalesces with neighbors.

Common Confusion / Misconception

"malloc asks the kernel for memory on every call." No. A good malloc asks the kernel once for a large chunk, then serves many small requests from it. You can observe this: strace a program doing a million tiny allocations, and you will see only a handful of brk or mmap calls.

"I can free memory I got from my arena." No. If your API has no per-object free, respect that. Mixing allocators (arena + libc free) is a guaranteed crash -- the libc free expects a libc-allocated block, and your arena blocks do not have the right headers.

"Fragmentation is only a problem on 32-bit systems." Fragmentation still matters: a 64-bit server that runs for days can degrade in throughput because allocation metadata no longer fits in cache. Size-class allocators exist specifically to bound fragmentation.

How To Use It

Pick the allocator shape by the lifetime of the objects, not the count:

One "generation" of objects that all die together -> arena.
Many short-lived objects with varying sizes -> libc malloc is fine.
One dominant size used heavily -> slab/pool allocator of that size.
Long-running process with mixed sizes and high throughput -> keep libc; reach for jemalloc/tcmalloc before rolling your own.

Check Yourself

Why is an arena often faster than many malloc/free pairs?
What is the role of coalescing in a free-list allocator?
Why is calling libc free on a pointer from your arena a bug?

Mini Drill or Application

Do all four:

Run the bump arena above and add a stress test: 10 million tiny allocations. Time it against the equivalent loop using malloc/free.
Read K&R 8.7 and transcribe the alloc/free implementation into your own file. Run its example and explain each field of Header.
Add a "checkpoint/restore" API to your arena: size_t mark = a.used; ...; a.used = mark;. Show how it enables per-scope cleanup.
In one sentence, explain why double-freeing in a free-list allocator can corrupt other allocations, not just the duplicated one.

Read This Only If Stuck

K&R 8.7: Example -- A Storage Allocator -- the canonical free-list allocator
K&R 5.4: Address Arithmetic
Man page: man 3 malloc
Man page: man 2 brk
Man page: man 2 mmap -- as the big-region primitive

What This Concept Is​

Why It Matters Here​

Concrete Example​

Common Confusion / Misconception​

How To Use It​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​