Chapter 14: Types Of Memory
This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.
Learning objectives
- Explain the main ideas and vocabulary in Types Of Memory.
- Work through the source examples for Types Of Memory without depending on raw chunk order.
- Use Types Of Memory as selective reference when learner modules point back to Ostep.
Prerequisites
- Earlier prerequisite concepts leading into Chapter 14: Types Of Memory.
Module targets
module-02-memory-management-virtual-memory
AI companion modes
- Explain simply
- Socratic tutor
- Quiz me
- Challenge my understanding
- Diagnose my confusion
- Generate extra practice
- Revision mode
- Connect forward / backward
Source-of-truth note
This unit is anchored to Ostep and the source chapter "Chapter 14: Types Of Memory". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.
External enrichment
No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.
Source provenance
- Primary source: Ostep - Source chapter 14: Chapter 14: Types Of Memory
- Raw source file: 059-14-1-types-of-memory.md
- Raw source file: 060-14-2-themalloc-call.md
- Raw source file: 061-14-4-common-errors.md
- Raw source file: 063-14-5-underlying-os-support.md
Merged source
14 Interlude: Memory API
14.1 Types of Memory
In this interlude, we discuss the memory allocation interfaces in UNIX systems. The interfaces provided are quite simple, and hence the chapter is short and to the point (footnote 1). The main problem we address is this:
CRUX: HOW TO ALLOCATE AND MANAGE MEMORY
In UNIX/C programs, understanding how to allocate and manage memory is critical in building robust and reliable software. What interfaces are commonly used? What mistakes should be avoided?
In running a C program, there are two types of memory that are allocated. The first is called stack memory, and allocations and deallocations of it are managed implicitly by the compiler for you, the programmer; for this reason it is sometimes called automatic memory.
Declaring memory on the stack in C is easy. For example, let's say you need some space in a function func() for an integer, called x. To declare such a piece of memory, you just do something like this:
void func() {
int x; // declares an integer on the stack
...
}
The compiler does the rest, making sure to make space on the stack when you call into func(). When you return from the function, the compiler deallocates the memory for you; thus, if you want some information to live beyond the call invocation, you had better not leave that information on the stack.
Footnote 1: Indeed, we hope all chapters are! But this one is shorter and pointier, we think.
It is this need for long-lived memory that gets us to the second type of memory, called heap memory, where all allocations and deallocations are explicitly handled by you, the programmer. A heavy responsibility, no doubt! And certainly the cause of many bugs. But if you are careful and pay attention, you will use such interfaces correctly and without too much trouble. Here is an example of how one might allocate a pointer to an integer on the heap:
void func() {
int *x = (int *) malloc(sizeof(int));
...
}
A couple of notes about this small code snippet. First, you might notice that both stack and heap allocation occur on this line: first the compiler knows to make room for a pointer to an integer when it sees your declaration of said pointer (int *x); subsequently, when the program calls malloc(), it requests space for an integer on the heap; the routine returns the address of such an integer (upon success, or NULL on failure), which is then stored on the stack for use by the program.
Because of its explicit nature, and because of its more varied usage, heap memory presents more challenges to both users and systems. Thus, it is the focus of the remainder of our discussion.
14.2 The malloc() Call
The malloc() call is quite simple: you pass it a size asking for some room on the heap, and it either succeeds and gives you back a pointer to the newly-allocated space, or fails and returns NULL (footnote 2).
The manual page shows what you need to do to use malloc; type man malloc at the command line and you will see:
#include <stdlib.h>
...
void *malloc(size_t size);
From this information, you can see that all you need to do is include the header file stdlib.h to use malloc. Linking does not depend on the header, because the C library, which all C programs link with by default, has the code for malloc() inside of it; the header supplies the declaration that lets the compiler check whether you are calling malloc() correctly (e.g., passing the right number of arguments to it, of the right type). Since C99, calling a function without a declaration is no longer valid C, and current compilers warn about or reject it, so in practice you should always include the header.
The single parameter malloc() takes is of type size_t, which simply describes how many bytes you need. However, most programmers do not type in a number here directly (such as 10); indeed, it would be considered poor form to do so. Instead, various routines and macros are utilized. For example, to allocate space for a double-precision floating point value, you simply do this:
double *d = (double *) malloc(sizeof(double));
Footnote 2: Note that NULL in C isn't really anything special at all, just a macro for the value zero.
TIP: WHEN IN DOUBT, TRY IT OUT
If you aren't sure how some routine or operator you are using behaves, there is no substitute for simply trying it out and making sure it behaves as you expect. While reading the manual pages or other documentation is useful, how it works in practice is what matters. Write some code and test it! That is no doubt the best way to make sure your code behaves as you desire. Indeed, that is what we did to double-check the things we were saying about sizeof() were actually true!
Wow, that's a lot of double-ing! This invocation of malloc() uses the sizeof() operator to request the right amount of space; in C, this is generally thought of as a compile-time operator, meaning that the actual size is known at compile time and thus a number (in this case, 8, for a double) is substituted as the argument to malloc(). For this reason, sizeof() is correctly thought of as an operator and not a function call (a function call would take place at run time).
You can also pass in the name of a variable (and not just a type) to sizeof(), but in some cases you may not get the desired results, so be careful. For example, let's look at the following code snippet:
int *x = malloc(10 * sizeof(int));
printf("%zu\n", sizeof(x));
In the first line, we've declared space for an array of 10 integers, which is fine and dandy. However, when we use sizeof() in the next line, it returns a small value, such as 4 (on 32-bit machines) or 8 (on 64-bit machines). The reason is that in this case, sizeof() thinks we are simply asking how big a pointer to an integer is, not how much memory we have dynamically allocated. (The result of sizeof() has type size_t, so it is printed with %zu rather than %d.) However, sometimes sizeof() does work as you might expect:
int x[10];
printf("%zu\n", sizeof(x));
In this case, there is enough static information for the compiler to know that 40 bytes have been allocated.
Another place to be careful is with strings. When declaring space for a string, use the following idiom: malloc(strlen(s) + 1), which gets the length of the string using the function strlen(), and adds 1 to it in order to make room for the end-of-string character. Using sizeof() may lead to trouble here.
You might also notice that malloc() returns a pointer to type void. Doing so is just the way in C to pass back an address and let the programmer decide what to do with it. The programmer further helps out by using what is called a cast; in our example above, the programmer casts the return type of malloc() to a pointer to a double. Casting doesn't really accomplish anything, other than tell the compiler and other programmers who might be reading your code: "yeah, I know what I'm doing." By casting the result of malloc(), the programmer is just giving some reassurance; the cast is not needed for the correctness.
14.3 The free() Call
As it turns out, allocating memory is the easy part of the equation; knowing when, how, and even if to free memory is the hard part. To free heap memory that is no longer in use, programmers simply call free():
int *x = malloc(10 * sizeof(int));
...
free(x);
The routine takes one argument, a pointer that was returned by malloc(). Thus, you might notice, the size of the allocated region is not passed in by the user, and must be tracked by the memory-allocation library itself.
14.4 Common Errors
There are a number of common errors that arise in the use of malloc() and free(). Here are some we've seen over and over again in teaching the undergraduate operating systems course. All of these examples compile and run with nary a peep from the compiler; while compiling a C program is necessary to build a correct C program, it is far from sufficient, as you will learn (often the hard way).
Correct memory management has been such a problem, in fact, that many newer languages have support for automatic memory management. In such languages, while you call something akin to malloc() to allocate memory (usually new or something similar to allocate a new object), you never have to call something to free space; rather, a garbage collector runs and figures out what memory you no longer have references to and frees it for you.
Forgetting To Allocate Memory
Many routines expect memory to be allocated before you call them. For example, the routine strcpy(dst, src) copies a string from a source pointer to a destination pointer. However, if you are not careful, you might do this:
char *src = "hello";
char *dst; // oops! unallocated
strcpy(dst, src); // segfault and die
When you run this code, it will likely lead to a segmentation fault (footnote 3), which is a fancy term for YOU DID SOMETHING WRONG WITH MEMORY YOU FOOLISH PROGRAMMER AND I AM ANGRY.
Footnote 3: Although it sounds arcane, you will soon learn why such an illegal memory access is called a segmentation fault; if that isn't incentive to read on, what is?
TIP: IT COMPILED OR IT RAN ≠ IT IS CORRECT
Just because a program compiled(!) or even ran once or many times correctly does not mean the program is correct. Many events may have conspired to get you to a point where you believe it works, but then something changes and it stops. A common student reaction is to say (or yell) "But it worked before!" and then blame the compiler, operating system, hardware, or even (dare we say it) the professor. But the problem is usually right where you think it would be, in your code. Get to work and debug it before you blame those other components.
In this case, the proper code might instead look like this:
char *src = "hello";
char *dst = (char *) malloc(strlen(src) + 1);
strcpy(dst, src); // works properly
Alternately, you could use strdup() and make your life even easier. Read the strdup man page for more information.
Not Allocating Enough Memory
A related error is not allocating enough memory, sometimes called a buffer overflow. In the example above, a common error is to make almost enough room for the destination buffer.
char *src = "hello";
char *dst = (char *) malloc(strlen(src)); // too small!
strcpy(dst, src); // writes the terminator one byte past the end of dst
Oddly enough, depending on how malloc is implemented and many other details, this program will often run seemingly correctly. In some cases, when the string copy executes, it writes one byte too far past the end of the allocated space, but in some cases this is harmless, perhaps overwriting a variable that isn't used anymore. In some cases, these overflows can be incredibly harmful, and in fact are the source of many security vulnerabilities in systems [W06]. In other cases, the malloc library allocated a little extra space anyhow, and thus your program actually doesn't scribble on some other variable's value and works quite fine. In even other cases, the program will indeed fault and crash. And thus we learn another valuable lesson: even though it ran correctly once, doesn't mean it's correct.
Forgetting to Initialize Allocated Memory
With this error, you call malloc() properly, but forget to fill in some values into your newly-allocated data type. Don't do this! If you do forget, your program will eventually encounter an uninitialized read, where it reads from the heap some data of unknown value. Who knows what might be in there? If you're lucky, some value such that the program still works (e.g., zero). If you're not lucky, something random and harmful.
14.5 Underlying OS Support
ASIDE: WHY NO MEMORY IS LEAKED ONCE YOUR PROCESS EXITS
When you write a short-lived program, you might allocate some space using malloc(). The program runs and is about to complete: is there need to call free() a bunch of times just before exiting? While it seems wrong not to, no memory will be "lost" in any real sense. The reason is simple: there are really two levels of memory management in the system.
The first level of memory management is performed by the OS, which hands out memory to processes when they run, and takes it back when processes exit (or otherwise die). The second level of management is within each process, for example within the heap when you call malloc() and free(). Even if you fail to call free() (and thus leak memory in the heap), the operating system will reclaim all the memory of the process (including those pages for code, stack, and, as relevant here, heap) when the program is finished running. No matter what the state of your heap in your address space, the OS takes back all of those pages when the process dies, thus ensuring that no memory is lost despite the fact that you didn't free it.
Thus, for short-lived programs, leaking memory often does not cause any operational problems (though it may be considered poor form). When you write a long-running server (such as a web server or database management system, which never exit), leaked memory is a much bigger issue, and will eventually lead to a crash when the application runs out of memory. And of course, leaking memory is an even larger issue inside one particular program: the operating system itself. Showing us once again: those who write the kernel code have the toughest job of all...
Summary
As you can see, there are lots of ways to abuse memory. Because of frequent errors with memory, a whole ecosphere of tools has developed to help find such problems in your code. Check out both purify [HJ92] and valgrind [SN05]; both are excellent at helping you locate the source of your memory-related problems. Once you become accustomed to using these powerful tools, you will wonder how you survived without them.
You might have noticed that we haven't been talking about system calls when discussing malloc() and free(). The reason for this is simple: they are not system calls, but rather library calls. Thus the malloc library manages space within your virtual address space, but itself is built on top of some system calls which call into the OS to ask for more memory or release some back to the system.
One such system call is called brk, which is used to change the location of the program's break: the location of the end of the heap. It takes one argument (the address of the new break), and thus either increases or decreases the size of the heap based on whether the new break is larger or smaller than the current break. An additional call, sbrk, is passed an increment but otherwise serves a similar purpose.
Note that you should never directly call either brk or sbrk. They are used by the memory-allocation library; if you try to use them, you will likely make something go (horribly) wrong. Stick to malloc() and free() instead.
Finally, you can also obtain memory from the operating system via the mmap() call. By passing in the correct arguments, mmap() can create an anonymous memory region within your program: a region which is not associated with any particular file but rather with swap space, something we'll discuss in detail later on in virtual memory. This memory can then also be treated like a heap and managed as such. Read the manual page of mmap() for more details.
14.6 Other Calls
There are a few other calls that the memory-allocation library supports. For example, calloc() allocates memory and also zeroes it before returning; this prevents some errors where you assume that memory is zeroed and forget to initialize it yourself (see the paragraph on "uninitialized reads" above). The routine realloc() can also be useful, when you've allocated space for something (say, an array), and then need to add something to it: realloc() makes a new larger region of memory, copies the old region into it, and returns the pointer to the new region.
14.7 Summary
We have introduced some of the APIs dealing with memory allocation. As always, we have just covered the basics; more details are available elsewhere. Read the C book [KR88] and Stevens [SR05] (Chapter 7) for more information. For a cool modern paper on how to detect and correct many of these problems automatically, see Novark et al. [N+07]; this paper also contains a nice summary of common problems and some neat ideas on how to find and fix them.
[HJ92] "Purify: Fast Detection of Memory Leaks and Access Errors"
R. Hastings and B. Joyce
USENIX Winter '92
The paper behind the cool Purify tool, now a commercial product.
[KR88] "The C Programming Language"
Brian Kernighan and Dennis Ritchie
Prentice-Hall 1988
The C book, by the developers of C. Read it once, do some programming, then read it again, and then keep it near your desk or wherever you program.
[N+07] "Exterminator: Automatically Correcting Memory Errors with High Probability"
Gene Novark, Emery D. Berger, and Benjamin G. Zorn
PLDI 2007
A cool paper on finding and correcting memory errors automatically, and a great overview of many common errors in C and C++ programs.
[SN05] "Using Valgrind to Detect Undefined Value Errors with Bit-precision"
J. Seward and N. Nethercote
USENIX '05
How to use valgrind to find certain types of errors.
[SR05] "Advanced Programming in the UNIX Environment"
W. Richard Stevens and Stephen A. Rago
Addison-Wesley, 2005
We've said it before, we'll say it again: read this book many times and use it as a reference whenever you are in doubt. The authors are always surprised at how each time they read something in this book, they learn something new, even after many years of C programming.
[W06] "Survey on Buffer Overflow Attacks and Countermeasures"
Tim Werthman
Available: www.nds.rub.de/lehre/seminar/SS06/WerthmannBufferOverflow.pdf
A nice survey of buffer overflows and some of the security problems they cause. Refers to many of the famous exploits.