What a Process Is and How Syscalls Cross the Kernel Boundary
What This Concept Is
A process is an operating-system-managed container for a running program. Concretely, a process has:
- a private virtual address space (its own view of memory, zero-based, protected from other processes)
- a table of open file descriptors
- one or more threads of execution, each with a CPU register state and a stack
- a credential set (user id, group id) and an environment
- a process id (pid) and a parent process id (ppid)
- bookkeeping the kernel keeps about it: state, priority, pending signals, exit status
A process is not the same thing as a program. The program on disk is the executable file (/bin/ls); the process is a specific running instance of that program, created when some other process asked the kernel to start it.
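The program/process distinction can be sketched in a few lines, assuming a POSIX system. After fork, the same code is running in two processes, each with its own pid. The helper name spawn_and_compare is hypothetical and exists only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once and check that the child is a distinct
 * process whose parent is the caller. Returns 1 if so, 0 if the check
 * fails, -1 if fork or waitpid fails. */
int spawn_and_compare(void) {
    pid_t parent = getpid();
    pid_t child = fork();          /* one program, now two processes */
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: same code, different pid; its ppid is the parent's pid. */
        int ok = (getpid() != parent) && (getppid() == parent);
        _exit(ok ? 0 : 1);
    }

    /* Parent: collect the child's exit status from the kernel. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling spawn_and_compare() from main should return 1: the executable on disk never changed, but the kernel briefly tracked two separate instances of it.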
A system call is how user code asks the kernel to do something it cannot do itself -- allocate memory, touch a file, start a thread, send a packet. From C it looks like a function call (read(fd, buf, n)), but under the hood it traps into the kernel: the CPU switches to a privileged mode, the kernel validates the arguments, performs the operation, and switches back. That boundary -- user mode up here, kernel mode down there -- is the most important line in systems programming.
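To make the trap a little more visible, the sketch below skips the usual wrappers and goes through glibc's generic syscall() entry point with a raw syscall number. It assumes Linux with glibc; raw_getpid and raw_write are hypothetical names, while syscall(), SYS_getpid, and SYS_write come from the system headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our pid without the getpid()
 * wrapper, using the raw syscall number (Linux-specific). */
pid_t raw_getpid(void) {
    return (pid_t)syscall(SYS_getpid);
}

/* Hypothetical helper: write n bytes to fd via the raw syscall number.
 * Returns what the wrapper reports: bytes written, or -1 with errno set. */
long raw_write(int fd, const void *buf, size_t n) {
    return syscall(SYS_write, fd, buf, n);
}
```

raw_getpid() should agree with getpid(), and raw_write(1, "hi\n", 3) should behave like write(1, "hi\n", 3). The named wrappers are thin C functions around the same trap, which is why they look like ordinary calls.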
Why It Matters Here
Almost every concept in the rest of this module is a named way to cross the boundary. fork creates a process. open gets the kernel to hand you a file descriptor. mmap asks it to install a new mapping into your address space. pthread_create asks it (indirectly) to schedule another thread. If you do not see the boundary clearly, the syscalls look like ordinary C functions and their failure modes -- EINTR, EAGAIN, partial reads, signals -- look like bugs in your own code.
The kernel boundary is also where safety comes from. Your process cannot accidentally corrupt another process's memory because it literally cannot address it. Every access to anything outside your address space has to go through a syscall, where the kernel checks permissions.
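A small experiment can make the isolation concrete. In this sketch (POSIX assumed, helper name hypothetical), a forked child overwrites a variable, and the parent then checks its own copy.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of a variable after a
 * forked child has overwritten its own copy, or -1 on error. */
int value_after_child_write(void) {
    volatile int value = 1;
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        value = 99;               /* changes only the child's address space */
        _exit(value == 99 ? 0 : 1);
    }

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                 /* still 1 in the parent */
}
```

The child's store succeeds, yet the parent still reads 1. The two processes use the same virtual address for value, but the kernel maps that address to different physical memory in each.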
Concrete Example
Here is a minimal C program that prints "hello" by going through a syscall. printf is a library function that eventually calls write(1, ...), which is the syscall:
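#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Syscall wrapper: send 6 bytes to fd 1 (stdout). */
    write(1, "hello\n", 6);
    /* Library call: formats in user space, then reaches write(). */
    printf("my pid is %d, my parent is %d\n", (int)getpid(), (int)getppid());
    return 0;
}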
Compile and run with strace to see the boundary crossings:
$ cc -o hello hello.c
$ strace -e trace=write,getpid,getppid ./hello
write(1, "hello\n", 6) = 6
getpid() = 12345
getppid() = 12340
write(1, "my pid is 12345, ...\n", 36) = 36
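Each line in the strace output is one boundary crossing: write, getpid, getppid, then write again. The two pid calls come before the second write because printf's arguments are evaluated before printf itself runs. Everything between those lines is pure user-mode C.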
Common Confusion / Misconception
"A function call and a syscall are the same thing." They are not. A function call jumps inside your address space; a syscall triggers a mode switch and hands control to the kernel. Syscalls are much more expensive (hundreds of cycles for the mode switch alone) and can fail in ways a normal function cannot: the process can receive a signal mid-call, the kernel can return EINTR, the operation can partially succeed (reading 3 of 5 bytes). Treating read like memcpy is where most beginner systems bugs come from.
A related trap: "printf is a syscall." No. printf is a C library function that buffers and eventually calls write, which is the syscall. That buffering is why printf without a newline may not appear until you fflush or the program exits.
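The buffering can be observed directly. The sketch below (POSIX assumed, helper names hypothetical) writes through stdio into a pipe and uses poll with a zero timeout to ask whether any bytes have reached the kernel, once before fflush and once after.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fdopen() under strict C modes */
#endif
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd has data ready to read right now. */
static int readable_now(int fd) {
    struct pollfd p = { .fd = fd, .events = POLLIN };
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN);
}

/* Hypothetical helper: writes through stdio into a pipe and records whether
 * the bytes reached the kernel before and after fflush().
 * Returns 0 on success, -1 if setup fails. */
int observe_stdio_buffering(int *before_flush, int *after_flush) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    FILE *out = fdopen(fds[1], "w");   /* stdio stream on top of the fd */
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    fprintf(out, "no newline yet");    /* sits in the user-space buffer */
    *before_flush = readable_now(fds[0]);

    fflush(out);                       /* now stdio calls write() */
    *after_flush = readable_now(fds[0]);

    fclose(out);                       /* also closes fds[1] */
    close(fds[0]);
    return 0;
}
```

On a typical libc, before_flush comes back 0 and after_flush comes back 1. The fprintf call returned successfully without any boundary crossing; the write happened only when the buffer was flushed.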
How To Use It
When reading any C program that touches the outside world, mentally tag each line:
- Is this pure computation? It stays in user mode.
- Is it a library function (printf, malloc, fopen)? It may or may not reach the kernel.
- Is it a documented syscall (open, read, write, fork, mmap, socket)? This is a boundary crossing.
- For every boundary crossing: read the man page's RETURN VALUE and ERRORS sections. Error handling is not optional here.
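As one illustration of taking the RETURN VALUE and ERRORS sections seriously, here is a minimal sketch of a write loop that handles the two outcomes mentioned earlier: EINTR and partial writes. The name write_all is a common convention but hypothetical here; it is not a libc function.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all n bytes are written.
 * Returns 0 on success, -1 on error (errno is left as write() set it). */
int write_all(int fd, const void *buf, size_t n) {
    const char *p = buf;
    while (n > 0) {
        ssize_t written = write(fd, p, n);
        if (written < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by a signal: retry */
            return -1;            /* a real error: report it */
        }
        p += written;             /* partial write: advance and go again */
        n -= (size_t)written;
    }
    return 0;
}
```

A caller would use write_all(1, "hello\n", 6) in place of a bare write and check for -1. The loop is only a sketch: it assumes a blocking descriptor and does not treat EAGAIN specially.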
Check Yourself
- Why can one process not directly write into another process's memory, even if both are the same user?
- What does it mean for write to return 3 when you asked to write 10 bytes?
- Why is getpid() a syscall in principle but fast in practice?
Mini Drill or Application
Take the hello.c program above. Do all four:
- Run strace -c ./hello and note which syscalls appear before main runs. Most of them are the dynamic linker. That is normal.
- Replace printf with write(1, ...) directly and rerun strace. Count the boundary crossings. It should drop.
- Add a call to open("/etc/hostname", O_RDONLY) and check the errno path on failure. Print it with perror.
- Explain, in one sentence, why swapping printf for write can change behavior when stdout is redirected to a pipe.
Read This Only If Stuck
- K&R 8.1: File Descriptors
- K&R 8.2: Low-Level I/O -- Read and Write
- Code: The Operating System (Part 1)
- Man page: man 2 intro -- overview of Linux system calls
- Man page: man 7 process-keywords -- process identity and credentials