Virtual Memory Foundations: Address Spaces, Paging, and Protection
Virtual memory is the contract between your program and the OS: you get a private address space that looks contiguous and huge, while the OS maps that illusion onto real physical memory and files.
This post builds a mental model you can carry into debugging crashes, tuning performance, and understanding tools like sanitizers and profilers.
1) What “virtual” means
Every pointer in your program is a virtual address.
- The CPU translates virtual addresses to physical addresses using hardware called the MMU (Memory Management Unit).
- Translation happens at page granularity (commonly 4 KiB, sometimes 2 MiB “huge pages”).
- The OS defines permissions (read/write/execute) for each page.
The result:
- Isolation: by default, one process cannot read/write another process’s memory.
- Convenient layout: code, heap, stack, shared libraries, mapped files.
- Demand paging: memory can be allocated lazily; pages are loaded when touched.
2) Pages, page tables, and the TLB
A page is a fixed-size block of memory.
The OS maintains per-process page tables mapping:
- virtual page number → physical frame number
- plus flags: Present, Read/Write, User/Supervisor, NX (no-execute), etc.
To avoid walking page tables for every load/store, CPUs cache translations in the TLB (Translation Lookaside Buffer).
Performance implication:
- Random access with poor locality can cause more TLB misses.
- Huge pages can reduce TLB pressure (fewer entries needed).
3) Page faults: the good kind and the bad kind
A page fault occurs when the CPU can’t translate a virtual address with the required permissions.
Common cases:
- Demand paging (good): the page exists conceptually but isn’t loaded yet.
- Copy-on-write (good): fork() shares pages read-only until a write.
- Protection fault (bad): touching unmapped memory or violating permissions.
You typically see protection faults as:
- SIGSEGV (segmentation fault)
- SIGBUS (bad memory access, e.g. touching a file-backed mapping past the end of the file)
4) Memory mapping: heap vs mmap
Most allocators use a combination of:
- brk/sbrk for extending the traditional heap
- mmap for larger allocations and arenas
Files can also be mapped with mmap, turning file I/O into page faults and cache effects.
5) Practical experiment: force a fault
Below are small experiments showing:
- reading/writing mapped pages
- triggering faults by accessing unmapped memory
C: allocate a page, change permissions, then fault
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);

    // Map one anonymous, private, read-write page.
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap failed: %s\n", strerror(errno));
        return 1;
    }

    ((char *)p)[0] = 'A';  // Allowed: the page is writable.

    // Drop write permission on the page.
    if (mprotect(p, (size_t)page, PROT_READ) != 0) {
        fprintf(stderr, "mprotect failed: %s\n", strerror(errno));
        return 1;
    }

    // This write should crash with SIGSEGV on most systems.
    ((char *)p)[0] = 'B';

    // Not reached if the write above faults.
    munmap(p, (size_t)page);
    return 0;
}
What to observe:
- The first write works.
- After mprotect(PROT_READ), a write triggers a protection fault.
Zig: mmap and mprotect
// Written against Zig 0.11 on Linux; newer releases expose these calls
// under std.posix and use a struct for the mmap flags.
const std = @import("std");
const os = std.os;

pub fn main() !void {
    const page = std.mem.page_size;

    // mmap returns a page-aligned byte slice.
    const p = try os.mmap(
        null,
        page,
        os.PROT.READ | os.PROT.WRITE,
        os.MAP.PRIVATE | os.MAP.ANONYMOUS,
        -1,
        0,
    );
    defer os.munmap(p);

    p[0] = 'A';

    // mprotect takes the slice, so there is no separate length argument.
    try os.mprotect(p, os.PROT.READ);

    // Typically triggers SIGSEGV.
    p[0] = 'B';
}
Rust: map memory and trigger a protection fault
// Requires the `libc` crate as a dependency.
use std::io;

fn main() -> io::Result<()> {
    unsafe {
        let page = libc::sysconf(libc::_SC_PAGESIZE) as usize;

        // Map one anonymous, private, read-write page.
        let p = libc::mmap(
            std::ptr::null_mut(),
            page,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANON,
            -1,
            0,
        );
        if p == libc::MAP_FAILED {
            return Err(io::Error::last_os_error());
        }

        let bytes = p as *mut u8;
        *bytes = b'A';

        // Drop write permission on the page.
        if libc::mprotect(p, page, libc::PROT_READ) != 0 {
            return Err(io::Error::last_os_error());
        }

        // Typically crashes with SIGSEGV.
        *bytes = b'B';

        // Not reached if the write above faults.
        libc::munmap(p, page);
    }
    Ok(())
}
6) Debugging and measurement tips
- Inspect mappings: cat /proc/<pid>/maps (Linux) shows memory regions and permissions.
- Count faults: perf stat -e page-faults,minor-faults,major-faults ...
- mmap vs read/write: mmap avoids copying through a user buffer, but every first touch of a page costs a fault, so the access pattern matters.
7) Common pitfalls
- Assuming pointers are “real addresses”: they are only valid in your process context.
- Ignoring page size: alignment and mapping sizes should usually be multiples of page size.
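- Confusing SIGSEGV and SIGBUS: both are memory access signals, but SIGSEGV usually means an unmapped address or a permission violation, while SIGBUS usually means the mapping exists but the access cannot be completed (for example, a mapped file that is shorter than the mapping).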
References
- Linux mmap(2): https://man7.org/linux/man-pages/man2/mmap.2.html
- Linux mprotect(2): https://man7.org/linux/man-pages/man2/mprotect.2.html
- “What Every Programmer Should Know About Memory” (Ulrich Drepper): https://people.freebsd.org/~lstewart/articles/cpumemory.pdf
- Intel® 64 and IA-32 Architectures Software Developer’s Manual (paging/TLB): https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html