Virtual Memory Foundations: Address Spaces, Paging, and Protection

A detailed, practical walkthrough of virtual memory: pages, page tables, permissions, and what happens during a page fault, with C, Zig, and Rust...

Virtual memory is the contract between your program and the OS: you get a private address space that looks contiguous and huge, while the OS maps that illusion onto real physical memory and files.

This post builds a mental model you can carry into debugging crashes, tuning performance, and understanding tools like sanitizers and profilers.

1) What “virtual” means

Every pointer in your program is a virtual address.

  • The CPU translates virtual addresses to physical addresses using hardware called the MMU (Memory Management Unit).
  • Translation happens at page granularity (commonly 4 KiB, sometimes 2 MiB “huge pages”).
  • The OS defines permissions (read/write/execute) for each page.

The result:

  • Isolation: one process cannot read or write another process’s memory unless the OS has explicitly set up sharing.
  • Convenient layout: code, heap, stack, shared libraries, mapped files.
  • Demand paging: memory can be allocated lazily; pages are loaded when touched.
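To make "translation happens at page granularity" concrete, here is a minimal sketch in C that splits an address into a virtual page number and an offset within the page. The names va_parts, split_address, and split_pointer are made up for this illustration; only sysconf(_SC_PAGESIZE) is a real API. The MMU translates the page number; the offset passes through unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper type: a virtual address split into its two parts. */
struct va_parts {
    uintptr_t page_number; /* which virtual page the address falls in */
    uintptr_t offset;      /* byte offset inside that page */
};

/* Split an address using a given page size (must be a power of two). */
static struct va_parts split_address(uintptr_t addr, uintptr_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    struct va_parts parts;
    parts.page_number = addr / page_size;
    parts.offset = addr & (page_size - 1);
    return parts;
}

/* Convenience wrapper that asks the OS for the actual page size. */
static struct va_parts split_pointer(const void *p) {
    return split_address((uintptr_t)p, (uintptr_t)sysconf(_SC_PAGESIZE));
}
```

With 4 KiB pages, split_address(0x12345, 4096) yields page number 0x12 and offset 0x345. Two pointers with the same page number share one translation and one set of permissions.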

2) Pages, page tables, and the TLB

A page is a fixed-size block of memory.

The OS maintains per-process page tables mapping:

  • virtual page number → physical frame number
  • plus flags: Present, Read/Write, User/Supervisor, NX (no-execute), etc.

To avoid walking page tables for every load/store, CPUs cache translations in the TLB (Translation Lookaside Buffer).

Performance implication:

  • Random access with poor locality can cause more TLB misses.
  • Huge pages can reduce TLB pressure (fewer entries needed).
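The mapping and flags described above can be sketched as a toy, single-level page table in C. This is an illustration only: real page tables are multi-level and their bit layouts are defined by the hardware, and every name here (toy_pte, toy_translate, the PTE_* bits) is an assumption made for the example. It models only the Present, Write, and NX checks. A false return stands in for the page fault that real hardware would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u                    /* 4 KiB pages */
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NUM_PAGES  16u                    /* tiny 64 KiB address space */

/* Flag bits in the spirit of real PTE flags; this layout is made up. */
enum { PTE_PRESENT = 1u << 0, PTE_WRITE = 1u << 1, PTE_NX = 1u << 2 };

struct toy_pte {
    uint32_t frame;  /* physical frame number */
    uint32_t flags;  /* combination of PTE_* bits */
};

enum toy_access { ACCESS_READ, ACCESS_WRITE, ACCESS_EXEC };

/* Translate vaddr for the given access type. Returns false where real
 * hardware would raise a page fault; otherwise stores the physical address. */
static bool toy_translate(const struct toy_pte *table, uint32_t vaddr,
                          enum toy_access access, uint32_t *paddr_out) {
    assert(table != NULL && paddr_out != NULL);
    uint32_t vpn = vaddr >> TOY_PAGE_SHIFT;          /* virtual page number */
    if (vpn >= TOY_NUM_PAGES) return false;          /* outside address space */
    const struct toy_pte *pte = &table[vpn];
    if (!(pte->flags & PTE_PRESENT)) return false;   /* not mapped / not loaded */
    if (access == ACCESS_WRITE && !(pte->flags & PTE_WRITE)) return false;
    if (access == ACCESS_EXEC && (pte->flags & PTE_NX)) return false;
    *paddr_out = (pte->frame << TOY_PAGE_SHIFT) | (vaddr & (TOY_PAGE_SIZE - 1));
    return true;
}
```

A TLB would sit in front of toy_translate as a small cache keyed by the virtual page number, which is why touching many distinct pages costs more than touching many bytes within one page.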

3) Page faults: the good kind and the bad kind

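A page fault occurs when the CPU can’t translate a virtual address with the required permissions. The CPU traps into the kernel, which either repairs the mapping and resumes the instruction or delivers a signal to the process.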

Common cases:

  • Demand paging (good): the page exists conceptually but isn’t loaded yet.
  • Copy-on-write (good): fork() shares pages read-only until a write.
  • Protection fault (bad): touching unmapped memory or violating permissions.

You typically see protection faults as:

  • SIGSEGV (segmentation fault)
  • SIGBUS (bus error, e.g. touching a page of a file mapping that lies beyond the end of the file, or a misaligned access on some architectures)
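The "good" faults are easy to observe. The sketch below, assuming a POSIX system that fills in ru_minflt in getrusage, maps anonymous memory, touches each page once, and reports how many minor faults the touching caused. The helper names minor_faults_so_far and faults_from_first_touch are invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor (no disk I/O) page faults taken by this process so far. */
static long minor_faults_so_far(void) {
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_minflt;
}

/* Map npages of anonymous memory, write one byte to each page, and return
 * the number of minor faults observed while doing so (-1 if mmap fails). */
static long faults_from_first_touch(size_t npages) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    void *m = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED) return -1;

    volatile unsigned char *p = m;  /* volatile keeps the stores from being elided */
    long before = minor_faults_so_far();
    for (size_t i = 0; i < npages; i++) {
        p[i * page] = 1;            /* first write to each page */
    }
    long after = minor_faults_so_far();

    munmap(m, len);
    return after - before;
}
```

On a typical Linux system with 4 KiB pages, faults_from_first_touch(64) should report roughly one fault per page, though the exact number depends on the kernel and on transparent huge page settings. The mmap call itself is cheap; the cost arrives when the pages are first touched.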

4) Memory mapping: heap vs mmap

Most allocators use a combination of:

  • brk/sbrk for extending the traditional heap
  • mmap for larger allocations and arenas

Files can also be mapped with mmap, turning file I/O into page faults and cache effects.
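As a rough sketch of file mapping, the C function below writes a string to a temporary file, maps the file read-only, and reads the bytes back through the mapping with ordinary memory loads. The function name roundtrip_through_mmap and the temp-file template are assumptions for this example; mkstemp, mmap, and munmap are standard POSIX calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `text` to a temporary file, map the file read-only, and copy the
 * mapped bytes into `out` (which must hold strlen(text) + 1 bytes).
 * Returns 0 on success, -1 on any failure. */
static int roundtrip_through_mmap(const char *text, char *out) {
    size_t len = strlen(text);
    assert(len > 0);                        /* mmap rejects zero-length mappings */

    char path[] = "/tmp/vm-demo-XXXXXX";    /* hypothetical temp-file template */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                           /* file persists until fd is closed */

    int rc = -1;
    if (write(fd, text, len) == (ssize_t)len) {
        void *m = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (m != MAP_FAILED) {
            memcpy(out, m, len);            /* plain loads; the kernel supplies the file's pages */
            out[len] = '\0';
            munmap(m, len);
            rc = 0;
        }
    }
    close(fd);
    return rc;
}
```

No read() call appears on the mapped path: the first load from the mapping faults, the kernel backs the page with the file's cached contents, and the load completes.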

5) Practical experiment: force a fault

Below are small experiments showing:

  • writing to a freshly mapped, writable page
  • triggering a protection fault by writing to the same page after it has been made read-only

C: allocate a page, change permissions, then fault

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap failed: %s\n", strerror(errno));
        return 1;
    }

    ((char*)p)[0] = 'A';

    if (mprotect(p, (size_t)page, PROT_READ) != 0) {
        fprintf(stderr, "mprotect failed: %s\n", strerror(errno));
        return 1;
    }

    // This write should crash with SIGSEGV on most systems.
    ((char*)p)[0] = 'B';

    // Not reached if the write above faults.
    munmap(p, (size_t)page);
    return 0;
}

What to observe:

  • The first write works.
  • After mprotect(PROT_READ), a write triggers a protection fault.
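To see the fault rather than just crash, a process can install a SIGSEGV handler and inspect the faulting address. The following is a sketch under assumptions: it targets Linux-like systems, the names on_segv and readonly_write_faults are invented here, and jumping out of a fault handler with siglongjmp is a demonstration technique, not a general recovery strategy.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_escape;        /* where to resume after the fault */
static void *volatile fault_address;   /* si_addr reported by the kernel */

static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    fault_address = info->si_addr;
    siglongjmp(fault_escape, 1);       /* skip the faulting store */
}

/* Returns 1 if writing to a read-only page raised SIGSEGV at that page's
 * address, 0 if the write unexpectedly succeeded, -1 on setup failure. */
static int readonly_write_faults(void) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *m = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED) return -1;
    volatile char *p = m;
    p[0] = 'A';                                        /* allowed: page is writable */
    if (mprotect(m, page, PROT_READ) != 0) { munmap(m, page); return -1; }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;                          /* request siginfo_t with si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) { munmap(m, page); return -1; }

    int result;
    if (sigsetjmp(fault_escape, 1) == 0) {
        p[0] = 'B';                                    /* should fault */
        result = 0;
    } else {
        assert(p[0] == 'A');                           /* the store never landed */
        result = (fault_address == m) ? 1 : 0;
    }
    sigaction(SIGSEGV, &old, NULL);                    /* restore previous handler */
    munmap(m, page);
    return result;
}
```

On Linux, readonly_write_faults() is expected to return 1: the kernel reports the exact address of the rejected store, which is the same information a debugger or sanitizer uses to describe a crash.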

Zig: mmap and mprotect

const std = @import("std");
const os = std.os;

// Written against the Zig 0.11-era std.os API. Later releases moved these
// calls to std.posix with slightly different signatures, so adjust as needed.
pub fn main() !void {
    const page = std.mem.page_size;

    // mmap returns a page-aligned slice, not a raw pointer.
    const p = try os.mmap(
        null,
        page,
        os.PROT.READ | os.PROT.WRITE,
        os.MAP.PRIVATE | os.MAP.ANONYMOUS,
        -1,
        0,
    );
    defer os.munmap(p);

    p[0] = 'A';

    // mprotect takes the slice itself, so no separate length is passed.
    try os.mprotect(p, os.PROT.READ);

    // Typically triggers SIGSEGV.
    p[0] = 'B';
}

Rust: map memory and trigger a protection fault

// Requires the `libc` crate as a dependency in Cargo.toml.
use std::io;

fn main() -> io::Result<()> {
    unsafe {
        let page = libc::sysconf(libc::_SC_PAGESIZE) as usize;

        let p = libc::mmap(
            std::ptr::null_mut(),
            page,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANON,
            -1,
            0,
        );
        if p == libc::MAP_FAILED {
            return Err(io::Error::last_os_error());
        }

        let bytes = p as *mut u8;
        *bytes = b'A';

        if libc::mprotect(p, page, libc::PROT_READ) != 0 {
            return Err(io::Error::last_os_error());
        }

        // Typically crashes with SIGSEGV.
        *bytes = b'B';

        libc::munmap(p, page);
    }

    Ok(())
}

6) Debugging and measurement tips

  • Inspect mappings: cat /proc/<pid>/maps (Linux) shows memory regions and permissions.
  • Count faults: perf stat -e page-faults,minor-faults,major-faults ....
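  • mmap vs read/write: mmap avoids copying data into a separate user buffer, but the first touch of every page costs a fault, so measure fault counts and access patterns before assuming it is faster.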
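The same information that cat /proc/<pid>/maps shows can be read from inside a program. This Linux-only sketch scans /proc/self/maps for the region containing a given address and returns its permission string. The function name region_perms is an assumption; the line layout (start-end, permissions, then other fields) follows the format that file uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Look up `addr` in /proc/self/maps (Linux only) and copy the region's
 * 4-character permission string, e.g. "rw-p", into perms (5 bytes).
 * Returns 1 if found, 0 if no region contains addr, -1 if the file
 * cannot be opened. */
static int region_perms(const void *addr, char perms[5]) {
    assert(addr != NULL && perms != NULL);
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        uintptr_t start, end;
        char p[5];
        /* Each line begins: start-end perms offset dev inode [path] */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &start, &end, p) != 3)
            continue;                       /* skip anything that does not parse */
        if (target >= start && target < end) {
            memcpy(perms, p, 5);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling region_perms on the address of a local variable should report a readable and writable region (the stack), while the address of a function typically falls in a region marked executable and not writable.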

7) Common pitfalls

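  • Assuming pointers are “real addresses”: they are virtual addresses, valid only within your process; the same numeric value in another process refers to different memory or to nothing at all.
  • Ignoring page size: mappings are made in whole pages, and calls such as mprotect expect a page-aligned start address, so round addresses down and lengths up to the page size.
  • Confusing SIGSEGV and SIGBUS: SIGSEGV usually means the address is unmapped or the access violates page permissions; SIGBUS usually means the mapping exists but the access cannot be satisfied, for example because the backing file was truncated.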
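The page-size pitfall can be demonstrated directly. In this sketch, assuming Linux behaviour where mprotect rejects a start address that is not page-aligned, the hypothetical helper try_unaligned_mprotect calls mprotect once on an address in the middle of a page and once on the rounded-down page start, and records the outcome of each call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page (page_size is a power of two). */
static void *page_start(void *addr, size_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* Call mprotect on a mid-page address, then on the aligned page start.
 * Each output receives 0 on success or the errno value on failure.
 * Returns 0 normally, -1 if the initial mmap fails. */
static int try_unaligned_mprotect(int *unaligned_err, int *aligned_err) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    char *middle = p + 100;   /* deliberately not page-aligned */
    *unaligned_err = (mprotect(middle, 1, PROT_READ) != 0) ? errno : 0;
    *aligned_err =
        (mprotect(page_start(middle, page), page, PROT_READ) != 0) ? errno : 0;

    munmap(p, page);
    return 0;
}
```

On Linux the first call is expected to fail with EINVAL and the second to succeed. Protection always applies to whole pages, so changing permissions for one byte changes them for every byte that shares its page.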
