File I/O Fundamentals: read/write, Buffers, and the Cost of Syscalls
Learn the practical mechanics of file I/O: file descriptors, syscalls, buffering, partial reads/writes, and robust patterns with C, Zig, and Rust.
File I/O looks simple: call read() and write(). In practice, robust and fast file I/O requires understanding:
- file descriptors and offsets
- partial reads/writes
- buffering (user-space and kernel-space)
- syscall overhead and batching
1) File descriptors and offsets
On POSIX systems, opening a file gives you a file descriptor (fd): a small integer indexing a per-process table.
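An open file has state, including a current offset (where the next read()/write() happens); pread()/pwrite() take an explicit offset instead and leave the current one untouched. Strictly speaking, the offset belongs to the kernel's open file description rather than to the descriptor number, so descriptors created with dup() or inherited across fork() share a single offset.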
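To make the offset behaviour concrete, here is a minimal C sketch. The function names dup_shares_offset and open_twice_independent are hypothetical helpers made up for this illustration, and the file at path is assumed to be a readable regular file of at least 4 bytes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a dup()'d descriptor sees the offset advanced through the
 * original descriptor, 0 if it does not, -1 on error. */
int dup_shares_offset(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    int fd2 = dup(fd);                 /* same open file description */
    if (fd2 < 0) { close(fd); return -1; }

    char tmp[4];
    int result = -1;
    if (read(fd, tmp, sizeof tmp) == (ssize_t)sizeof tmp) {
        off_t pos = lseek(fd2, 0, SEEK_CUR);   /* query offset via the copy */
        if (pos >= 0) result = (pos == (off_t)sizeof tmp);
    }
    close(fd2);
    close(fd);
    return result;
}

/* Returns 1 if two separate open() calls keep separate offsets,
 * 0 if they do not, -1 on error. */
int open_twice_independent(const char *path) {
    int a = open(path, O_RDONLY);
    if (a < 0) return -1;
    int b = open(path, O_RDONLY);      /* a second open file description */
    if (b < 0) { close(a); return -1; }

    char tmp[4];
    int result = -1;
    if (read(a, tmp, sizeof tmp) == (ssize_t)sizeof tmp) {
        off_t pos = lseek(b, 0, SEEK_CUR);
        if (pos >= 0) result = (pos == 0);     /* b has not moved */
    }
    close(b);
    close(a);
    return result;
}
```

On a POSIX system both functions are expected to return 1: reading through one descriptor moves the offset seen through its dup() copy, while a second open() of the same path starts at offset 0 and stays there.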
2) Partial reads/writes are normal
- read(fd, buf, n) may return fewer than n bytes.
- write(fd, buf, n) may write fewer than n bytes.
This is common for:
- pipes and sockets
- non-blocking I/O
- signals interrupting syscalls
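Even for regular files, you should code defensively: a read() near end-of-file returns only the bytes that remain, and a write() can be cut short when the disk fills up or a resource limit is reached.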
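The read side of that defensive style can be sketched as a small helper that keeps calling read() until it has the requested number of bytes or reaches end-of-file. The name read_full is a hypothetical helper for this illustration, not a standard function.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to n bytes, looping over short reads.
 * Returns the number of bytes read (less than n only at EOF),
 * or -1 on error with errno set by read(). */
ssize_t read_full(int fd, void *buf, size_t n) {
    unsigned char *p = buf;
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, p + got, n - got);
        if (r == 0) break;                 /* EOF: return what we have */
        if (r < 0) {
            if (errno == EINTR) continue;  /* interrupted, nothing read */
            return -1;
        }
        got += (size_t)r;                  /* short read: keep going */
    }
    return (ssize_t)got;
}
```

A caller that needs exactly n bytes (a fixed-size header, say) can treat any return value other than n as a truncated input. Note that on a non-blocking descriptor this sketch returns -1 with EAGAIN rather than waiting.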
3) A robust “copy file” loop
C: copy using read/write with retry
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Write all n bytes, retrying after partial writes and EINTR.
static int write_all(int fd, const unsigned char *buf, size_t n) {
    size_t off = 0;
    while (off < n) {
        ssize_t w = write(fd, buf + off, n - off);
        if (w < 0) {
            if (errno == EINTR) continue; // interrupted before any data moved
            return -1;
        }
        off += (size_t)w; // partial write: advance and try again
    }
    return 0;
}

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
        return 2;
    }
    int in = open(argv[1], O_RDONLY);
    if (in < 0) {
        fprintf(stderr, "open src: %s\n", strerror(errno));
        return 1;
    }
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        fprintf(stderr, "open dst: %s\n", strerror(errno));
        close(in);
        return 1;
    }

    unsigned char buf[64 * 1024];
    int status = 0; // becomes 1 if any read, write, or close fails
    for (;;) {
        ssize_t r = read(in, buf, sizeof(buf));
        if (r == 0) break; // EOF
        if (r < 0) {
            if (errno == EINTR) continue;
            fprintf(stderr, "read: %s\n", strerror(errno));
            status = 1;
            break;
        }
        if (write_all(out, buf, (size_t)r) != 0) {
            fprintf(stderr, "write: %s\n", strerror(errno));
            status = 1;
            break;
        }
    }

    // close() can report a deferred write error, so check it for the output.
    if (close(out) != 0) {
        fprintf(stderr, "close dst: %s\n", strerror(errno));
        status = 1;
    }
    close(in);
    return status;
}
Key points:
- Buffer size is a trade-off: too small → many syscalls; too big → cache pressure.
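- EINTR handling matters: a read() or write() interrupted by a signal before any data is transferred fails with errno set to EINTR and should simply be retried.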
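The buffer-size trade-off is easiest to see with many tiny writes. The following is a minimal sketch of a user-space write buffer that batches them into fewer write() calls. The names bufwriter, bw_init, bw_write, bw_flush and the 4096-byte BW_CAP are hypothetical choices for this illustration, and the syscalls counter exists only to make the effect visible.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { BW_CAP = 4096 };          /* assumed capacity, tune for your workload */

struct bufwriter {
    int fd;
    size_t len;                  /* bytes currently buffered */
    unsigned long syscalls;      /* successful write() calls so far */
    unsigned char buf[BW_CAP];
};

void bw_init(struct bufwriter *bw, int fd) {
    bw->fd = fd;
    bw->len = 0;
    bw->syscalls = 0;
}

/* Push buffered bytes to the kernel. Returns 0 on success, -1 on error
 * (any unwritten bytes stay at the front of the buffer). */
int bw_flush(struct bufwriter *bw) {
    size_t off = 0;
    while (off < bw->len) {
        ssize_t w = write(bw->fd, bw->buf + off, bw->len - off);
        if (w < 0) {
            if (errno == EINTR) continue;
            memmove(bw->buf, bw->buf + off, bw->len - off);
            bw->len -= off;
            return -1;
        }
        bw->syscalls++;
        off += (size_t)w;
    }
    bw->len = 0;
    return 0;
}

/* Copy data into the buffer, flushing only when it is full. */
int bw_write(struct bufwriter *bw, const void *data, size_t n) {
    const unsigned char *p = data;
    while (n > 0) {
        if (bw->len == BW_CAP && bw_flush(bw) != 0) return -1;
        size_t room = BW_CAP - bw->len;
        size_t take = n < room ? n : room;
        memcpy(bw->buf + bw->len, p, take);
        bw->len += take;
        p += take;
        n -= take;
    }
    return 0;
}
```

With this sketch, 10,000 one-byte bw_write() calls followed by a bw_flush() reach the kernel in roughly three write() calls instead of 10,000. The caller must remember the final bw_flush(), otherwise the tail of the data never leaves user space.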
Zig: buffered copying
Zig’s standard library has buffered I/O abstractions that reduce syscalls.
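const std = @import("std");

// Written against the std.io API of Zig 0.11 through 0.14. std.io was
// substantially reworked in later releases, so newer compilers may need a
// different reader/writer setup.
pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const a = gpa.allocator();

    const args = try std.process.argsAlloc(a);
    defer std.process.argsFree(a, args);
    if (args.len != 3) return error.InvalidArgs;
    const src_path = args[1];
    const dst_path = args[2];

    const cwd = std.fs.cwd();
    const src = try cwd.openFile(src_path, .{});
    defer src.close();
    const dst = try cwd.createFile(dst_path, .{ .truncate = true });
    defer dst.close();

    // Wrap both files so small reads and writes are batched in user space.
    var br = std.io.bufferedReader(src.reader());
    var bw = std.io.bufferedWriter(dst.writer());
    const r = br.reader();
    const w = bw.writer();

    // Copy in chunks until read() reports end-of-file (0 bytes).
    var buf: [4096]u8 = undefined;
    while (true) {
        const n = try r.read(&buf);
        if (n == 0) break;
        try w.writeAll(buf[0..n]);
    }

    // Buffered bytes are not written until flush() succeeds.
    try bw.flush();
}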
Rust: copy with std::io
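use std::env;
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};

fn main() -> io::Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: {} <src> <dst>", args[0]);
        std::process::exit(2);
    }

    let src = File::open(&args[1])?;
    let dst = File::create(&args[2])?; // creates or truncates the destination

    // The wrappers pay off for many small reads or writes. With a 64 KiB
    // chunk buffer, requests this large mostly bypass their internal buffers.
    let mut r = BufReader::new(src);
    let mut w = BufWriter::new(dst);

    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = r.read(&mut buf)?; // may return fewer bytes than buf.len()
        if n == 0 {
            break; // EOF
        }
        w.write_all(&buf[..n])?; // loops internally over partial writes
    }

    // Flush explicitly so a write error is reported instead of being lost on drop.
    w.flush()?;
    Ok(())
}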
4) When to use pread/pwrite
pread/pwrite take the file position as an explicit argument, so they neither depend on nor modify the shared file offset. That makes them a good fit for concurrent access patterns.
- Multiple threads can read different parts of the same file without locking around a shared offset.
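As a rough illustration of positional reads, the sketch below loops over pread() the same way a robust read() loop would, but passes the position explicitly. The name pread_full is a hypothetical helper, not a standard function.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to n bytes starting at `offset`, without touching the
 * descriptor's current offset. Returns the number of bytes read
 * (less than n only at EOF), or -1 on error. */
ssize_t pread_full(int fd, void *buf, size_t n, off_t offset) {
    unsigned char *p = buf;
    size_t got = 0;
    while (got < n) {
        ssize_t r = pread(fd, p + got, n - got, offset + (off_t)got);
        if (r == 0) break;                 /* EOF */
        if (r < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        got += (size_t)r;                  /* short read: continue further in */
    }
    return (ssize_t)got;
}
```

Because every call carries its own position, two threads can call pread_full() on the same descriptor for different regions without coordinating. pread() only works on seekable files; on a pipe or socket it fails with ESPIPE.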
5) Advanced options (preview)
- mmap: map file into memory (great for random reads, but page-fault patterns matter).
- sendfile/copy_file_range: kernel-assisted copies.
- O_DIRECT: bypass page cache (specialized, easy to misuse).
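As a small taste of the first option, here is a minimal mmap sketch that counts newline bytes in a file without any read() calls. The function name count_newlines_mmap is hypothetical, and the sketch assumes the whole file fits comfortably in the address space.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the number of '\n' bytes in the file, or -1 on error. */
long count_newlines_mmap(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  /* mmap rejects length 0 */

    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED) return -1;

    long count = 0;
    const unsigned char *p = map;
    const unsigned char *end = p + len;
    /* Each access may trigger a page fault that pulls data in on demand. */
    while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        count++;
        p++;
    }

    munmap(map, len);
    return count;
}
```

The data is fetched by page faults rather than explicit syscalls, which is why access patterns matter: a sequential scan like this one is readahead-friendly, while scattered accesses to a cold file pay for a fault on each new page.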
References
- read(2): https://man7.org/linux/man-pages/man2/read.2.html
- write(2): https://man7.org/linux/man-pages/man2/write.2.html
- open(2): https://man7.org/linux/man-pages/man2/open.2.html
- pread(2): https://man7.org/linux/man-pages/man2/pread.2.html
- “Linux Performance” (Brendan Gregg): http://www.brendangregg.com/linuxperf.html