Pipes

Paired read/write ends for inter-process communication. The primary mechanism for composing bins into pipelines.

Relationship to Fileservers

Decision: pipes are NOT fileservers. A fileserver implements the 9-method protocol and is mounted at a namespace path. A pipe is an internal kernel primitive — it’s never mounted, never opened by path, never stat’d. It produces two PipeFd entries directly, one for reading and one for writing. Making pipes conform to the fileserver protocol would add overhead and complexity for no benefit — nothing ever needs to open("/pipe") or readdir a pipe.

A pipe is created by the kernel’s pipe() call, which returns two Fd entries:

kernel.pipe(): [Fd, Fd] // [readEnd, writeEnd]
// Read end: PipeFd { kind: 'pipe', pipe: state, mode: 'r' }
// Write end: PipeFd { kind: 'pipe', pipe: state, mode: 'w' }

The Fd type is a discriminated union (FileserverFd | PipeFd). The kernel dispatches on fd.kind: 'server' routes to fileserver methods, and 'pipe' routes to pipe read/write logic. TypeScript narrows correctly on the kind field.
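The dispatch described above might look roughly like the sketch below. The `path` field on FileserverFd, the reduced pipe state, and the `describeFd` helper are assumptions made for illustration; the source only specifies `kind`, `pipe`, and `mode`.

```typescript
// Hypothetical shapes: only `kind`, `pipe`, and `mode` come from the design.
interface FileserverFd {
  kind: 'server'
  path: string // assumed field, for illustration only
}

interface PipeFd {
  kind: 'pipe'
  pipe: { bufferSize: number } // stand-in for the real pipe state
  mode: 'r' | 'w'
}

type Fd = FileserverFd | PipeFd

// Dispatch on the discriminant; TypeScript narrows `fd` inside each case.
function describeFd(fd: Fd): string {
  switch (fd.kind) {
    case 'server':
      return `fileserver:${fd.path}`
    case 'pipe':
      return `pipe:${fd.mode}:${fd.pipe.bufferSize}`
    default: {
      // Compile-time exhaustiveness check: a new kind makes this line fail.
      const unreachable: never = fd
      throw new Error(`unknown fd kind: ${JSON.stringify(unreachable)}`)
    }
  }
}
```

The `never` assignment in the default branch means that adding a third Fd kind later produces a compile error until the dispatch handles it.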

Buffer

Decision: bounded buffer, 64KB default. Unbounded buffers risk runaway memory when a fast writer outpaces a slow reader. 64KB matches the default Linux pipe capacity and is large enough to avoid excessive context-switching between producer and consumer. The size can be made configurable per-pipe if needed later.

interface PipeState {
  buffer: Uint8Array[]   // queue of chunks
  bufferSize: number     // current total bytes in buffer
  capacity: number       // max bytes (default 65536)
  closed: { read: boolean, write: boolean }
  // Async coordination
  readers: Array<(data: Uint8Array) => void>  // blocked read resolvers
  writers: Array<() => void>                  // blocked write resolvers
}

The buffer is a queue of Uint8Array chunks, not a contiguous ring buffer. Chunks are enqueued whole and dequeued whole or sliced. This avoids copy overhead from ring buffer wrapping and is simpler to implement. A ring buffer would only matter at volumes that don’t apply to our use case (text processing for LLM agents, not binary streaming).
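As a rough illustration of "dequeued whole or sliced", the helper below takes up to `count` bytes from the front of a chunk queue. The name `dequeue` and the choice to return at most one chunk per call are assumptions of this sketch, not part of the design.

```typescript
// Take up to `count` bytes from the front of a chunk queue.
// Returns the bytes taken and mutates `queue` in place.
function dequeue(queue: Uint8Array[], count: number): Uint8Array {
  const head = queue[0]
  if (head === undefined || count <= 0) return new Uint8Array(0)
  if (head.length <= count) {
    queue.shift() // whole chunk fits: hand it over without copying
    return head
  }
  // Chunk is larger than requested: split it into two views over the
  // same backing memory (subarray does not copy).
  queue[0] = head.subarray(count)
  return head.subarray(0, count)
}
```

Because `subarray` creates views rather than copies, slicing a large chunk costs no extra memory; the trade-off is that the backing buffer stays alive until every view of it has been consumed.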

Backpressure

When the buffer is full, write() returns a promise that stays pending until space is available, so an awaiting writer is suspended:

write(data):
  1. If read end is closed → throw EPIPE
  2. If bufferSize + data.length <= capacity:
     - Enqueue chunk, update bufferSize
     - If any readers are blocked → wake one, deliver data
     - Return data.length
  3. Else:
     - Return a promise that resolves when space is freed
     - When reader drains enough → enqueue remaining, resolve

This gives us natural backpressure through async/await. A fast cat piped into a slow grep will have cat’s writes suspend until grep reads.
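A minimal sketch of this suspension, under simplifying assumptions: the class name `BackpressurePipe` and the non-blocking `drain()` method are hypothetical, chunks are enqueued whole rather than split, and a chunk larger than the capacity is admitted once the buffer is empty so it cannot wait forever. EPIPE handling is left out here.

```typescript
class BackpressurePipe {
  private chunks: Uint8Array[] = []
  private size = 0
  private capacity: number
  private waitingWriters: Array<() => void> = []

  constructor(capacity = 65536) {
    this.capacity = capacity
  }

  // Resolves with the number of bytes written once the chunk is enqueued.
  async write(data: Uint8Array): Promise<number> {
    // Suspend while the chunk does not fit. An oversized chunk is let
    // through when the buffer is empty (assumption of this sketch).
    while (this.size > 0 && this.size + data.length > this.capacity) {
      await new Promise<void>((resolve) => this.waitingWriters.push(resolve))
    }
    this.chunks.push(data)
    this.size += data.length
    return data.length
  }

  // Non-blocking read used only to demonstrate draining.
  drain(): Uint8Array | undefined {
    const chunk = this.chunks.shift()
    if (chunk !== undefined) {
      this.size -= chunk.length
      // Space was freed: wake every suspended writer so it can re-check.
      for (const wake of this.waitingWriters.splice(0)) wake()
    }
    return chunk
  }
}
```

Woken writers re-check the capacity condition in a loop rather than assuming space is theirs, which keeps the sketch correct even if several writers are woken at once.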

EOF Propagation

Closing the write end signals EOF to the read end:

writeEnd.close():
  1. Set closed.write = true
  2. If any readers are blocked → wake them with empty Uint8Array (EOF signal)
  3. Buffer is NOT discarded; remaining data is still readable

read(count):
  1. If buffer has data → return chunk (up to count bytes)
  2. If buffer is empty AND write end closed → return empty Uint8Array (EOF)
  3. If buffer is empty AND write end open → block (return promise)

The empty Uint8Array (length 0) is the EOF signal. Bins consuming stdin as an async iterable will see the iterator complete when they get EOF.
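The read and close rules above can be sketched as follows. `EofPipe`, `closeWrite`, and `chunksOf` are hypothetical names, capacity limits are omitted, and zero-length writes are dropped so they cannot be mistaken for EOF (an assumption; the source does not say how empty writes are treated).

```typescript
class EofPipe {
  private chunks: Uint8Array[] = []
  private writeClosed = false
  private blockedReaders: Array<() => void> = []

  write(data: Uint8Array): void {
    if (data.length === 0) return // never enqueue something that looks like EOF
    this.chunks.push(data)
    this.blockedReaders.shift()?.() // wake one blocked reader, if any
  }

  closeWrite(): void {
    this.writeClosed = true
    // Buffered data is kept; blocked readers wake up and observe EOF.
    for (const wake of this.blockedReaders.splice(0)) wake()
  }

  async read(count: number): Promise<Uint8Array> {
    // Block only while the buffer is empty and the write end is open.
    while (this.chunks.length === 0 && !this.writeClosed) {
      await new Promise<void>((resolve) => this.blockedReaders.push(resolve))
    }
    const head = this.chunks[0]
    if (head === undefined) return new Uint8Array(0) // empty and closed: EOF
    if (head.length <= count) {
      this.chunks.shift()
      return head
    }
    this.chunks[0] = head.subarray(count)
    return head.subarray(0, count)
  }
}

// An async iterable over the read end completes when EOF arrives.
async function* chunksOf(pipe: EofPipe, count = 65536): AsyncGenerator<Uint8Array> {
  for (;;) {
    const chunk = await pipe.read(count)
    if (chunk.length === 0) return
    yield chunk
  }
}
```

Note that data written before the close is still delivered in full; EOF is only reported once the buffer has been drained.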

Broken Pipe (EPIPE)

When the read end closes while the write end is still open:

readEnd.close():
  1. Set closed.read = true
  2. Discard buffer contents (no one will read them)
  3. If any writers are blocked → reject with EPIPE
  4. Future writes → throw EPIPE immediately

Decision: EPIPE throws, writer must handle or die. Following Unix convention: writing to a pipe with no reader raises SIGPIPE (which we deliver as EPIPE). The default signal handler kills the process with exit code 141 (128 + 13, the signal number of SIGPIPE). This is correct behavior: in cat hugefile | head -1, cat should die when head exits and closes its stdin, not spin writing into the void.

This is where the SIGPIPE delivery described in the kernel spec connects: when write() throws EPIPE and the bin doesn't catch it, the bin's promise rejects, and the kernel treats the unhandled rejection as the process dying.
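One way the kernel side of this could look is sketched below. `runBin`, `epipeError`, and the convention of marking the error with a `code` property are assumptions for illustration, as is the generic exit code 1 for other failures; only the 141 result for an uncaught EPIPE comes from the text above.

```typescript
// Build an error tagged the way this sketch expects (assumed convention).
function epipeError(): Error {
  return Object.assign(new Error('write to pipe with closed read end'), { code: 'EPIPE' })
}

function isEpipe(err: unknown): boolean {
  return typeof err === 'object' && err !== null && (err as { code?: unknown }).code === 'EPIPE'
}

// Hypothetical kernel wrapper: runs a bin's main function and turns an
// uncaught EPIPE into exit code 141 (128 + SIGPIPE's signal number, 13).
async function runBin(main: () => Promise<number>): Promise<number> {
  try {
    return await main()
  } catch (err) {
    if (isEpipe(err)) return 141
    return 1 // assumed generic failure code for any other uncaught error
  }
}
```

A bin that wants to survive a closed reader catches the error itself and returns its own exit code; a bin that does nothing gets the default "die with 141" behavior.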

Multiple Readers/Writers

Decision: single reader, single writer only. A pipe connects exactly two processes — one reads, one writes. This matches Unix semantics and avoids the complexity of fan-in/fan-out coordination (which reader gets which bytes? round-robin? broadcast?). Multiple writers to the same pipe are technically possible (parent and child both have the write end after spawn) but interleaving is undefined and not a supported pattern.

Fan-out is a userspace concern, not a pipe concern. When multiple consumers need the same data stream, the tee bin handles it: tee reads one pipe, writes to N outputs (files and/or stdout). This is the Unix answer — pipes are point-to-point, multiplexing is done by bins. Process substitution (<(cmd)) would also use this pattern if implemented later: the shell creates a pipe, spawns cmd writing to it, and passes the read end’s path as an argument.
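A tee-style fan-out can stay entirely in userspace, as in the sketch below. The `Sink` type and the `tee` signature are hypothetical; a real tee bin would write to file descriptors rather than callbacks.

```typescript
// A sink accepts one chunk and resolves when it has been written.
type Sink = (chunk: Uint8Array) => Promise<void>

// Read one source, copy every chunk to all sinks.
async function tee(source: AsyncIterable<Uint8Array>, sinks: Sink[]): Promise<void> {
  for await (const chunk of source) {
    // Wait for every sink before reading the next chunk, so the slowest
    // consumer paces the reader (backpressure carries through).
    await Promise.all(sinks.map((sink) => sink(chunk)))
  }
}
```

Each sink sees the full stream in order, which answers the "which reader gets which bytes?" question without adding any multi-reader logic to the pipe.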

Byte-Level vs Line-Level Buffering

Decision: byte-level buffering at the pipe layer. Line splitting is a consumer concern. The pipe deals in Uint8Array chunks. It doesn’t know or care about newlines. The convenience layer (proc.stdin as async iterable of lines) handles line splitting on top. This keeps the pipe simple and generic — it works for both text and binary data.

The convenience layer in the proc context:

// proc.stdin is an async iterable of lines (strings)
// Internally it:
//   1. Reads Uint8Array chunks from fd 0
//   2. Decodes via TextDecoder
//   3. Splits on \n, buffering partial lines
//   4. Yields complete lines

// proc.stdout.write(str) is a convenience that:
//   1. Encodes via TextEncoder
//   2. Writes Uint8Array to fd 1

This is documented here because it’s where the abstraction boundary lives, but the implementation belongs in the streams/proc context layer, not in the pipe itself.
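The line-splitting steps listed above could be implemented roughly like this. The function name `lines` is hypothetical, and the source is taken as any async iterable of chunks rather than a real fd 0.

```typescript
// Turn a stream of byte chunks into a stream of lines.
async function* lines(chunks: AsyncIterable<Uint8Array>): AsyncGenerator<string> {
  const decoder = new TextDecoder() // UTF-8 by default
  let partial = ''
  for await (const chunk of chunks) {
    // stream: true keeps a multi-byte character that is split across
    // two chunks intact instead of emitting replacement characters.
    partial += decoder.decode(chunk, { stream: true })
    const parts = partial.split('\n')
    partial = parts.pop() ?? '' // last piece is an incomplete line
    yield* parts
  }
  partial += decoder.decode() // flush any bytes still held by the decoder
  if (partial.length > 0) yield partial // final line without a trailing newline
}
```

Chunk boundaries carry no meaning at the pipe layer, so the decoder and the partial-line buffer both have to tolerate a split at any byte.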

Shell Pipeline Wiring

How the shell sets up a | b | c:

parse("a | b | c") → Pipeline AST with 3 commands

1. Create pipe1 = kernel.pipe() → [read1, write1]
2. Create pipe2 = kernel.pipe() → [read2, write2]
3. Spawn 'a':
     fd 0 = inherited stdin (from shell)
     fd 1 = write1
     fd 2 = inherited stderr
4. Spawn 'b':
     fd 0 = read1
     fd 1 = write2
     fd 2 = inherited stderr
5. Spawn 'c':
     fd 0 = read2
     fd 1 = inherited stdout (from shell)
     fd 2 = inherited stderr
6. Close shell's copies of pipe fds:
     close write1, read1, write2, read2
     (The shell doesn't need them; each process has its own copy)
7. Wait for all three processes
   Pipeline exit code = exit code of last command (c)

shell stdin                                    shell stdout
   │                                                 ▲
   ▼                                                 │
┌─────┐ write1 ──► read1 ┌─────┐ write2 ──► read2 ┌─────┐
│  a  │──── pipe1 ───────│  b  │──── pipe2 ───────│  c  │
└─────┘                  └─────┘                  └─────┘

Closing shell’s copies

Step 6 is critical. After spawning, the shell must close its copies of the pipe fds. Otherwise the read end never sees EOF — it thinks the write end is still open (because the shell still holds it). This is a classic Unix pipe bug.
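The wiring generalizes to any number of commands, and the list of fds the shell must close falls out of the same computation. The sketch below uses symbolic fd names in place of real Fd objects; `planPipeline`, `StagePlan`, and `PipelinePlan` are hypothetical names.

```typescript
interface StagePlan {
  command: string
  stdin: string
  stdout: string
  stderr: string
}

interface PipelinePlan {
  stages: StagePlan[]
  shellMustClose: string[] // pipe ends the shell closes after spawning
}

function planPipeline(commands: string[]): PipelinePlan {
  const stages = commands.map((command, i): StagePlan => {
    const isFirst = i === 0
    const isLast = i === commands.length - 1
    return {
      command,
      stdin: isFirst ? 'shell-stdin' : `read${i}`, // pipe i feeds command i
      stdout: isLast ? 'shell-stdout' : `write${i + 1}`,
      stderr: 'shell-stderr',
    }
  })
  // N commands need N - 1 pipes; the shell holds both ends of each one
  // until it closes them.
  const shellMustClose: string[] = []
  for (let n = 1; n < commands.length; n++) {
    shellMustClose.push(`write${n}`, `read${n}`)
  }
  return { stages, shellMustClose }
}
```

For `['a', 'b', 'c']` this yields the same assignments as the steps above, with `write1, read1, write2, read2` as the close list. Forgetting any write end in that list leaves the matching reader waiting for an EOF that never arrives.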

Pipeline exit code

Decision: exit code of the last command, matching bash default. set -o pipefail (which reports the rightmost non-zero exit code, so a failure in any stage fails the pipeline) can be added later as a shell option. For MVP, last-command-wins is simpler and matches what LLMs expect from shell experience.
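The two policies differ only in how the per-stage exit codes are combined. A small sketch, where the function name and the `pipefail` flag are hypothetical and an empty pipeline is assumed to exit 0:

```typescript
// `codes` holds the exit code of each stage, in pipeline order.
function pipelineExitCode(codes: number[], pipefail = false): number {
  if (pipefail) {
    // bash pipefail semantics: rightmost non-zero status, else 0.
    for (let i = codes.length - 1; i >= 0; i--) {
      const code = codes[i]
      if (code !== undefined && code !== 0) return code
    }
    return 0
  }
  // MVP default: the last command wins.
  return codes[codes.length - 1] ?? 0
}
```

With the default policy, `cat hugefile | head -1` reports success even though cat died with 141; with pipefail enabled the same pipeline would report 141.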

Kernel pipe() Implementation

Bringing it all together:

function createPipe(capacity = 65536): [Fd, Fd] {
  // Both ends share one state object; closing either end mutates it.
  const state: PipeState = {
    buffer: [],
    bufferSize: 0,
    capacity,
    closed: { read: false, write: false },
    readers: [],
    writers: [],
  }
  const readEnd: PipeFd = {
    kind: 'pipe',
    pipe: state,
    mode: 'r',
  }
  const writeEnd: PipeFd = {
    kind: 'pipe',
    pipe: state,
    mode: 'w',
  }
  return [readEnd, writeEnd]
}

The kernel dispatches on fd.kind === 'pipe' and routes read/write/close to the pipe state directly instead of through a fileserver.