Bins

Bins are async functions that receive a proc context. They are the primary extension point: the thing LLMs write and compose.

Bin Signature

Every bin is a single async function:

type BinFunction = (proc: ProcContext) => Promise<number>

Takes a proc context, returns an exit code. That’s the entire contract.
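
As a minimal sketch of that contract, the bin below writes a greeting and returns 0. MiniProc is a hypothetical cut-down stand-in for ProcContext (only argv and stdout) so the example runs on its own; hello and fakeProc are illustrative names, not part of the system.

```typescript
// Hypothetical subset of ProcContext, just enough for this example.
interface MiniProc {
  argv: string[]
  stdout: { write(data: string): Promise<void> }
}

type MiniBin = (proc: MiniProc) => Promise<number>

// A bin: read argv, write to stdout, return an exit code.
const hello: MiniBin = async (proc) => {
  const name = proc.argv[1] ?? 'world'
  await proc.stdout.write(`hello, ${name}\n`)
  return 0
}

// Test double that records everything written to stdout in memory.
function fakeProc(argv: string[]): MiniProc & { output: string[] } {
  const output: string[] = []
  return {
    argv,
    output,
    stdout: { write: async (data: string) => { output.push(data) } },
  }
}
```

Because a bin only touches its proc argument, it can be exercised with a fake context like this one without a running kernel.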

ProcContext

The full API surface available to every bin. This is the “system call” interface — everything a process can do.

interface ProcContext {
  // Identity
  pid:    number
  ppid:   number

  // Arguments & environment
  argv:   string[]
  env:    Record<string, string>

  // Pre-opened I/O streams (string convenience layer)
  stdin:  Readable     // fd 0 — async iterable of lines, or .read() for bytes
  stdout: Writable     // fd 1 — .write(string | Uint8Array)
  stderr: Writable     // fd 2 — .write(string | Uint8Array)

  // Filesystem (byte-level, routed through this process's namespace)
  fs: {
    open(path: string, flags?: OpenFlags): Promise<FdNumber>
    read(fd: FdNumber, count?: number): Promise<Uint8Array>
    write(fd: FdNumber, data: Uint8Array): Promise<number>
    close(fd: FdNumber): Promise<void>
    stat(path: string): Promise<Stat>
    readdir(path: string): Promise<DirEntry[]>

    // Mutation
    mkdir(path: string): Promise<void>
    remove(path: string): Promise<void>
    rename(oldPath: string, newPath: string): Promise<void>

    // Convenience (sugar over open/read/write/close, reads until EOF)
    readFile(path: string): Promise<string>
    writeFile(path: string, data: string): Promise<void>
  }

  // Process control
  spawn(bin: string | BinFunction, argv: string[], opts?: SpawnOpts): Promise<ChildHandle>
  exec(bin: string | BinFunction, argv: string[]): never  // throws ExecSentinel
  exit(code?: ExitCode): never
  signal(pid: Pid, sig: Signal): void
  wait(pid: Pid): Promise<ExitCode>
  waitAny(): Promise<{ pid: Pid, code: ExitCode }>

  // Identity & sessions
  setuid(uid: number): void        // restricted to uid=0
  setsid(): Pid                    // create new session
  chdir(path: string): void

  // Namespace manipulation
  mount(server: Fileserver, path: string): void
  bind(oldPath: string, newPath: string): void

  // Signal handling
  on(signal: Signal, handler: () => void): void
}

SpawnOpts

interface SpawnOpts {
  stdin?:  Readable | number    // fd or stream to use as child's stdin
  stdout?: Writable | number    // fd or stream to use as child's stdout
  stderr?: Writable | number    // fd or stream to use as child's stderr
  env?:    Record<string, string>  // override env vars (merged with parent's)
  cwd?:    string               // override working directory
}
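
To illustrate fd wiring through SpawnOpts, here is a hedged sketch that points a child's stdout and stderr at one file, roughly what a shell does for `cmd > out 2>&1`. SpawnProc is a cut-down, hypothetical stand-in for ProcContext, runToFile is an illustrative name, and the `{ write: true, create: true }` flags are an assumption, since OpenFlags is not spelled out in this section.

```typescript
// Hypothetical subset of ProcContext used by this sketch.
interface SpawnProc {
  fs: {
    open(path: string, flags?: Record<string, boolean>): Promise<number>
    close(fd: number): Promise<void>
  }
  spawn(
    bin: string,
    argv: string[],
    opts?: { stdout?: number; stderr?: number },
  ): Promise<{ pid: number; wait(): Promise<number> }>
}

// Run `bin` with stdout and stderr both pointed at `path`.
async function runToFile(
  proc: SpawnProc,
  bin: string,
  argv: string[],
  path: string,
): Promise<number> {
  // Assumed flag names; the real OpenFlags shape may differ.
  const fd = await proc.fs.open(path, { write: true, create: true })
  try {
    const child = await proc.spawn(bin, argv, { stdout: fd, stderr: fd })
    return await child.wait()
  } finally {
    // Close the parent's copy of the fd even if spawn or wait throws.
    await proc.fs.close(fd)
  }
}
```

Passing the same fd number for both streams is what makes a raw dup unnecessary in ordinary bins.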

ChildHandle

interface ChildHandle {
  pid:     Pid
  stdin?:  Writable    // present only when spawn created a pipe for stdin
  stdout?: Readable    // present only when spawn created a pipe for stdout
  stderr?: Readable    // present only when spawn created a pipe for stderr
  wait(): Promise<ExitCode>
}

Decision: ChildHandle streams are optional. When a child inherits the parent’s fds (no explicit pipe wiring in SpawnOpts), there’s no separate stream to interact with — parent and child share the same underlying fd. Streams are only present when spawn() created pipes for that fd. This matches Node.js child_process.spawn behavior with 'pipe' vs 'inherit' stdio options.

Decision: no kill() on ChildHandle. Use proc.signal(child.pid, 'SIGTERM') instead. A convenience child.kill() hides what’s happening. The explicit signal call is clearer and matches the Unix mental model LLMs already have.
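
Both decisions can be seen together in the sketch below, which reads the first line a child prints (only if its stdout was piped) and then stops it with an explicit signal. MiniChild and SignalProc are hypothetical cut-down versions of ChildHandle and ProcContext, and firstLineThenStop is an illustrative helper, not part of the API.

```typescript
// Hypothetical cut-down types mirroring ChildHandle and ProcContext.
interface MiniChild {
  pid: number
  stdout?: AsyncIterable<string> // absent when the child inherited fd 1
  wait(): Promise<number>
}
interface SignalProc {
  signal(pid: number, sig: string): void
}

// Read the first line the child prints (if its stdout is piped),
// then stop it with an explicit signal instead of a kill() shortcut.
async function firstLineThenStop(
  proc: SignalProc,
  child: MiniChild,
): Promise<{ firstLine: string | null; code: number }> {
  let firstLine: string | null = null
  if (child.stdout) {
    for await (const line of child.stdout) {
      firstLine = line
      break
    }
  }
  proc.signal(child.pid, 'SIGTERM')
  const code = await child.wait()
  return { firstLine, code }
}
```

The `if (child.stdout)` guard is the cost of optional streams: callers must handle the inherited case, where there is nothing to read.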

What’s NOT on ProcContext

fork() — deferred; spawn covers all MVP use cases
pipe() — kernel creates pipes when wiring pipelines; bins don’t create raw pipes. If needed later, it can be added
dup()/dup2() — available as a kernel operation (used by the shell for redirect handling like 2>&1), but not exposed on ProcContext. Bins use spawn opts for fd wiring; only the shell needs raw dup

Decision: keep ProcContext minimal. Only include what bins actually need. Every method is a concept an LLM must understand to use the system. Fewer methods → lower cognitive overhead → easier for LLMs to write correct bins. Methods can be added later if real use cases demand them.

Bin Resolution

When the shell encounters a command name, it resolves the name to a bin function, trying each of the following in order:

1. Shell builtins (cd, export, exit, etc.)
   → handled directly by the shell interpreter, not spawned

2. Absolute/relative path (/bin/grep, ./myscript)
   → open via namespace, read, evaluate

3. $PATH search
   → split $PATH on ':', try each directory
   → default $PATH = "/bin"
   → first match wins

For each candidate file, the kernel tries multiple dispatch strategies: ExecCapable (native JS function), shebang (#!), and extension-based interpreter dispatch (/lib/interp/<ext>). See the Executable Dispatch section in the kernel docs for the full resolution chain.

Decision: $PATH is colon-separated, searched left to right, /bin is the default. Matches Unix convention exactly. No $PATH → falls back to /bin only. An LLM can export PATH="/bin:/home/tools" to add custom search directories. This is the mechanism for “extending the shell” — put a function in a directory on $PATH.

Resolution Errors

Situation	Exit code	stderr message
Command not found in any $PATH dir	127	`sh: foo: command not found`
File found but not a valid function	126	`sh: foo: not executable`
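
The resolution order and the 127 case can be sketched as a pure function. Everything here is illustrative: resolveCommand, the Resolution type, and the exists callback (standing in for a namespace lookup) are hypothetical names. Treating any name that contains '/' as a path, and returning 127 for a missing path, are assumptions borrowed from Unix shells. The 126 check (file found but not executable) is left out because it depends on kernel dispatch.

```typescript
type Resolution =
  | { kind: 'builtin'; name: string }
  | { kind: 'file'; path: string }
  | { kind: 'error'; code: 127; message: string }

function notFound(name: string): Resolution {
  return { kind: 'error', code: 127, message: `sh: ${name}: command not found` }
}

// `exists` is a hypothetical callback standing in for a namespace lookup.
function resolveCommand(
  name: string,
  env: Record<string, string>,
  builtins: Set<string>,
  exists: (path: string) => boolean,
): Resolution {
  // 1. Builtins are handled by the shell itself, never spawned.
  if (builtins.has(name)) return { kind: 'builtin', name }

  // 2. A name containing '/' is a path and is not searched in $PATH.
  if (name.indexOf('/') !== -1) {
    return exists(name) ? { kind: 'file', path: name } : notFound(name)
  }

  // 3. $PATH search, left to right; an unset or empty $PATH means "/bin".
  const dirs = (env.PATH || '/bin').split(':').filter((d) => d.length > 0)
  for (const dir of dirs) {
    const candidate = `${dir.replace(/\/$/, '')}/${name}`
    if (exists(candidate)) return { kind: 'file', path: candidate } // first match wins
  }
  return notFound(name)
}
```

With PATH="/home/tools:/bin", a tool placed in /home/tools shadows a bin of the same name in /bin, which is the extension mechanism described above.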

Shell Builtins

Builtins are BinFunctions that the shell calls with its own ProcContext instead of spawning a child process. No separate type: they have the same (proc: ProcContext) => Promise<number> signature as any other bin.

MVP builtins:

Builtin	Behavior	ProcContext method used
`cd [dir]`	Change cwd. No args → `$HOME`.	`proc.chdir(path)`
`pwd`	Print cwd to stdout.	`proc.stdout.write(proc.env.PWD)`
`export VAR=val`	Set env var, visible to children.	`proc.env.VAR = val`
`unset VAR`	Remove env var.	`delete proc.env.VAR`
`exit [code]`	Exit the shell with code (default 0).	`proc.exit(code)`
`true`	Return 0.	—
`false`	Return 1.	—
`echo [args...]`	Write args to stdout. Also a `/bin/echo`.	`proc.stdout.write(...)`

source is NOT a builtin — it’s a shell-internal special form (like if/for) that feeds file contents into the shell interpreter. It needs the interpreter, not just ProcContext.

Decision: builtins are BinFunction, no separate ShellBuiltin type. The only difference is execution path: called in-process vs spawned as child. Same contract means bins can be “promoted” to builtins trivially (e.g., echo).

Decision: echo is both a builtin and a bin. The builtin handles the common case without spawn overhead. The /bin/echo exists so it works in xargs echo and similar contexts where a real executable is needed. Bash does the same thing.
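
A hedged sketch of the two execution paths: builtins live in a table and are called in-process, everything else is spawned. ShellProc is a hypothetical subset of ProcContext and runCommand is an illustrative name. How the shell hands a builtin its own argv is not specified above, so the shallow copy with a replaced argv is an assumption.

```typescript
// Hypothetical subset of ProcContext, enough to show the dispatch.
interface ShellProc {
  argv: string[]
  env: Record<string, string>
  spawn(bin: string, argv: string[]): Promise<{ wait(): Promise<number> }>
}
type Bin = (proc: ShellProc) => Promise<number>

// Builtins are ordinary bins kept in a table inside the shell.
const builtins = new Map<string, Bin>([
  ['true', async () => 0],
  ['false', async () => 1],
  ['export', async (proc) => {
    for (const arg of proc.argv.slice(1)) {
      const eq = arg.indexOf('=')
      if (eq > 0) proc.env[arg.slice(0, eq)] = arg.slice(eq + 1)
    }
    return 0
  }],
])

async function runCommand(shell: ShellProc, argv: string[]): Promise<number> {
  const builtin = builtins.get(argv[0])
  if (builtin) {
    // In-process: the copy shares the shell's env object, only argv differs
    // (assumption about how the shell passes arguments to a builtin).
    return builtin({ ...shell, argv })
  }
  // Everything else becomes a child process.
  const child = await shell.spawn(argv[0], argv)
  return child.wait()
}
```

Because `export` mutates the shell's own env object, the change is visible to later commands, which a spawned child could not achieve.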

Decision: source evaluates shell commands, not TypeScript. source /etc/profile reads a file and feeds it to the shell interpreter line by line. It’s for shell scripts (setting env vars, defining aliases), not for loading TypeScript modules. Runtime bin creation (echo 'async (proc) => {...}' > /bin/foo) is the mechanism for adding TypeScript code.

Core Bins

The minimum set that makes an LLM productive in a shell. Organized by function.

I/O

Bin	Purpose	MVP flags
`cat [files...]`	Concatenate files to stdout. No args → copy stdin to stdout.	—
`tee [file]`	Copy stdin to stdout AND to file.	`-a` (append)
`head [-n N]`	First N lines of stdin (default 10).	`-n`
`tail [-n N]`	Last N lines of stdin (default 10).	`-n`

Filters

Bin	Purpose	MVP flags
`grep pattern [file]`	Filter lines matching regex.	`-i` (case-insensitive), `-v` (invert), `-c` (count)
`sed expr [file]`	Stream editor. MVP: `s/pat/rep/` substitution only.	—
`awk program [file]`	Pattern/action processor.	—

Decision: include sed and awk in MVP, but with reduced scope. sed supports s/pattern/replacement/[g] only: no hold space, no multi-line, no labels. awk supports field splitting, pattern matching, print, and variables, but no arrays and no user functions. These subsets cover an estimated 90% of real usage. Full implementations can replace them later. The alternative, omitting them, would force LLMs into verbose workarounds for common text transformations.
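
To give a sense of how small the reduced sed can be, here is a sketch of parsing and applying a single s/pattern/replacement/[g] expression. parseSubstitute and substitute are hypothetical names. The sketch assumes '/' is the only delimiter and that the replacement uses JavaScript's syntax ($1, $&) rather than sed's (\1, &); a real implementation would need to translate between the two.

```typescript
// Parse `s/pattern/replacement/[g]` with '/' as the only delimiter.
// An escaped slash (\/) inside pattern or replacement is treated as a literal '/'.
function parseSubstitute(expr: string): { pattern: RegExp; replacement: string } {
  if (expr.slice(0, 2) !== 's/') throw new Error(`sed: unsupported expression: ${expr}`)
  const parts: string[] = ['']
  for (let i = 2; i < expr.length; i++) {
    const ch = expr[i]
    if (ch === '\\' && i + 1 < expr.length) {
      // Keep other escapes intact for the regex engine.
      parts[parts.length - 1] += expr[i + 1] === '/' ? '/' : ch + expr[i + 1]
      i++
    } else if (ch === '/') {
      parts.push('')
    } else {
      parts[parts.length - 1] += ch
    }
  }
  // Expect exactly: pattern, replacement, flags (flags may be empty).
  if (parts.length !== 3) throw new Error(`sed: malformed expression: ${expr}`)
  const [pattern, replacement, flags] = parts
  if (flags !== '' && flags !== 'g') throw new Error(`sed: unknown flags: ${flags}`)
  return { pattern: new RegExp(pattern, flags), replacement }
}

// Apply the expression to one line, as the bin would do per input line.
function substitute(expr: string, line: string): string {
  const { pattern, replacement } = parseSubstitute(expr)
  return line.replace(pattern, replacement)
}
```

A real bin would parse the expression once and reuse the RegExp for every line instead of re-parsing per call.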

Filesystem

Bin	Purpose	MVP flags
`ls [path]`	List directory contents. Default: cwd.	`-l` (long format), `-a` (show dotfiles)
`cp src dest`	Copy file.	—
`mv src dest`	Move/rename file.	—
`rm path`	Remove file.	`-r` (recursive for dirs)
`mkdir path`	Create directory.	`-p` (create parents)
`touch path`	Create empty file or update mtime.	—
`find path [-name pattern]`	Search for files.	`-name`, `-type`

Text Processing

Bin	Purpose	MVP flags
`wc [file]`	Count lines, words, bytes.	`-l`, `-w`, `-c`
`sort [file]`	Sort lines.	`-r` (reverse), `-n` (numeric)
`uniq [file]`	Deduplicate adjacent lines.	`-c` (count)
`tr SET1 [SET2]`	Translate/delete characters.	`-d` (delete)
`cut -dDELIM -fFIELDS`	Extract fields from lines.	`-d`, `-f`

Process Management

Bin	Purpose
`ps`	List processes (reads from /proc).
`kill [-SIG] pid`	Send signal to process. Default SIGTERM.

Utility

Bin	Purpose
`env`	Print all env vars.
`xargs cmd [args]`	Read stdin lines, pass as args to cmd.
`test expr` / `[`	Evaluate conditional expression. Returns 0 (true) or 1 (false).
`sleep N`	Wait N seconds.
`seq START [END]`	Print sequence of numbers.

Decision: 26 bins + 8 builtins for MVP. This set covers file manipulation, text processing, process management, and composition utilities. An LLM can accomplish most shell tasks with these. Missing tools can be written at runtime — that’s the whole point of the system. The bar for inclusion is “would an LLM need to reinvent this in most sessions?”

Bin Implementation Pattern

Every bin follows the same structure:

async function grep(proc: ProcContext): Promise<number> {
  // 1. Parse args
  const args = parseArgs(proc.argv.slice(1), {
    boolean: ['i', 'v', 'c'],
  })

  if (args._.length === 0) {
    await proc.stderr.write('usage: grep pattern [file]\n')
    return 2
  }

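  // An invalid regex is a usage error (exit 2), not a crash
  let pattern: RegExp
  try {
    pattern = new RegExp(args._[0], args.i ? 'i' : '')
  } catch {
    await proc.stderr.write(`grep: invalid pattern: ${args._[0]}\n`)
    return 2
  }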
  let matchCount = 0

  // 2. Get input (file arg or stdin)
  const input = args._[1]
    ? lines(proc, args._[1])  // helper: open file, yield lines
    : proc.stdin

  // 3. Process
  for await (const line of input) {
    const matches = pattern.test(line)
    if (matches !== Boolean(args.v)) {
      matchCount++
      if (!args.c) {
        await proc.stdout.write(line + '\n')
      }
    }
  }

  // 4. Final output
  if (args.c) {
    await proc.stdout.write(matchCount + '\n')
  }

  // 5. Exit code
  return matchCount > 0 ? 0 : 1
}

Arg Parsing

Decision: provide a lightweight parseArgs utility, not a framework. Bins parse their own args from proc.argv. A minimal parseArgs(args, opts) helper handles --flags, -f, and positional args. No dependency on external libraries. Bins with trivial args (echo, cat) can just use proc.argv directly.

interface ParseArgsOpts {
  boolean?: string[]    // flags that take no value
  string?: string[]     // flags that take a value
  default?: Record<string, unknown>
}

declare function parseArgs(args: string[], opts?: ParseArgsOpts): {
  _: string[]                    // positional args
  [flag: string]: string | boolean | string[]
}
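
One possible shape for such a helper is sketched below under the name parseArgsSketch, to keep it distinct from the real utility. It handles declared boolean and string flags, combined short booleans (-iv), and `--` as an end-of-flags marker. Ignoring `default`, rejecting unknown flags with an exception, and not supporting attached values (-n5) are choices of this sketch, not documented behavior.

```typescript
interface SketchOpts {
  boolean?: string[] // flags that take no value
  string?: string[]  // flags that take a value
}
type Parsed = { _: string[]; [flag: string]: string | boolean | string[] }

function parseArgsSketch(args: string[], opts: SketchOpts = {}): Parsed {
  const parsed: Parsed = { _: [] }
  const booleans = new Set(opts.boolean ?? [])
  const strings = new Set(opts.string ?? [])

  for (let i = 0; i < args.length; i++) {
    const arg = args[i]
    if (arg === '--') {
      // Everything after `--` is positional, even if it looks like a flag.
      parsed._.push(...args.slice(i + 1))
      break
    }
    const isLong = arg.slice(0, 2) === '--'
    const isShort = !isLong && arg[0] === '-' && arg.length > 1
    if (!isLong && !isShort) {
      parsed._.push(arg) // includes a bare "-"
      continue
    }
    const name = isLong ? arg.slice(2) : arg.slice(1)
    if (strings.has(name)) {
      const value = args[i + 1]
      if (value === undefined) throw new Error(`missing value for ${arg}`)
      parsed[name] = value
      i++ // skip the consumed value
    } else if (booleans.has(name)) {
      parsed[name] = true
    } else if (isShort && name.split('').every((c) => booleans.has(c))) {
      // Combined short booleans: -iv is -i -v.
      for (const c of name.split('')) parsed[c] = true
    } else {
      throw new Error(`unknown flag: ${arg}`)
    }
  }
  return parsed
}
```

For example, `parseArgsSketch(['-i', 'foo', 'file.txt'], { boolean: ['i', 'v', 'c'] })` yields `i: true` and the two positionals in `_`, which is the shape the grep example relies on.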

Input Helper

A common pattern is to read from a file if one is given, otherwise from stdin. The caller makes that choice (as grep does above); the file half is the lines helper:

async function* lines(proc: ProcContext, path: string): AsyncGenerator<string> {
  const fd = await proc.fs.open(path, { read: true })
  const reader = new LineReader(fd, proc)  // same as stdin's line splitter
  try {
    yield* reader
  } finally {
    await proc.fs.close(fd)
  }
}

This should live as a shared utility available to all bins, not reimplemented per-bin.
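
The line splitter that both stdin and the lines helper depend on could look roughly like this. splitLines is a hypothetical name, and the sketch assumes chunks have already been decoded to strings; a byte-level version would decode Uint8Array chunks first.

```typescript
// Turn arbitrary string chunks into complete lines (without the '\n').
async function* splitLines(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let pending = ''
  for await (const chunk of chunks) {
    pending += chunk
    let nl = pending.indexOf('\n')
    while (nl !== -1) {
      yield pending.slice(0, nl)
      pending = pending.slice(nl + 1)
      nl = pending.indexOf('\n')
    }
  }
  // A final line without a trailing newline is still a line.
  if (pending.length > 0) yield pending
}
```

Chunk boundaries need not align with line boundaries: a line split across two reads is buffered in `pending` until its newline arrives.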

Runtime Bin Creation

LLMs extend the system by writing scripts to /bin/. A script can be written directly, as a shell script or a JavaScript script, or shipped in a package:

Shell scripts

Write a shell script with a shebang:

printf '#!/bin/sh\nwc -l "$1" | xargs printf "%%s lines in %%s\\n"\n' > /bin/linecount
chmod +x /bin/linecount
linecount /etc/motd

JavaScript scripts

Write a .js file — no shebang needed (the extension-based interpreter dispatch handles it automatically via /lib/interp/js → /bin/js):

printf 'const text = await proc.fs.readFile(proc.argv[2])
const lines = text.split("\\n").slice(0, 5).join("\\n")
await proc.stdout.write(lines + "\\n")
' > /bin/summarize.js
chmod +x /bin/summarize.js
summarize.js /data/huge.log

JS scripts run with proc (a full ProcContext) in scope. await works at the top level. See Executable Dispatch in the kernel docs.
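
One way to get both properties (proc in scope, top-level await) is to wrap the script source in an async function before evaluating it. The sketch below is an assumption about how that could work, not the kernel's actual dispatch code. compileScript and ScriptProc are hypothetical names, and treating a script that returns nothing as exit code 0 is also an assumption.

```typescript
// Hypothetical subset of ProcContext for the example.
interface ScriptProc {
  argv: string[]
  stdout: { write(data: string): Promise<void> }
}

// Wrap the script body in an async arrow function so `await` works at the
// top level of the script, and pass `proc` in as the only parameter.
function compileScript(source: string): (proc: ScriptProc) => Promise<number> {
  const run = new Function(
    'proc',
    `return (async () => {\n${source}\n})()`,
  ) as (proc: ScriptProc) => Promise<unknown>

  return async (proc) => {
    const result = await run(proc)
    // Assumption: a script that returns nothing exits with 0.
    return typeof result === 'number' ? result : 0
  }
}
```

The compiled value has the BinFunction shape, so the rest of the system would not need to distinguish a script from a native bin.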

Via packages

Packages can ship bins as files. pkg install auto-chmods files installed to /bin/ and files with shebangs:

{
  name: 'my-tools',
  files: {
    '/bin/hello.js': 'await proc.stdout.write("Hello!\\n")\n',
    '/bin/greet': '#!/bin/sh\necho "Hi, $1"\n',
  }
}
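
The auto-chmod rule can be stated as a small predicate over the manifest. This is a sketch of the rule as described, not the pkg implementation; PackageManifest and executablePaths are hypothetical names.

```typescript
interface PackageManifest {
  name: string
  files: Record<string, string> // path -> file content
}

// Paths the described rule would mark executable:
// anything under /bin/, plus any file whose content starts with "#!".
function executablePaths(pkg: PackageManifest): string[] {
  return Object.keys(pkg.files)
    .filter((path) => path.indexOf('/bin/') === 0 || pkg.files[path].indexOf('#!') === 0)
    .sort()
}
```

Under this rule, /bin/hello.js from the manifest above becomes executable through its location alone, while a shebang script outside /bin/ qualifies through its first two bytes.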

What Can Runtime Bins Access?

Runtime bins receive the same ProcContext as built-in bins. No difference in capability. They can:

Read/write files
Spawn child processes
Mount fileservers
Read stdin, write stdout/stderr

Decision: runtime bins are NOT sandboxed beyond ProcContext. They run with the same privileges as built-in bins — full namespace access. Sandboxing is a future concern handled by namespace scoping (mount a restricted view before spawning) or fileserver middleware. The runtime creation path trusts the code because the LLM is both the author and the user.

Sandbox Boundary & Threat Model

Runtime bins are evaluated via new Function(...) and are passed only a ProcContext argument. Nothing else is handed to them explicitly: no require, no import, no host objects. However, this is a convention, not a security boundary:

In-process evaluation means runtime bins share the JavaScript heap and global scope. Code created with new Function runs in the global scope, so globalThis, process (Node), and window (browser) are reachable by name. Shadowing those names would not help either, since constructor chain traversal (e.g., this.constructor.constructor('return globalThis')()) recovers them.
The intended threat model is cooperative, not adversarial. The LLM is both the author and the consumer of runtime bins. The system trusts the code it generates. If the LLM writes process.exit(1) in a runtime bin, that’s a bug in the LLM’s output, not a security exploit.
For adversarial isolation (e.g., running untrusted user-provided bins), use the worker-based execution path (post-MVP) or host-level sandboxing (browser iframe isolation or a separate V8 isolate). Node’s vm.createContext is not enough by itself: the Node.js documentation states that the vm module is not a security mechanism and should not be used to run untrusted code. The ProcContext proxy over message-passing gives workers a clean privilege boundary.
Namespace scoping is the primary containment mechanism. A process that can only see a restricted namespace (e.g., no /proc, no /dev, limited /bin) has limited capability regardless of what JavaScript tricks it uses. Mount a restricted view before spawning untrusted code.
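
A short demonstration of why passing only proc is a convention rather than a boundary. Both functions below are built with new Function and handed nothing but a plain object, yet each reaches the global object. The variable names are illustrative, and plainProc merely stands in for a ProcContext.

```typescript
// Route 1: code built with new Function runs in the global scope,
// so global names resolve without being passed in.
const viaScope = new Function('proc', 'return typeof globalThis')

// Route 2: the constructor chain. proc -> Object -> Function,
// and Function(...) evaluates arbitrary source in the global scope.
const viaChain = new Function(
  'proc',
  'return proc.constructor.constructor("return globalThis")()',
)

const plainProc = { argv: [] as string[] } // plain object standing in for ProcContext

const reachedDirectly = viaScope(plainProc) === 'object'
const reachedViaChain = viaChain(plainProc) === globalThis
```

Both flags come out true, which is why containment has to come from what the namespace exposes (or from a separate worker or isolate), not from the argument list.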

Worker-Based Bins

CPU-heavy or long-running bins can run in a Web Worker / worker_thread for real parallelism.

Decision: worker bins use the same ProcContext interface, proxied via message passing. A bin doesn’t know or care whether it’s running in the main thread or a worker. The kernel detects worker: true in the spawn options (a field the SpawnOpts interface above does not declare yet and would need to gain) and sets up a message-passing bridge that proxies all ProcContext methods:

// In the main thread (kernel side):
// - Creates a Worker
// - Sends the bin source and argv
// - Proxies all proc.fs.*, proc.stdin.*, etc. over postMessage
// - Collects the exit code when the worker finishes

// In the worker:
// - Receives a ProcContext proxy
// - Calls proc.stdout.write("hello") → postMessage to main → kernel writes → postMessage back
// - Returns exit code

await proc.spawn(heavyBin, ['arg'], { worker: true })

The proxy adds latency to every I/O call (postMessage round-trip), so workers are only worthwhile for bins that do significant computation between I/O calls. Text-processing bins (grep, sed, awk) should stay cooperative.
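
The round trip behind that latency can be sketched as a small request/response protocol. This is an illustration only: the message shapes, makeProxy, serve, and addressing methods by strings such as 'stdout.write' are all assumptions, and a plain callback stands in for postMessage so the sketch runs without a real Worker.

```typescript
type Request = { id: number; method: string; args: unknown[] }
type Response = { id: number; result?: unknown; error?: string }

// Worker side: turns method calls into messages and awaits the reply.
function makeProxy(post: (req: Request) => void) {
  let nextId = 0
  const pending = new Map<number, { resolve: (v: unknown) => void; reject: (e: Error) => void }>()
  return {
    call(method: string, ...args: unknown[]): Promise<unknown> {
      return new Promise<unknown>((resolve, reject) => {
        const id = nextId++
        pending.set(id, { resolve, reject })
        post({ id, method, args })
      })
    },
    // Replies from the kernel side are fed back in here.
    onMessage(res: Response): void {
      const entry = pending.get(res.id)
      if (!entry) return
      pending.delete(res.id)
      if (res.error !== undefined) entry.reject(new Error(res.error))
      else entry.resolve(res.result)
    },
  }
}

// Kernel side: runs the named method against the real implementation.
async function serve(
  impl: Record<string, (...args: any[]) => unknown>,
  req: Request,
): Promise<Response> {
  const fn = impl[req.method]
  if (!fn) return { id: req.id, error: `no such method: ${req.method}` }
  try {
    return { id: req.id, result: await fn(...req.args) }
  } catch (e) {
    return { id: req.id, error: String(e) }
  }
}
```

Every proxied call pays for two messages (request and reply), matched up by id, so a bin that writes one line per call would spend most of its time waiting on the bridge.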

Decision: defer worker implementation to post-MVP. The interface is designed for it (same ProcContext), but the proxying layer is substantial engineering. All MVP bins run cooperatively on the main event loop. Worker support can be added without changing any bin code — just the spawn path.