Testing

Three layers: unit tests for individual components, integration tests for end-to-end shell sessions, and conformance tests measuring coverage against established shell/coreutils behavior.

Unit Tests

Standard vitest files under test/, mirroring the source tree. One test file per module.

test/
  bins/text/grep.test.ts     # tests grep in isolation
  kernel/fd.test.ts          # tests fd table operations
  shell/expansion/...        # tests each expansion phase

Each test creates a minimal environment (kernel + memoryFS + binFS), spawns a BinFunction, and asserts on stdout/stderr/exit code. The pattern:

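// Minimal environment: kernel wired to memoryFS + binFS
const { kernel } = makeTestEnv()
// The BinFunction under test
const bin: BinFunction = async (proc) => { /* ... */ }
// Spawn it as a process, then wait for it to exit
const handle = await kernel.spawn(bin, ['sh'], { ppid: 0, env: { PATH: '/bin' } })
const exitCode = await handle.wait()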

Run: npx vitest run

Integration Tests

Full shell sessions exercising pipelines, redirects, control flow, and command composition. Located at test/integration/.

async function runScript(script: string): Promise<{ stdout: string; stderr: string; exitCode: ExitCode }>

runScript boots a kernel, creates an interpreter, parses the script, executes it, and captures output. Tests look like real shell usage:

it('word frequency counter pipeline', async () => {
  const result = await runScript(
    'echo "the cat sat on the mat" | tr " " "\\n" | sort | uniq -c | sort -rn | head -n 1'
  )
  // "the" occurs twice in the input, so the top line is "2 the"
  expect(result.stdout.trim()).toMatch(/2\s+the/)
})

Conformance Tests

External-format test suites that measure what fishbowl covers and what it doesn’t. They use the Oils spec format, a simple DSL designed for cross-shell comparison.

Format

#### test name
echo hello | tr a-z A-Z
## STDOUT:
HELLO
## END

Each test case starts with ####. Expected output uses ## STDOUT: / ## END blocks. Exit codes use ## status: N. Single-line assertions use ## stdout: text.

Tests for behavior that fishbowl doesn’t support are marked ## N-I fishbowl (N-I is the Oils shorthand for “not implemented”) and are registered with it.skip. These are the coverage gaps: the actionable TODO list.
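To make the format rules concrete, here is a minimal parser sketch covering only the directives described above. It is not the project's runner.ts: the SpecCase field names follow the ones listed for the runner, while parseSpec, the default status of 0, and the trailing newline added to expected stdout are assumptions made for illustration.

```typescript
// Hypothetical shape; field names follow those listed for the runner.
interface SpecCase {
  name: string
  script: string
  expectedStdout: string | null // null when the case makes no stdout assertion
  expectedStatus: number        // assumed to default to 0
  skip: boolean                 // true when marked "## N-I fishbowl"
}

// Parse the text of one spec file into cases (illustrative, not the real parser).
function parseSpec(text: string): SpecCase[] {
  const cases: SpecCase[] = []
  let current: SpecCase | null = null
  let scriptLines: string[] = []
  let stdoutLines: string[] | null = null // non-null while inside a STDOUT block

  // Close the case being built, if any, and store it.
  const finish = (): void => {
    if (current) {
      current.script = scriptLines.join('\n').trim()
      cases.push(current)
    }
  }

  for (const line of text.split('\n')) {
    if (stdoutLines !== null && current !== null) {
      // Inside "## STDOUT:" ... "## END": collect lines verbatim.
      if (line === '## END') {
        current.expectedStdout = stdoutLines.map((l) => l + '\n').join('')
        stdoutLines = null
      } else {
        stdoutLines.push(line)
      }
      continue
    }
    if (line.startsWith('#### ')) {
      finish()
      current = {
        name: line.slice(5).trim(),
        script: '',
        expectedStdout: null,
        expectedStatus: 0,
        skip: false,
      }
      scriptLines = []
    } else if (current === null) {
      continue // ignore anything before the first case
    } else if (line === '## STDOUT:') {
      stdoutLines = []
    } else if (line.startsWith('## stdout: ')) {
      current.expectedStdout = line.slice('## stdout: '.length) + '\n'
    } else if (line.startsWith('## status: ')) {
      current.expectedStatus = Number(line.slice('## status: '.length))
    } else if (line.startsWith('## N-I fishbowl')) {
      current.skip = true
    } else {
      scriptLines.push(line) // everything else is part of the script
    }
  }
  finish()
  return cases
}
```

Fed the example case shown earlier, this sketch would yield one SpecCase named "test name" whose script is the echo pipeline and whose expectedStdout is "HELLO" followed by a newline.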

File Organization

test/spec/
  runner.ts              # Oils format parser + runScript executor
  runner.test.ts         # Unit tests for the parser itself
  spec.test.ts           # vitest entry point — discovers and runs all .test.sh files
  shell/                 # Shell conformance
    echo.test.sh         # builtin echo behavior
    loop.test.sh         # for, while, until, break, continue
    pipeline.test.sh     # pipes, multi-stage, exit codes
    redirect.test.sh     # >, >>, <, <<, <<<
    glob.test.sh         # *, ?, [], quoting suppression
    case.test.sh         # case/esac pattern matching
    var-op.test.sh       # ${var:-default}, ${#var}, etc.
    if.test.sh           # if/elif/else, test conditions
    quote.test.sh        # single quotes, double quotes, backslash
    word-split.test.sh   # IFS splitting, quoting preservation
  coreutils/             # Per-command conformance
    grep.test.sh         # -i, -v, -c, -o, regex, exit codes
    sed.test.sh          # s/old/new/, s///g, stdin
    awk.test.sh          # $1, NR, NF, BEGIN/END, arithmetic
    sort.test.sh         # -n, -r, -u, combined flags
    ... (one file per command)

Runner

spec.test.ts auto-discovers .test.sh files in shell/ and coreutils/ at test time. Each file becomes a describe() block, each #### becomes an it(). The runner:

Parses the spec file into SpecCase[] (name, script, expectedStdout, expectedStatus, skip)
For each case, boots a fresh kernel with memoryFS + binFS + all bins/builtins
Executes the script through the shell interpreter
Compares stdout and exit code against expectations
Tracks pass/fail/skip per file for the coverage report
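
The compare-and-track steps above can be sketched as a plain loop. This is a simplification under stated assumptions: the real runner registers each case as a vitest it() and boots a fresh kernel per case, whereas here the Executor type, matchesExpectation, and runCases are hypothetical names, and the executor is injected so the sketch runs without the kernel.

```typescript
// Hypothetical shapes for illustration; only the field names come from the list above.
interface SpecCase {
  name: string
  script: string
  expectedStdout: string | null // null means "do not check stdout"
  expectedStatus: number
  skip: boolean
}
interface RunResult { stdout: string; exitCode: number }
interface Tally { pass: number; fail: number; skip: number }

// Stand-in for "boot a fresh kernel and execute the script".
type Executor = (script: string) => Promise<RunResult>

// A case passes when stdout (if asserted) and the exit code both match.
function matchesExpectation(c: SpecCase, r: RunResult): boolean {
  const stdoutOk = c.expectedStdout === null || r.stdout === c.expectedStdout
  return stdoutOk && r.exitCode === c.expectedStatus
}

// Run every non-skipped case and count outcomes for one spec file.
async function runCases(cases: SpecCase[], exec: Executor): Promise<Tally> {
  const tally: Tally = { pass: 0, fail: 0, skip: 0 }
  for (const c of cases) {
    if (c.skip) {
      tally.skip++ // skipped cases are never executed
      continue
    }
    const result = await exec(c.script)
    if (matchesExpectation(c, result)) {
      tally.pass++
    } else {
      tally.fail++
    }
  }
  return tally
}
```

Injecting the executor keeps the comparison logic testable on its own; in the actual suite the same check would sit inside each it() body.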

Coverage Report

After all spec tests complete, a summary prints:

┌──────────────────────────────────────────────────┐
│           fishbowl Conformance Report             │
├──────────────────────────────────────────────────┤
│  Shell:       95/ 95 pass (100%)  15 skip  0 fail│
│  Coreutils:  147/147 pass (100%)   7 skip  0 fail│
│  Total:      242/242 pass (100%)  22 skip  0 fail│
├──────────────────────────────────────────────────┤
│  grep    9/ 9    sed     7/ 7    sort   7/ 7    │
│  awk     9/ 9    head    6/ 6    tail   6/ 6    │
│  ...                                             │
└──────────────────────────────────────────────────┘

Pass percentages exclude skipped cases, so the skip count is the metric that matters: it’s the gap between “what users expect” and “what we provide.” Each skipped test is a known, documented limitation.
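
A rough sketch of how the summary rows could be computed from per-suite tallies. The Tally shape, sumTallies, and formatRow are hypothetical names, the column widths only approximate the sample box, and the choice to compute the percentage over executed cases (pass + fail) is an assumption inferred from the sample report, where 95/95 appears alongside 15 skips.

```typescript
// Hypothetical per-suite counters.
interface Tally { pass: number; fail: number; skip: number }

// Add tallies together, e.g. Shell + Coreutils for the Total row.
function sumTallies(tallies: Tally[]): Tally {
  return tallies.reduce(
    (acc, t) => ({ pass: acc.pass + t.pass, fail: acc.fail + t.fail, skip: acc.skip + t.skip }),
    { pass: 0, fail: 0, skip: 0 },
  )
}

// Format one summary row; skipped cases are left out of the denominator.
function formatRow(label: string, t: Tally): string {
  const ran = t.pass + t.fail
  const pct = ran === 0 ? 0 : Math.round((t.pass / ran) * 100)
  const num = (n: number): string => String(n).padStart(3)
  return (
    `${(label + ':').padEnd(11)}${num(t.pass)}/${num(ran)} pass (${pct}%) ` +
    `${num(t.skip)} skip ${num(t.fail)} fail`
  )
}

const shell: Tally = { pass: 95, fail: 0, skip: 15 }
const coreutils: Tally = { pass: 147, fail: 0, skip: 7 }
console.log(formatRow('Shell', shell))
console.log(formatRow('Coreutils', coreutils))
console.log(formatRow('Total', sumTallies([shell, coreutils])))
```

Because skips never enter the denominator, a suite can report 100% while still carrying open gaps, which is why the skip column is shown next to the percentage.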

Why Oils Format

Decision: Oils spec format over BusyBox testing() or plain vitest. Oils tests are designed for multi-shell comparison, have a clean parseable format, and test exactly the shell behaviors we care about. The format is trivial to parse (~100 lines), human-readable, and works as both documentation and test. BusyBox’s testing() format is simpler but has fewer tests and less shell-level coverage. Plain vitest gives us no external comparison baseline — the whole point is measuring ourselves against established expectations.

What We Don’t Test

Bash-only features — arrays, declare, [[ ]], process substitution, PIPESTATUS
Interactive features — job control, readline in test context, signal handling
Real filesystem — symlinks, hard links, permissions (we use memoryFS)
Binary I/O — tests using stdout-json with \u0000 are skipped

Adding New Tests

To test a new command or shell feature:

Create test/spec/coreutils/mycommand.test.sh (or test/spec/shell/feature.test.sh)
Write test cases in the #### / ## STDOUT: format
Mark unsupported behaviors with ## N-I fishbowl
Run npx vitest run test/spec/spec.test.ts — the file is auto-discovered

To close a coverage gap:

Implement the missing feature
Remove the ## N-I fishbowl marker from the test
Run the suite — the test should now pass

Current Coverage Gaps (N-I fishbowl)

Shell:

Redirect tests in memoryFS test environment (path resolution)
Glob expansion on memoryFS-created files
Empty unquoted variable collapsing

Coreutils:

sed -n with /p, d command, multiple -e
awk -F delimiter flag
grep -n line numbers
cat -n line numbers
cut -c character ranges