Testing
Three layers: unit tests for individual components, integration tests for end-to-end shell sessions, and conformance tests measuring coverage against established shell/coreutils behavior.
Unit Tests
Standard vitest files alongside source code. One test file per module.

    test/
      bins/text/grep.test.ts    # tests grep in isolation
      kernel/fd.test.ts         # tests fd table operations
      shell/expansion/...       # tests each expansion phase

Each test creates a minimal environment (kernel + memoryFS + binFS), spawns a BinFunction, and asserts on stdout/stderr/exit code. The pattern:

    const { kernel } = makeTestEnv()
    const bin: BinFunction = async (proc) => { /* ... */ }
    const handle = await kernel.spawn(bin, ['sh'], { ppid: 0, env: { PATH: '/bin' } })
    const exitCode = await handle.wait()

Run: npx vitest run
Integration Tests
Full shell sessions exercising pipelines, redirects, control flow, and command composition. Located at test/integration/.

    async function runScript(script: string): Promise<{ stdout: string; stderr: string; exitCode: ExitCode }>

runScript boots a kernel, creates an interpreter, parses the script, executes it, and captures output. Tests look like real shell usage:

    it('word frequency counter pipeline', async () => {
      const result = await runScript(
        'echo "the cat sat on the mat" | tr " " "\\n" | sort | uniq -c | sort -rn | head -n 1'
      )
      // "the" occurs twice in the input, so the top line is "2 the"
      expect(result.stdout.trim()).toMatch(/2\s+the/)
    })

Conformance Tests
External-format test suites that measure what fishbowl covers and what it doesn’t. Uses the Oils spec format — a simple DSL designed for cross-shell comparison.
Format

    #### test name
    echo hello | tr a-z A-Z
    ## STDOUT:
    HELLO
    ## END

Each test case starts with ####. Expected output uses ## STDOUT: / ## END blocks. Exit codes use ## status: N. Single-line assertions use ## stdout: text.
Tests that fishbowl doesn’t support are marked ## N-I fishbowl and generate it.skip. These are the coverage gaps — the actionable TODO list.
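The format rules above are small enough to parse in a few dozen lines. The following is a minimal sketch, not the actual runner.ts: the SpecCase field names follow the ones listed in the Runner section, while parseSpec, the default status of 0, and the trailing-newline handling are assumptions made for illustration.

```typescript
// Hypothetical sketch of an Oils-style spec parser; not the real runner.ts.
interface SpecCase {
  name: string
  script: string
  expectedStdout: string | null // null when the case asserts nothing about stdout
  expectedStatus: number        // assumed to default to 0
  skip: boolean                 // true when the case carries "## N-I fishbowl"
}

function parseSpec(text: string): SpecCase[] {
  const cases: SpecCase[] = []
  let current: SpecCase | null = null
  let scriptLines: string[] = []
  let stdoutLines: string[] | null = null // non-null while inside ## STDOUT: ... ## END

  // Close the case being built, if any, and store it.
  const finish = () => {
    if (!current) return
    current.script = scriptLines.join('\n').trim()
    cases.push(current)
  }

  for (const line of text.split('\n')) {
    if (stdoutLines !== null) {
      if (line.trim() === '## END') {
        current!.expectedStdout = stdoutLines.map((l) => l + '\n').join('')
        stdoutLines = null
      } else {
        stdoutLines.push(line)
      }
      continue
    }
    if (line.startsWith('#### ')) {
      finish()
      current = { name: line.slice(5).trim(), script: '', expectedStdout: null, expectedStatus: 0, skip: false }
      scriptLines = []
    } else if (!current) {
      continue // ignore anything before the first case
    } else if (line.trim() === '## STDOUT:') {
      stdoutLines = []
    } else if (line.startsWith('## stdout: ')) {
      current.expectedStdout = line.slice('## stdout: '.length) + '\n'
    } else if (line.startsWith('## status: ')) {
      current.expectedStatus = Number(line.slice('## status: '.length))
    } else if (line.startsWith('## N-I fishbowl')) {
      current.skip = true
    } else {
      scriptLines.push(line) // everything else is part of the script
    }
  }
  finish()
  return cases
}
```

Every line that is not a recognized ## directive is treated as script text, so a case needs no explicit end marker: the next #### line or the end of the file closes it.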
File Organization

    test/spec/
      runner.ts             # Oils format parser + runScript executor
      runner.test.ts        # Unit tests for the parser itself
      spec.test.ts          # vitest entry point: discovers and runs all .test.sh files
      shell/                # Shell conformance
        echo.test.sh        # builtin echo behavior
        loop.test.sh        # for, while, until, break, continue
        pipeline.test.sh    # pipes, multi-stage, exit codes
        redirect.test.sh    # >, >>, <, <<, <<<
        glob.test.sh        # *, ?, [], quoting suppression
        case.test.sh        # case/esac pattern matching
        var-op.test.sh      # ${var:-default}, ${#var}, etc.
        if.test.sh          # if/elif/else, test conditions
        quote.test.sh       # single quotes, double quotes, backslash
        word-split.test.sh  # IFS splitting, quoting preservation
      coreutils/            # Per-command conformance
        grep.test.sh        # -i, -v, -c, -o, regex, exit codes
        sed.test.sh         # s/old/new/, s///g, stdin
        awk.test.sh         # $1, NR, NF, BEGIN/END, arithmetic
        sort.test.sh        # -n, -r, -u, combined flags
        ... (one file per command)

Runner
spec.test.ts auto-discovers .test.sh files in shell/ and coreutils/ at test time. Each file becomes a describe() block, each #### becomes an it(). The runner:
- Parses the spec file into SpecCase[] (name, script, expectedStdout, expectedStatus, skip)
- For each case, boots a fresh kernel with memoryFS + binFS + all bins/builtins
- Executes the script through the shell interpreter
- Compares stdout and exit code against expectations
- Tracks pass/fail/skip per file for the coverage report
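The compare and track steps in the list above can be sketched as follows. This is an illustration under assumptions, not the runner's real code: Expectation, RunResult, Outcome, judge, and tally are hypothetical names, and the real runner may normalize output or count differently.

```typescript
// Hypothetical sketch of the compare and tally steps; all names are illustrative.
type Outcome = 'pass' | 'fail' | 'skip'

interface Expectation {
  expectedStdout: string | null // null: stdout is not checked
  expectedStatus: number
  skip: boolean
}

interface RunResult {
  stdout: string
  exitCode: number
}

// Decide the outcome of one case. A real runner would not execute skipped
// cases at all; the result is passed in here only to keep the sketch small.
function judge(spec: Expectation, result: RunResult): Outcome {
  if (spec.skip) return 'skip'
  const stdoutOk = spec.expectedStdout === null || spec.expectedStdout === result.stdout
  const statusOk = spec.expectedStatus === result.exitCode
  return stdoutOk && statusOk ? 'pass' : 'fail'
}

// Count outcomes per spec file, keyed by file name.
function tally(outcomes: Array<{ file: string; outcome: Outcome }>): Map<string, Record<Outcome, number>> {
  const counts = new Map<string, Record<Outcome, number>>()
  for (const { file, outcome } of outcomes) {
    const row = counts.get(file) ?? { pass: 0, fail: 0, skip: 0 }
    row[outcome] += 1
    counts.set(file, row)
  }
  return counts
}
```

Keeping the comparison in a pure function like judge makes it testable without booting a kernel; only the execution step needs the full environment.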
Coverage Report
After all spec tests complete, a summary prints:

    ┌──────────────────────────────────────────────────┐
    │           fishbowl Conformance Report            │
    ├──────────────────────────────────────────────────┤
    │ Shell:      95/ 95 pass (100%)  15 skip  0 fail  │
    │ Coreutils: 147/147 pass (100%)   7 skip  0 fail  │
    │ Total:     242/242 pass (100%)  22 skip  0 fail  │
    ├──────────────────────────────────────────────────┤
    │ grep   9/ 9    sed    7/ 7    sort   7/ 7        │
    │ awk    9/ 9    head   6/ 6    tail   6/ 6        │
    │ ...                                              │
    └──────────────────────────────────────────────────┘

The skip count is the metric that matters: it’s the gap between “what users expect” and “what we provide.” Each skipped test is a known, documented limitation.
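The arithmetic behind the summary rows can be sketched as below. In the sample report the denominator equals pass + fail and skipped cases are counted separately, so the sketch assumes that rule. Counts, sumCounts, formatRow, and the column widths are hypothetical; the real report code may differ.

```typescript
// Hypothetical sketch of the summary arithmetic; names and layout are assumptions.
interface Counts {
  pass: number
  fail: number
  skip: number
}

// Add up per-suite counts into one total.
function sumCounts(rows: Counts[]): Counts {
  return rows.reduce(
    (acc, r) => ({ pass: acc.pass + r.pass, fail: acc.fail + r.fail, skip: acc.skip + r.skip }),
    { pass: 0, fail: 0, skip: 0 }
  )
}

// Render one summary row. Skipped cases are left out of the denominator,
// which is consistent with "95/ 95 pass ... 15 skip" in the sample report.
function formatRow(label: string, c: Counts): string {
  const ran = c.pass + c.fail
  const pct = ran === 0 ? 0 : Math.round((c.pass / ran) * 100)
  const ratio = `${String(c.pass).padStart(3)}/${String(ran).padStart(3)}`
  const skip = String(c.skip).padStart(3)
  const fail = String(c.fail).padStart(2)
  return `${(label + ':').padEnd(10)} ${ratio} pass (${pct}%) ${skip} skip ${fail} fail`
}

const shell: Counts = { pass: 95, fail: 0, skip: 15 }
const coreutils: Counts = { pass: 147, fail: 0, skip: 7 }
console.log(formatRow('Shell', shell))
console.log(formatRow('Coreutils', coreutils))
console.log(formatRow('Total', sumCounts([shell, coreutils])))
```

Because skips never enter the percentage, a suite can show 100% while still having gaps, which is why the skip column is reported next to the pass rate.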
Why Oils Format
Decision: Oils spec format over BusyBox testing() or plain vitest. Oils tests are designed for multi-shell comparison, have a clean parseable format, and test exactly the shell behaviors we care about. The format is trivial to parse (~100 lines), human-readable, and works as both documentation and test. BusyBox’s testing() format is simpler but has fewer tests and less shell-level coverage. Plain vitest gives us no external comparison baseline, and the whole point is measuring ourselves against established expectations.
What We Don’t Test
- Bash-only features: arrays, declare, [[ ]], process substitution, PIPESTATUS
- Interactive features: job control, readline in test context, signal handling
- Real filesystem: symlinks, hard links, permissions (we use memoryFS)
- Binary I/O: tests using stdout-json with \u0000 are skipped
Adding New Tests
To test a new command or shell feature:
- Create test/spec/coreutils/mycommand.test.sh (or test/spec/shell/feature.test.sh)
- Write test cases in the #### / ## STDOUT: format
- Mark unsupported behaviors with ## N-I fishbowl
- Run npx vitest run test/spec/spec.test.ts; the file is auto-discovered
To close a coverage gap:
- Implement the missing feature
- Remove the ## N-I fishbowl marker from the test
- Run the suite; the test should now pass
Current Coverage Gaps (N-I fishbowl)
Shell:
- Redirect tests in memoryFS test environment (path resolution)
- Glob expansion on memoryFS-created files
- Empty unquoted variable collapsing
Coreutils:
- sed -n with /p, d command, multiple -e
- awk -F delimiter flag
- grep -n line numbers
- cat -n line numbers
- cut -c character ranges
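A gap list like the one above could be derived from the spec files rather than maintained by hand. The helper below is a hypothetical sketch, not part of the documented runner: listGaps is an invented name, and it assumes only the #### and ## N-I fishbowl markers described in the Format section.

```typescript
// Hypothetical helper: collect the names of cases marked "## N-I fishbowl".
function listGaps(specText: string): string[] {
  const gaps: string[] = []
  let name: string | null = null // name of the case currently being scanned
  for (const line of specText.split('\n')) {
    if (line.startsWith('#### ')) {
      name = line.slice(5).trim()
    } else if (line.startsWith('## N-I fishbowl') && name !== null) {
      gaps.push(name)
      name = null // report each case once, even if it has several N-I lines
    }
  }
  return gaps
}

// Example input with one supported and one unsupported case.
const grepSpec = [
  '#### grep -i ignores case',
  'echo Hello | grep -i hello',
  '## stdout: Hello',
  '',
  '#### grep -n prints line numbers',
  'printf "a\\nb\\n" | grep -n b',
  '## N-I fishbowl',
  '## stdout: 2:b',
].join('\n')

console.log(listGaps(grepSpec))
```

Running a helper like this over every .test.sh file would keep the published gap list in step with the markers in the spec files.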