This repository runs upstream conformance test suites against pinned builds of Elide and publishes per-version Markdown/JSON reports. Test262 (TC39's JavaScript conformance suite) is the first suite; the harness is workload-neutral so more suites — and, later, benchmarks — slot in.
Next suites are added in this order:
- wpt-wintertc: a sparse Web Platform Tests subset for WinterTC / ECMA-429 JavaScript-facing APIs, including fetch.
- cpython-core: CPython 3.12 top-level Lib/test/test_* modules and packages.
- javac-jtreg: full OpenJDK langtools tools/javac coverage through jtreg, compiling with Elide and initially running generated programs with a regular JDK.
Tracking: Compliance Testing meta (WHIPLASH#1172), Test262 (WHIPLASH#1173).
Caution: full enabled slices may be RED until expectations are ratcheted and failures are classified.
Generated by bun run testsuite --update-summaries from the latest committed reports.
Last updated: 2026-06-28T04:38:17.260Z
| Suite | Version | Digest | Pass rate | Regressions |
|---|---|---|---|---|
| cpython-core | 1.3.5+20260628.3b80cf3 | 3583aceded8e | 86.7% | 0 |
| javac-jtreg | 1.3.5+20260628.3b80cf3 | 3583aceded8e | 94.8% | 0 |
| test262 | 1.3.5+20260628.3b80cf3 | 3583aceded8e | 82.4% | 0 |
| wpt-wintertc | 1.3.5+20260628.3b80cf3 | 3583aceded8e | 36.3% | 0 |
A Bun/TypeScript harness drives each suite through its best native runner and
normalizes the results. Test262 runs via the stock test262-harness + eshost
using a custom eshost elide host (installed through a Bun patch). Each run
executes inside a combined Docker image (a pinned Elide + Node + Bun), so every
elide run is a fast local subprocess. Results are compared against a
checked-in expectations baseline — a run is green iff there are no
regressions.
- Docker (the runner pulls/builds the Elide image)
- Bun
- Suite submodules and Bun dependencies prepared with one command:
bun run setup
bun run setup installs root and harness dependencies, initializes the upstream
suite submodules, and applies the sparse checkouts used by WPT and OpenJDK. To
only prepare submodules, run bun run setup:suites. To check that already
prepared submodules are populated correctly, run bun run setup:suites --check.
# Full suite against the current Elide nightly, with a live ✅/❌ test log
bun run testsuite --elide nightly --suite test262 --log
# A quick slice (finishes in seconds) — scope to any glob
bun run testsuite --elide nightly --log --include 'test/language/types/boolean/**/*.js'
# Current broad-suite smoke commands. These may be RED until expectations are
# ratcheted, but they should emit upstream case/subtest results instead of
# runner setup failures.
bun run testsuite --elide nightly --suite wpt-wintertc --include 'url/urlsearchparams-constructor.any.js' --log
bun run testsuite --elide nightly --suite cpython-core --include 'test_json' --log
bun run testsuite --elide nightly --suite javac-jtreg --include 'tools/javac/IllDefinedOrderOfInit.java' --log
# Broader slices after the smoke path is stable.
bun run testsuite --elide nightly --suite wpt-wintertc --threads 8 --log
bun run testsuite --elide nightly --suite cpython-core --threads 8 --log
bun run testsuite --elide nightly --suite javac-jtreg --threads 4 --log
# Run every registered suite after building the harness image once.
bun run testsuite --elide nightly --all-suites --log
# Prepare selected suite submodules first, then run. Useful in fresh clones or CI.
bun run testsuite:ready --elide nightly --all-suites --log
# Run a subset with a comma-separated suite list.
bun run testsuite --elide nightly --suite wpt-wintertc,cpython-core --log
# Run multiple selected suites at once. --threads is per suite; --suite-workers
# controls how many suite containers run concurrently.
bun run testsuite --elide nightly --all-suites --suite-workers 2 --threads 4 --log
# Update reports plus the generated README compatibility summary.
bun run testsuite --elide nightly --all-suites --log --update-summaries
# Pin a specific build: image tag, digest, or a local Elide install directory
bun run testsuite --elide ghcr.io/elide-dev/elide@sha256:…
bun run testsuite --elide /path/to/elide-install # dir containing bin/elide + lib/
./bin/run ... remains as a direct executable alias for the same Bun/TypeScript
launcher.
By default, the launcher builds a total concurrency budget of
available CPUs * 2. It then runs up to that many suite workers, capped by the
number of selected suites, and divides the remaining budget into per-suite
threads. --threads N overrides the per-suite adapter thread count: Test262
forwards it to test262-harness, wpt-wintertc runs multiple WPT files at once,
cpython-core splits selected modules across worker processes, and
javac-jtreg forwards it to jtreg's native -concurrency:N. --suite-workers N
overrides how many suite containers run concurrently. CONCURRENCY_MULTIPLIER
or --concurrency-multiplier N can change the default multiplier from 2.
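The default split described above can be sketched roughly as follows. This is an illustration only, not the launcher's code: `planConcurrency`, its option names, and the rounding (floor, with a minimum of 1) are assumptions, and "divides the remaining budget" is read here as an even split of the budget across suite workers.

```typescript
// Hypothetical sketch of the default concurrency split; not the real launcher.
interface ConcurrencyOptions {
  cpus: number;            // available CPUs
  selectedSuites: number;  // number of suites selected for this run
  multiplier?: number;     // CONCURRENCY_MULTIPLIER / --concurrency-multiplier (default 2)
  suiteWorkers?: number;   // --suite-workers override
  threads?: number;        // --threads override (per suite)
}

interface ConcurrencyPlan {
  budget: number;
  suiteWorkers: number;
  threadsPerSuite: number;
}

function planConcurrency(opts: ConcurrencyOptions): ConcurrencyPlan {
  const multiplier = opts.multiplier ?? 2;
  // Total budget: available CPUs * multiplier.
  const budget = Math.max(1, Math.floor(opts.cpus * multiplier));
  // Up to `budget` suite workers, capped by the number of selected suites.
  const suiteWorkers = Math.max(
    1,
    opts.suiteWorkers ?? Math.min(budget, opts.selectedSuites),
  );
  // Assumption: the budget is split evenly across concurrently running suites.
  const threadsPerSuite = Math.max(
    1,
    opts.threads ?? Math.floor(budget / suiteWorkers),
  );
  return { budget, suiteWorkers, threadsPerSuite };
}
```

Under these assumptions, an 8-CPU machine running four suites gets a budget of 16, four suite workers, and four threads per suite; passing `--suite-workers 2 --threads 4` pins both values regardless of the budget.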
--log streams one normalized mark per completed test to stderr (✅ pass ·
❌ fail · 🛑 error · ⊘ skip); the summary line goes to stdout. Add
--verbose to also mirror raw runner stdout/stderr, which is useful for
debugging harness behavior. Failure messages are persisted in reports either
way; console printing can be controlled with --failure-output show|hide or the
aliases --show-failure-output / --hide-failure-output. The exit code is
non-zero on any regression.
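As a rough illustration of the --log marks and the exit-code rule, the mapping could look like the sketch below. The function names, the `mark id` line layout, and the specific exit value 1 are assumptions; the README only states which mark belongs to which status and that the exit code is non-zero on any regression.

```typescript
// Hypothetical sketch: status-to-mark mapping and exit-code rule.
type TestStatus = "pass" | "fail" | "error" | "skip";

const MARKS: Record<TestStatus, string> = {
  pass: "✅",
  fail: "❌",
  error: "🛑",
  skip: "⊘",
};

// One normalized line per completed test (the layout is an assumption).
function formatLogLine(id: string, status: TestStatus): string {
  return `${MARKS[status]} ${id}`;
}

// Non-zero on any regression; the value 1 is an assumption.
function exitCodeFor(regressions: readonly string[]): number {
  return regressions.length > 0 ? 1 : 0;
}
```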
Committed under reports/<elide-version>/<short-digest>/<workload>/:
| file | purpose |
|---|---|
| <workload>.md | human rollup: pass-rate, regressions, new passes |
| impact.md / impact.json | failures clustered by root-cause signature, ranked by blast radius |
| changes.md / changes.json | diff vs the previous run (fixed / regressed / added / removed) |
| summary.json | counts + regression/new-pass ids (machine) |
| results.json.gz | every test's status (machine, for cross-version diffs) |
| pass-rate.svg | static pass/fail/error/skip chart for the run |
reports/index.md (+ index.json + pass-rate.svg) is the top-level matrix
of the latest run per suite. Machine-readable index entries point at the
workload-scoped report directory. Published via GitHub Pages.
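The fixed / regressed / added / removed buckets in changes.md and changes.json can be pictured with a small sketch like the one below. It is not the harness's implementation: `diffRuns` and its types are hypothetical, and the bucket definitions (for example, counting `error` as a failure, and leaving transitions to or from `skip` unbucketed) are assumptions.

```typescript
// Hypothetical sketch of a run-to-run diff; bucket definitions are assumptions.
type RunStatus = "pass" | "fail" | "error" | "skip";

interface RunDiff {
  fixed: string[];      // failing before, passing now
  regressed: string[];  // passing before, failing now
  added: string[];      // only present in the current run
  removed: string[];    // only present in the previous run
}

function diffRuns(
  prev: Map<string, RunStatus>,
  curr: Map<string, RunStatus>,
): RunDiff {
  const isBad = (s: RunStatus): boolean => s === "fail" || s === "error";
  const diff: RunDiff = { fixed: [], regressed: [], added: [], removed: [] };

  curr.forEach((now, id) => {
    const before = prev.get(id);
    if (before === undefined) diff.added.push(id);
    else if (isBad(before) && now === "pass") diff.fixed.push(id);
    else if (before === "pass" && isBad(now)) diff.regressed.push(id);
  });

  prev.forEach((_status, id) => {
    if (!curr.has(id)) diff.removed.push(id);
  });

  return diff;
}
```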
Passing --update-summaries also refreshes the generated compatibility block in
this README from the latest report index. This is intended for mainline bot runs:
run all suites, commit the changed reports/, expectations/*.ratchet.toml if
ratcheting, and README summary content, then push.
expectations/test262.toml is the hand-curated baseline:
[skip] # excluded from the run, with a reason
"test/intl402/**" = "Intl 402 not supported yet"
[fail] # expected failures (link an issue)
"test/built-ins/RegExp/property-escapes/**" = "partial (…)"
A run is green iff actual matches expected: new failures are regressions (red, fail CI); tests that newly pass are surfaced as new passes (advance the baseline). To accept the current failure set as the baseline (so only new breakage fails CI):
bun run testsuite --elide nightly --ratchet
This regenerates the machine-owned expectations/<workload>.ratchet.toml
(exact test ids) from the current failures. compare treats a test as
expected-fail if it matches a [fail] glob or is in the ratchet set;
[skip] always wins.
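The precedence rules above ([skip] first, then [fail] globs or the ratchet set) can be sketched as a small classifier. This is a simplified illustration, not the harness's compare step: `classify`, `globToRegExp`, and the verdict names are hypothetical, the glob matcher only understands `*` and `**`, and treating `error` like `fail` is an assumption.

```typescript
// Hypothetical sketch of expectation matching; not the real compare code.
type Status = "pass" | "fail" | "error" | "skip";
type Verdict = "skipped" | "ok" | "expected-fail" | "new-pass" | "regression";

interface Expectations {
  skip: string[];        // globs from [skip]
  fail: string[];        // globs from [fail]
  ratchet: Set<string>;  // exact test ids from <workload>.ratchet.toml
}

// Minimal glob support: "**" crosses directories, "*" stays within one segment.
function globToRegExp(glob: string): RegExp {
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    if (glob.startsWith("**/", i)) {
      out += "(?:.*/)?";
      i += 2; // consume "**/"
    } else if (glob.startsWith("**", i)) {
      out += ".*";
      i += 1; // consume "**"
    } else if (glob.charAt(i) === "*") {
      out += "[^/]*";
    } else {
      // Escape regex metacharacters in literal path characters.
      out += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${out}$`);
}

function classify(id: string, actual: Status, exp: Expectations): Verdict {
  const matches = (globs: string[]): boolean =>
    globs.some((g) => globToRegExp(g).test(id));

  if (matches(exp.skip)) return "skipped"; // [skip] always wins

  const expectedFail = matches(exp.fail) || exp.ratchet.has(id);
  const failed = actual === "fail" || actual === "error"; // assumption

  if (failed) return expectedFail ? "expected-fail" : "regression";
  if (actual === "pass" && expectedFail) return "new-pass";
  return "ok";
}
```

In this sketch a run would be green when no test classifies as `regression`; `new-pass` results are the candidates for advancing the baseline.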
A derived SQLite database (.harness/results.sqlite, gitignored) is built by
ingesting the committed results.json.gz files, and powers ad-hoc queries:
bun run harness/src/cli.ts db build # (re)build the DB from committed runs
bun run harness/src/cli.ts impact <workload> [semver] [digest] # impact-ordered failures for a run
bun run harness/src/cli.ts diff <workload> [A] [B] # version diff (default: two most recent)
bun run harness/src/cli.ts query "SELECT …" # read-only SQL over the results
The committed JSON files are the source of truth and the contract for a future static web UI; the SQLite DB is a disposable local index.
suites/test262/ Test262 (git submodule)
manifests/ curated suite slices for WPT, CPython, and jtreg
suites/drivers/ small bridge/wrapper programs used by external suites
expectations/ workload baselines + machine ratchets
reports/ committed per-version reports + top-level index
harness/ the Bun/TypeScript harness (src/, fixtures/, patches/)
docker/ harness images (image-ref + local install dir)
bin/run.ts Bun/TypeScript launcher: resolve --elide → docker build → docker run
bin/run executable alias for bin/run.ts
.devcontainer/ Codespaces dev environment
.github/workflows/ nightly + on-demand compliance runs, Pages publish
docs/superpowers/ design specs and implementation plans
bun install # installs root TypeScript/Bun types for bin/run.ts
bun run typecheck # type-check the host launcher
cd harness
bun install # applies the eshost `elide` host patch
bun test # unit tests
bun run typecheck
Or open the repo in a GitHub Codespace / VS Code devcontainer, which provisions Bun, Node, Docker-in-Docker, and Elide automatically.