Inquest, a test result repository in Rust
testrepository
For a long time I’ve used Robert Collins’ testrepository (testr) to run tests in many of the projects I work on. It’s a small, focused tool built around a simple idea: decouple the running of tests from the recording and querying of their results.
The way it works is straightforward. A test runner emits a subunit stream — a compact binary protocol for test results — and testrepository stores those streams in a per-project .testrepository/ directory. Once results are in the repository, you can ask questions like “which tests failed in the last run?”, “re-run only the failures”, “what are the slowest tests?”, or “what changed between this run and the previous one?”.
The killer feature, for me, has always been the failing-test loop. When a big test suite breaks, you don’t want to re-run the whole thing after every fix — you want to iterate on just the failures, and only re-run the full suite once they’re all green. testrepository made that workflow ergonomic long before most language-specific test runners had anything comparable, and many of them still don’t have a good answer for it.
testrepository has served me well for over a decade, but it has been largely unmaintained for a while, and I had some ideas for improvements that I wanted to try out. So I wrote a Rust port.
Inquest
Inquest is a Rust port of testrepository that has since grown a number of features of its own. The binary is called inq.
Goals
The goals are deliberately modest:
- a single static binary, no Python runtime required
- no need to write a dedicated config file for most projects
- compatible enough with testrepository’s workflow that I can switch projects over without retraining my fingers
- a richer on-disk format that captures more about each run (git commit, command line, duration, exit code, concurrency)
- good support for the languages I actually use day-to-day: Rust, Python, Go, and Node.js
- mostly Do What I Mean (DWIM) behaviour: tell me as quickly as possible which tests are failing and why, and be clever about how to get there
Inquest reads and writes subunit v2 streams, so anything that can produce subunit (directly or via one of the many converters) can feed into it.
Quick start
Inquest can usually figure out how to run your tests on its own. In a Rust, Python, Go or Node.js project:
$ cd my-project
$ inq
Or if the auto-detection doesn’t work, you can ask it to generate a config file and then run the tests:
$ inq auto
$ inq run
inq auto writes an inquest.toml describing how to invoke the test runner; inq run runs the tests, captures the subunit stream, and stores the results in a .inquest/ directory.
For a Rust project the generated config looks like:
test_command = "cargo subunit $IDOPTION"
test_id_option = "--test $IDFILE"
test_list_option = "--list"
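To make the placeholders concrete, here is a minimal Python sketch of how such a command template could be expanded. It assumes inquest substitutes $IDOPTION and $IDFILE the same way testrepository does; the helper name expand_command and the file path are made up for illustration and are not part of inquest.

```python
def expand_command(config, id_file=None):
    """Expand a test command template (hypothetical helper, not inquest's API).

    Assumption: $IDOPTION becomes test_id_option when specific tests are
    requested and an empty string otherwise, and $IDFILE becomes the path
    of a file listing the test ids to run.
    """
    id_option = ""
    if id_file is not None:
        id_option = config["test_id_option"].replace("$IDFILE", id_file)
    return config["test_command"].replace("$IDOPTION", id_option).strip()


config = {
    "test_command": "cargo subunit $IDOPTION",
    "test_id_option": "--test $IDFILE",
}

# Full run: no id file, so $IDOPTION expands to nothing.
full_run = expand_command(config)
# Partial run: the (illustrative) id file path is passed to the runner.
partial_run = expand_command(config, id_file="/tmp/failing-ids.txt")
```

Under these assumptions a full run invokes plain cargo subunit, while a partial run appends the id option pointing at the list of tests to execute.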
After the first run, the usual queries work:
$ inq stats # repository-wide statistics
$ inq last # results of the most recent run
$ inq failing # only the failing tests
$ inq slowest # the slowest tests in the last run
$ inq run --failing # re-run only what failed last time
The last one is the workflow I use most often: run the full suite once, fix the obvious failures, then iterate on inq run --failing until the list is empty.
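This loop depends on the repository remembering which tests are currently failing, even when a run covers only part of the suite. A minimal Python sketch of that bookkeeping follows. It assumes each run only updates the status of the tests it actually executed; update_failing is a hypothetical name, not inquest's implementation.

```python
def update_failing(failing, run_results):
    """Return the new set of failing test ids after a (possibly partial) run.

    failing: set of test ids known to be failing before the run.
    run_results: dict mapping test id -> "pass" or "fail" for tests that ran.
    Tests that did not run keep their previous status.
    """
    updated = set(failing)
    for test_id, outcome in run_results.items():
        if outcome == "fail":
            updated.add(test_id)
        else:
            updated.discard(test_id)
    return updated


# Full run: two failures.
failing = update_failing(set(), {"a": "pass", "b": "fail", "c": "fail"})
# Re-run only the failures after a fix: "b" now passes, "c" still fails.
failing = update_failing(failing, {"b": "pass", "c": "fail"})
```

After the second, partial run only "c" is left in the failing set, which is what the next failing-only run would pick up.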
A few things that aren’t in testrepository
Some of the features that have grown in inquest beyond the original testrepository functionality:
Timeouts. --test-timeout, --max-duration, and --no-output-timeout will kill a test process that is hanging or has stopped producing output. --test-timeout auto derives a per-test timeout from the historical duration of that test, which is handy for catching tests that hang.
Once the test runner is killed, the test is marked as failed and the next test is started, so a broken test doesn’t hold up the whole suite.
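How the auto timeout is computed is not spelled out here. As a purely illustrative sketch, one plausible rule is a multiple of the slowest recorded duration, with a floor so that very fast tests are not killed by ordinary jitter. The function name, factor, floor and default below are all assumptions, not inquest's actual values.

```python
def auto_timeout(history, factor=3.0, floor=5.0, default=60.0):
    """Derive a per-test timeout in seconds from past durations.

    history: list of recorded durations (seconds) for one test.
    factor, floor and default are illustrative values only.
    """
    if not history:
        # No history yet: fall back to a fixed default.
        return default
    return max(floor, factor * max(history))


slow_test = auto_timeout([10.0, 12.5, 11.0])  # 3 x the slowest run
fast_test = auto_timeout([0.02, 0.03])        # clamped to the floor
new_test = auto_timeout([])                   # no history, use default
```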
Ordering. --order can be used to run tests in a specific order, e.g. to run the slowest tests first, to run the tests that failed most recently first, or to run the widest variety of tests first to maximize the chance of finding a failure early on.
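Two of these orderings boil down to a sort over recorded history. The Python sketch below shows the idea; the function names and the shape of the history data are assumptions for illustration, not inquest's internals.

```python
def slowest_first(tests, durations):
    """Order tests by historical duration, longest first.

    Tests without a recorded duration are treated as 0 seconds and go last.
    """
    return sorted(tests, key=lambda t: durations.get(t, 0.0), reverse=True)


def recently_failed_first(tests, last_failed_run):
    """Order tests by the run number in which they last failed, newest first.

    last_failed_run maps test id -> run number; tests that never failed
    are treated as run -1 and end up at the back.
    """
    return sorted(tests, key=lambda t: last_failed_run.get(t, -1), reverse=True)


tests = ["parse", "network", "render"]
by_duration = slowest_first(
    tests, {"parse": 0.1, "network": 4.2, "render": 1.3}
)
by_failure = recently_failed_first(tests, {"render": 17, "parse": 12})
```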
Live progress. inq running tails the in-progress subunit stream on disk and reports observed/expected test counts, percent complete, elapsed wall-clock time, and an ETA derived from each test’s historical duration. Useful when a CI run is taking longer than you’d like.
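The ETA calculation can be sketched in a few lines of Python. This version assumes a serial run (it ignores concurrency) and uses a made-up default for tests with no recorded duration; estimate_progress is a hypothetical name.

```python
def estimate_progress(expected, completed, durations, default=1.0):
    """Estimate progress of a serial test run.

    expected: list of test ids the run is expected to execute.
    completed: set of test ids already observed in the stream.
    durations: test id -> historical duration in seconds.
    Returns (percent_complete, eta_seconds).
    """
    remaining = [t for t in expected if t not in completed]
    if expected:
        percent = 100.0 * (len(expected) - len(remaining)) / len(expected)
    else:
        percent = 100.0
    # Sum the historical durations of everything that has not run yet.
    eta = sum(durations.get(t, default) for t in remaining)
    return percent, eta


percent, eta = estimate_progress(
    expected=["a", "b", "c", "d"],
    completed={"a", "b"},
    durations={"a": 0.5, "b": 1.0, "c": 2.0, "d": 3.5},
)
```

With two of four tests done and 2.0 s plus 3.5 s of history left, the sketch reports 50% complete and an ETA of 5.5 seconds.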
Flakiness ranking. inq flaky ranks tests by the number of pass↔fail transitions between consecutive runs in which the test was recorded, so chronically broken tests rank low and genuinely flapping tests rank high.
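A small Python sketch shows why counting transitions separates flapping tests from permanently broken ones. It ranks by the raw transition count; whether inquest normalises or weights the score is not stated here, so treat the scoring as an assumption.

```python
def count_transitions(outcomes):
    """Count pass/fail flips between consecutive recorded outcomes."""
    return sum(1 for prev, cur in zip(outcomes, outcomes[1:]) if prev != cur)


def rank_flaky(history):
    """Rank tests by transition count, most flaky first.

    history: test id -> list of booleans (True = pass) in run order,
    containing only the runs in which the test was recorded.
    """
    scores = {test: count_transitions(o) for test, o in history.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


history = {
    "always_broken": [False, False, False, False, False],
    "flapping": [True, False, True, False, True],
    "regressed_once": [True, True, True, False, False],
}
ranking = rank_flaky(history)
```

The test that fails every time scores zero, the one that regressed once scores one, and the flapping test scores four and lands at the top.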
Comparing runs. inq diff <A> <B> shows what changed between two test runs — newly failing, newly passing, and tests that flipped state — which makes it easy to see whether your last change actually fixed (or broke) anything.
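Conceptually the comparison is a set difference over per-test outcomes. The Python sketch below only compares tests present in both runs and ignores tests that were added or removed; diff_runs and the dict layout are illustrative assumptions.

```python
def diff_runs(before, after):
    """Compare two runs given as dicts of test id -> "pass" or "fail".

    Only tests present in both runs are compared in this sketch.
    """
    common = before.keys() & after.keys()
    newly_failing = sorted(
        t for t in common if before[t] == "pass" and after[t] == "fail"
    )
    newly_passing = sorted(
        t for t in common if before[t] == "fail" and after[t] == "pass"
    )
    return {"newly_failing": newly_failing, "newly_passing": newly_passing}


run_a = {"parse": "pass", "render": "fail", "network": "pass"}
run_b = {"parse": "fail", "render": "pass", "network": "pass", "cache": "fail"}
changes = diff_runs(run_a, run_b)
```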
Bisecting git history. inq bisect <TEST> drives git bisect to find the commit that broke a given test. It takes its default known-good and known-bad commits from the recorded run history (the most recent run where the test passed, and the most recent where it failed), so in the common case there is no need to remember either: just point it at the test name and let it work.
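Picking those defaults from the run history can be sketched as a single pass over the recorded runs. The Python below assumes each run records a commit id and per-test outcomes; the data layout, commit ids and function name are hypothetical.

```python
def default_bisect_range(runs, test_id):
    """Pick default good/bad commits for bisecting one test.

    runs: list of dicts, oldest first, each with a "commit" and a
    "results" dict mapping test id -> "pass" or "fail".
    Returns (good_commit, bad_commit); either may be None if the test
    never passed or never failed in the recorded history.
    """
    good = bad = None
    for run in runs:
        outcome = run["results"].get(test_id)
        # Later runs overwrite earlier ones, so the most recent wins.
        if outcome == "pass":
            good = run["commit"]
        elif outcome == "fail":
            bad = run["commit"]
    return good, bad


runs = [
    {"commit": "c1", "results": {"test_parse": "pass"}},
    {"commit": "c2", "results": {"test_parse": "pass"}},
    {"commit": "c3", "results": {"test_parse": "fail"}},
    {"commit": "c4", "results": {"test_other": "pass"}},  # test not recorded
]
good, bad = default_bisect_range(runs, "test_parse")
```

Here the most recent passing run was at c2 and the most recent failing run at c3, so those would be the starting points handed to git bisect.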
Richer run metadata. inq info shows the git commit, command line, duration, exit code, and concurrency for a run, with a flag for whether the working tree was dirty when the run started. Combined with inq diff this makes it much easier to triangulate when a regression was introduced.
Rerun a previous run verbatim. inq rerun <ID> re-runs exactly the tests of a previous run, in the same order, forwarding the same -- arguments that the original run used. inq rerun -1 repeats the latest.
Web-based view. inq web serves a web-based view of the repository, with a dashboard of recent runs and detailed views of individual runs and tests.
Web UI
Most of the time I drive inquest from the command line, but for browsing historical results of a large suite — spotting flapping tests, drilling into a single test’s run history, or just getting a visual sense of which parts of the suite are hurting — a web view is more pleasant. inq web starts a local server with exactly that:
$ inq web
The repository overview shows totals and a per-test history grid where each cell is one run, coloured by outcome. Bands of red make it easy to pick out tests that have been broken for a long time, and isolated red cells in an otherwise green column point at flaky tests.
Drilling into an individual test gives you its full run history, a duration sparkline, and per-run pass/fail status.
Migrating from testrepository
If you already have a .testrepository/ directory full of historical runs, inq upgrade will migrate it into the new .inquest/ format, with a progress bar for the impatient.
The legacy .testr.conf (INI) format is still understood, so existing projects don’t have to be converted to inquest.toml immediately — though the TOML format is preferred for new projects.
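For reference, a typical .testr.conf keeps its settings in a [DEFAULT] section. The Python sketch below reads such a file with the standard library; the sample content is a common testrepository configuration for a Python project, not something taken from inquest, and nothing here reflects how inquest itself parses the file.

```python
import configparser

# A typical legacy testrepository config (sample content for illustration).
LEGACY = """\
[DEFAULT]
test_command=python -m subunit.run discover . $LISTOPT $IDOPTION
test_id_option=--load-list $IDFILE
test_list_option=--list
"""

# Disable interpolation so the values are read back verbatim.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(LEGACY)

# All testrepository settings live in the [DEFAULT] section.
settings = dict(parser.defaults())
for key, value in settings.items():
    print(f"{key} = {value!r}")
```

The key names line up with the ones in the generated inquest.toml shown earlier, which is presumably what makes it practical to keep understanding both formats.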
Trying it
The source is on GitHub at jelmer/inquest. To install from source:
$ cargo install inquest
In a project with a Rust, Python, Go or Node.js test suite:
$ inq
Bug reports and patches are welcome.