Testing with Please
Tests in Please are run using plz test (or plz cover if you want to extract coverage information). The documentation for gentest() may be of use as well, as it explains the available arguments and what they do.
Specifying what to test
The most obvious way of running tests is to pass a build label to plz test, for example plz test //src/core/... to run all tests under a particular package.
Labels are also particularly useful for tests; Please accepts --include and --exclude flags which filter to a subset of what it was asked to run. These are marked onto the build rules like so:
go_test(
    name = "my_test",
    srcs = ["my_test.go"],
    labels = ["slow", "integration"],
)
You can then invoke Please like plz test --exclude slow to run all tests except those labelled as "slow".
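Conversely, --include narrows the run to only the tests carrying a given label; for example, using the labels from the rule above (the //src/... target here is illustrative):

plz test //src/... --include integration

Only tests labelled "integration" under that package will be run.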
There is one special label, manual, which always excludes tests from autodetection, so they will only be run if identified specifically. This is often useful to disable tests from general usage if needed (for example, if a test starts failing, you can disable it while investigating).
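For example (the rule and file names here are hypothetical):

go_test(
    name = "broken_test",
    srcs = ["broken_test.go"],
    labels = ["manual"],  # excluded from autodetection
)

plz test //src/... will now skip this test, but plz test //src:broken_test still runs it explicitly.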
Success and failure
Please expects two things from tests:
- That the test exits with code 0 if successful, or nonzero if unsuccessful
- That it writes its results into a file or directory defined by the environment variable RESULTS_FILE (unless the test has no_test_output = True, in which case it is evaluated only on its exit code).
All tests are considered to end in a binary state where they are either successful or unsuccessful, so tests with variable output such as load tests are not a good fit for this system.
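As a minimal sketch of this contract (assuming gentest() accepts a shell command via test_cmd, as its documentation describes; the rule name and command are illustrative):

gentest(
    name = "smoke_test",
    # Exit code 0 signals success; the results file uses the Go-style format below.
    test_cmd = 'echo "--- PASS: smoke_test (0.00s)" > "$RESULTS_FILE"',
)

A rule with no_test_output = True could omit writing to $RESULTS_FILE entirely and rely on its exit code alone.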
Result files are expected to be either Go-style test output:
=== RUN Hello
--- PASS: Hello (0.00s)
=== RUN Salutations
--- PASS: Salutations (0.00s)
or JUnit/Ant-style XML test output:
<testsuites time="0">
    <testsuite name="hello" tests="2" package="hello.test" time="0">
        <testcase name="Hello" classname="test" time="0">
            <system-err>Hello!</system-err>
        </testcase>
        <testcase name="Salutations" classname="test" time="0">
            <system-err>Salutations!</system-err>
        </testcase>
    </testsuite>
</testsuites>
There's no official specification for this XML structure; however, we try to be compatible with Maven's Surefire reports.
Tests can also be marked as flaky, which causes them to be automatically re-run several times until they pass. They are considered to pass if any one run passes. This is set as follows:
go_test(
    name = "my_test",
    srcs = ["my_test.go"],
    flaky = True,
)
By default they are run up to three times; you can alter this on a per-test basis by passing an integer instead of a boolean.
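For example, to allow up to five runs (interpreting the integer as the total number of attempts, consistent with the default of three above):

go_test(
    name = "my_test",
    srcs = ["my_test.go"],
    flaky = 5,  # re-run up to five times instead of the default three
)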
You should try to avoid marking tests as flaky though; in most cases flakiness indicates poor design within the test which can usually be mitigated (for example by adding locking or synchronisation instead of sleeping, etc). Flaky tests are a burden on other team members since they take extra time & resources to run, and there is always a risk that they will still fail; for example, if your test fails 50% of the time and is re-run up to three times, you can expect one in eight runs to fail three consecutive times, which will still result in an overall failure.
Hermeticity and reproducibility
An important concept of testing that Please tries to help with is that tests should be hermetic; that is, they should be isolated from local state that might cause them to fail unexpectedly. This is important to avoid nasty surprises later (for example, finding out that your test only passes if some other resources happen to have been built first, or that it fails arbitrarily when run in parallel with others).
In order to help ensure this, Please runs each test in isolation from the others. Each is run within its own temporary directory which contains only the test itself and the files it has declared that it will need. Most of the usual attributes of a build rule that refer to files are only available at build time (e.g. srcs, tools, deps etc). Instead, files that are needed at test runtime should be specified as data. These will appear within the test directory with their full repo path.
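For example (a hypothetical rule; the file names are illustrative):

go_test(
    name = "parser_test",
    srcs = ["parser_test.go"],
    # Fixture available at runtime, unlike srcs which exist only at build time.
    data = ["testdata/example.json"],
)

If this rule lives in the package src/parser, the fixture will appear at src/parser/testdata/example.json within the test's temporary directory.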
On Linux, tests can also be sandboxed, either by setting sandbox = True on the test rule in question, or by setting it globally in the config:
[test]
sandbox = on
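The per-rule form looks like this (rule names hypothetical):

go_test(
    name = "server_test",
    srcs = ["server_test.go"],
    sandbox = True,  # sandbox this test even if not enabled globally
)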
Sandboxing uses kernel namespaces to segregate the test from other processes:
- Networking: only a loopback interface will be available. Tests can use this to start servers, and ports opened will not be visible to the rest of the system, so clashes are impossible. Tests therefore can't access external network resources.
- Filesystem: each test gets its own in-memory filesystem mounted on /tmp, so they can't accidentally overwrite each other's files. The temporary directory is also mounted under here to prevent tests from walking up to parent directories to find files they haven't declared.
- IPC: the test runs in a new IPC namespace, so it can't use SysV IPC to communicate with other processes.
In general the networking separation is the most useful since it means tests
do not have to worry about finding unused ports or handling clashes.
These features require unprivileged user namespace support in the kernel, which is usually available from version 3.10 onwards (i.e. on the vast majority of systems today).
On other platforms, we distribute the same binary alongside Please to simplify configuration, but it currently has no effect.
If you want to explore what the sandbox is doing, you can use plz tool sandbox bash to get a shell within it; you'll observe that commands like ping and curl no longer work.
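For instance (an illustrative session; exact error output will vary by system):

plz tool sandbox bash
ping -c 1 8.8.8.8          # fails: only the loopback interface exists
curl https://example.com   # likewise cannot reach the external network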