Third-party dependencies
Sooner or later, most projects end up needing to depend on some third-party libraries, and one of the jobs of the build system is to manage those as well. Please is no exception, but this bears a little discussion since different build systems handle it quite differently.
In Please, third-party dependencies can be created in any BUILD file and manipulated as any other build rule. We encourage methods of fetching them that are repeatable; typically each language has one that matches up to a common package manager, for example:
- Python:

    pip_library(
        name = "my_library",
        version = "1.0.0",
    )

    See pip_library() for more information.

- Java:

    maven_jar(
        name = "my_library",
        id = "me:mylibrary:1.0.0",
    )

    See maven_jar() for more information.

- Go:

    go_module(
        name = "my_library",
        module = "github.com/me/my_library",
        version = "v1.0.0",
    )

    See go_module() for more information.

- C/C++: Unfortunately the C/C++ ecosystem lacks a de facto standard third-party repository. Thankfully, the Please build language is powerful and can reliably build nearly any part of your project. See the writing build rules documentation for more information on this.
Each of these requires explicit declarations of all of its dependencies in the BUILD file; this is how we pin dependencies & guarantee reproducibility.
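For instance, declaring a Python package along with its transitive dependencies explicitly might look roughly like the following sketch (the package names, versions and sibling targets here are purely illustrative):

pip_library(
    name = "requests",
    version = "2.31.0",
    deps = [
        # Each transitive dependency is its own pip_library target in this package.
        ":certifi",
        ":charset_normalizer",
        ":idna",
        ":urllib3",
    ],
)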
There are one or two alternatives that take slightly different approaches, e.g. python_wheel, which is more standalone, and remote_file, which is a general tool to download anything (although often more work is required to actually build it).
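As a rough sketch, a remote_file rule to fetch an arbitrary archive might look like this (the URL and output name are made up for illustration):

remote_file(
    name = "example_archive",
    url = "https://example.com/downloads/example-1.0.0.tar.gz",  # hypothetical URL
    out = "example-1.0.0.tar.gz",
)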
The typical idiom we use is to place BUILD files under a third_party directory, to make it clear where they're coming from. Commonly we separate them by language for multi-language repos as well. See third_party/go, third_party/python and third_party/java in Please's repo for some examples of what these look like.
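Code elsewhere in the repo then depends on those targets like any other build rule; for example, a Go library might reference a third-party target along these lines (the names are illustrative):

go_library(
    name = "mylib",
    srcs = ["mylib.go"],
    deps = [
        # Refers to a go_module rule defined in third_party/go/BUILD.
        "//third_party/go:my_library",
    ],
)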
There's no explicit command to download third-party dependencies (e.g. plz fetch or similar). Dependencies are built as part of the build process along with everything else, so their downloads can be parallelised with compiling other targets.
Subrepos
Please also supports a concept called "subrepos" which allows fetching arbitrary dependencies and attaching build rules to them. These can be depended on from other build rules and generally used as normal.
Subrepos are defined using builtins like http_archive or github_repo. These download a remote file and extract it, and make the contents available to other rules. In most cases you can choose to attach a BUILD file to them, but they can also use an existing one if appropriate.
For example (as seen in the Please repo):
github_repo(
    name = "gtest",
    bazel_compat = True,
    repo = "google/googletest",
    revision = "release-1.8.1",
)
Rules within subrepos can be referenced using a triple-slash prefix on rules, anywhere where a build rule would normally be accepted. For example:
cc_test(
    name = "my_test",
    ...
    deps = [
        "///third_party/cc/gtest//:gtest_main",
    ],
)
Note that the subrepo label (third_party/cc/gtest) includes the package in which we defined it earlier. In many ways subrepos mirror the equivalent feature in Bazel, but in this case they're more flexible since they're not limited to being defined at the repo root. For compatibility, we also accept an @ prefix for subrepos instead of ///.
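For example, the dependency above could also be written with the @ prefix; a brief sketch:

cc_test(
    name = "my_test",
    ...
    deps = [
        "@third_party/cc/gtest//:gtest_main",  # equivalent to the /// form above
    ],
)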
Comparison to other systems
For users familiar with Bazel, we expect that writing BUILD files won't be challenging; the main difference is that there is no direct equivalent to Bazel's WORKSPACE file. As mentioned above, third-party dependencies can live wherever you choose to put them in normal BUILD files.
If you've used Buck before, the model is pretty similar to fetching Maven jars using the bucklet for it. This is not entirely coincidental, since we were previously using Buck and Please initially had to mimic the same interface - but we also quite liked it and decided to keep the same approach.
If you're coming from Gradle or Maven, it's a little more alien due to being less language-specific and requiring full transitive dependencies to be specified. There is an add-on rule which is the closest equivalent; it works by communicating with Maven repositories to find dependencies and generating more BUILD rules for them. This can be a little unreliable though, since the Maven package format is complex, and your dependencies aren't fully within your control and can change between builds - we recommend maven_jar instead, but understand it's more work to set up.
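A sketch of what that looks like in practice, with the transitive dependencies pinned explicitly (the artifacts and versions below are illustrative, not a complete dependency list):

maven_jar(
    name = "guava",
    id = "com.google.guava:guava:31.1-jre",
    deps = [
        # Transitive dependencies are declared as their own maven_jar rules.
        ":failureaccess",
        ":jsr305",
    ],
)

maven_jar(
    name = "failureaccess",
    id = "com.google.guava:failureaccess:1.0.1",
)

maven_jar(
    name = "jsr305",
    id = "com.google.code.findbugs:jsr305:3.0.2",
)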
requirements.txt files from Python are not usually especially difficult to translate using pip_library; again we require listing transitive dependencies explicitly, but this is not normally too onerous for Python. Since Please needs to know precisely what will be output, the rules can sometimes need a little tweaking when the output names don't correspond to the package names (or often a package outputs a single .py file instead of a directory).
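For instance, a package whose installed output doesn't match its PyPI name might need something like the following (the names and version are illustrative; check pip_library()'s documentation for the exact arguments available in your Please version):

pip_library(
    name = "dateutil",
    package_name = "python-dateutil",  # the name used to install it from PyPI
    version = "2.8.2",
    outs = ["dateutil"],  # the directory the install actually produces
)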
go_module works pretty similarly to the usual go get tool, but again only outputs a single package at a time. Writing up the dependencies can be eased by using something like go list -f '{{.Deps}}' <package> to discover all the dependencies for the package in question.
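Putting that together, a module with its dependencies declared explicitly might look roughly like this sketch (module paths, versions and sibling targets are made up for illustration):

go_module(
    name = "testify",
    module = "github.com/stretchr/testify",
    version = "v1.8.4",
    deps = [
        # Dependencies discovered via go list, each declared as its own go_module rule.
        ":go-spew",
        ":difflib",
        ":yaml.v3",
    ],
)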
Verification
An important concept of Please is strict validation of inputs and outputs of each build. Third-party dependencies are an important case for this since they allow code you don't totally control into your build.
Please has two explicit mechanisms for controlling this.
Hash verification
Please can natively verify hashes of packages. Some of the built-in rules for fetching things from third-party repos have this option, and you can add it to your own genrules. For example, one of the Python libraries we use:
pip_library(
    name = "six",
    version = "1.9.0",
    outs = ["six.py"],
    hashes = ["sha256: 0c31ab7cf1a2761efa32d9a7e891ddeadc0d8673"],
)
This declares that the calculated sha256 hash of the package must match one of the given set, and it's a failure if not.
You can find the output hash of a particular target by running plz hash //third_party/python:six, which will calculate it for you, and you can enter it in the BUILD file. If it changes (for example when you update the version) plz can update the BUILD file for you via plz hash --update //third_party/python:six.
The reason for allowing multiple hashes is for rules that generate different outputs on different architectures; this is common for Python libraries which have a compiled component, for example.
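In that case you simply list one hash per platform; a sketch of what that might look like (the placeholder values would be replaced with the real hashes reported by plz hash on each architecture):

pip_library(
    name = "cffi",
    version = "1.15.1",
    hashes = [
        "sha256: <hash of the linux_amd64 output>",   # placeholder
        "sha256: <hash of the darwin_arm64 output>",  # placeholder
    ],
)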
For testing purposes you can run Please with the --nohash_verification flag, which will reduce hash verification failures to a warning message only.
Note that when using this you must be careful that the outputs of your rule are really deterministic. This is generally true for remote_file calls, but obviously only if the server returns the same thing every time for that URL. Some care should be taken with pip_library, since the outputs of a pip install for a package containing binary (not pure Python) modules are not bit-for-bit identical if compiled locally, only if you downloaded a precompiled wheel. Different Python and OS versions can affect it too.
The sha256: prefix is informative only; indeed any string can occur before the colon. In future we may extend this to allow specifying other hash types.
Licence validation
Please can attempt to autodetect licences from some third-party packages and inform you if they're not ones you'd accept. You mark licences in the .plzconfig file like so:
[licences]
accept = MIT
accept = BSD
reject = MS-EULA
By default, with no [licences] section, Please won't perform any licence checking. Once you've added some, any package with a licence must have a matching accepted licence in the config.
Currently we can autodetect licences from pip_library and maven_jars rules; you can also set them manually via the licences attribute on a rule.
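Setting a licence manually is just a matter of adding the attribute to the rule in question, e.g. (this example is illustrative):

maven_jar(
    name = "my_library",
    id = "me:mylibrary:1.0.0",
    licences = ["Apache-2.0"],  # declared manually in case autodetection can't find it
)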
It bears mentioning that this is done on a best-effort basis - since licences and their locations are not standardised in pip (and many other places) we can't always be fully confident about how to match licence names and hence don't try. For example, Apache 2, Apache-2.0 and The Apache Software License, version 2 all refer to the same licence despite being very different strings, whereas LGPL and AGPL are significantly different licences but only one letter apart.
Please also isn't a lawyer and can't provide advice about whether a specific licence is suitable for you or not. Only you can make that decision.