The BUILD language

Please's BUILD files typically contain a series of build rule declarations. These are invocations of builtins like java_binary which create new BUILD targets.

However, you can do much more with it; it is a fully capable programming language with which it's possible to script the creation of build targets in elaborate ways. See below for a formal description of the grammar; it is a subset of Python so should be fairly familiar.

You can do most things that one might expect in such a language: use for and if statements, define functions, create lists and dicts, and so on. Conventionally we keep complex logic in build_defs files, but at present there is no difference in accepted syntax between the two.
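
As a rough illustration of these constructs, the sketch below defines a function that loops over a list and filters it with an if statement. The names derive_test_names, srcs and suffix are hypothetical and not part of Please. Since the BUILD language is a subset of Python, the snippet also runs under a plain Python interpreter.

```python
# Hypothetical helper (not a Please builtin): derive test target names
# from the Go sources in a list, skipping anything else.
def derive_test_names(srcs, suffix="_test"):
    names = []
    for src in srcs:
        if src.endswith(".go"):
            # The list is extended with +=, which the grammar supports.
            names += [src.replace(".go", "") + suffix]
    return names

NAMES = derive_test_names(["parser.go", "lexer.go", "README.md"])
# NAMES is now ["parser_test", "lexer_test"]
```

In a real BUILD file, names computed this way would typically be passed on to a build rule rather than kept in a variable.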

One obviously needs a mechanism to import new code; in Please that is subinclude. This function takes the output of a build rule elsewhere in the repo and makes it available in the context of the currently executing file - for example, if it has defined a function, that function is now available in your BUILD file at the top level.

See the builtin rules documentation for a full description of the available builtin rules.

Types

The set of builtin types is again fairly familiar:

  • Integers (all integers are 64-bit signed integers)
  • Strings
  • Lists
  • Dictionaries
  • Functions
  • Booleans (named True and False)

There are no floating-point numbers or class types. In some cases lists and dicts can be "frozen" to prohibit modification when they may be shared between files; that's done implicitly by the runtime when appropriate.

Dictionaries are somewhat restricted in function; they may only be keyed by strings and cannot be iterated directly - i.e. one must use keys(), values() or items(). The results of all these functions are always consistently ordered.
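
A minimal sketch of iterating a dictionary under these restrictions follows. The VERSIONS dict and its contents are made up for illustration. The keys are passed through sorted() so that the result does not depend on which consistent order the runtime happens to use.

```python
# Hypothetical data: string keys only, as the BUILD language requires.
VERSIONS = {"protobuf": "3.1", "grpc": "1.0", "zlib": "1.2"}

# "for k in VERSIONS" is not allowed in a BUILD file; go through keys().
labels = [k + "-" + VERSIONS[k] for k in sorted(VERSIONS.keys())]
# labels is ["grpc-1.0", "protobuf-3.1", "zlib-1.2"]

# items() yields key/value pairs that can be unpacked in the loop.
pinned = [name for name, version in VERSIONS.items() if version.startswith("1.")]
# pinned contains "grpc" and "zlib"
```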

Functions

The following functions are available as builtins:

  • len(x) - returns the length of x
  • enumerate(seq) - returns a list of pairs of the index and object for each item in seq
  • zip(x, y, ...) - returns a list in which each element has one item from each argument
  • isinstance(x, type) - returns True if x is of the given type.
  • range([start, ]stop[, step]) - returns a list of integers up to stop (or from start to stop)
  • any(seq) - returns true if any of the items in seq are considered true.
  • all(seq) - returns true if all of the items in seq are considered true.
  • sorted(seq) - returns a copy of the given list with the contents sorted.
  • package_name() - returns the package being currently parsed.
  • join_path(x, ...) - joins the given path elements using the OS separator. It will intelligently handle repeated or missing separators.
  • split_path(path) - splits the given path into the directory and filename.
  • splitext(filename) - splits the given filename into base name and extension at the final dot.
  • basename(path) - returns the basename of a file
  • dirname(path) - returns the directory name of a file.
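
The sequence builtins behave much like their Python namesakes, so a short sketch may help. The file names below are invented. The path helpers (join_path, splitext and friends) and package_name() are left out because they exist only inside Please, whereas the rest of the snippet also runs under plain Python.

```python
srcs = ["b.go", "a.go", "c.go"]
outs = ["b.o", "a.o", "c.o"]

# zip pairs up the items of its arguments position by position.
rules = [f"{src}->{out}" for src, out in zip(srcs, outs)]
# rules is ["b.go->b.o", "a.go->a.o", "c.go->c.o"]

# enumerate yields (index, item) pairs.
index_of_a = [i for i, src in enumerate(srcs) if src == "a.go"]
# index_of_a is [1]

# range with start, stop and step; stop itself is not reached here.
every_other = [srcs[i] for i in range(0, len(srcs), 2)]
# every_other is ["b.go", "c.go"]

# sorted returns a sorted copy and leaves srcs untouched.
ordered = sorted(srcs)
# ordered is ["a.go", "b.go", "c.go"]

all_go = all([src.endswith(".go") for src in srcs])         # True
any_test = any([src.endswith("_test.go") for src in srcs])  # False
is_list = isinstance(srcs, list)                            # True
```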

The following are available as member functions of strings:

  • join(seq) - joins the elements of seq together with this string as a separator.
  • split(sep) - splits this string at each occurrence of the given separator.
  • replace(old, new) - returns a copy of this string with all instances of old replaced with new.
  • partition(sep) - breaks this string around the first occurrence of sep and returns a 3-tuple of (before, sep, after).
  • rpartition(sep) - breaks this string around the last occurrence of sep and returns a 3-tuple of (before, sep, after).
  • startswith(prefix) - returns true if this string begins with prefix
  • endswith(suffix) - returns true if this string ends with suffix
  • format(args...) - deprecated, prefer f-strings instead.
  • lstrip(cutset) - strips all characters in cutset from the beginning of this string.
  • rstrip(cutset) - strips all characters in cutset from the end of this string.
  • strip(cutset) - strips all characters in cutset from the beginning & end of this string.
  • find(needle) - returns the index of the first occurrence of needle in this string.
  • rfind(needle) - returns the index of the last occurrence of needle in this string.
  • count(needle) - returns the number of times needle occurs in this string.
  • upper() - returns a copy of this string converted to uppercase.
  • lower() - returns a copy of this string converted to lowercase.
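
As an illustration of the string member functions, the sketch below takes apart a build label. The label value and the variable names are hypothetical; only the member functions themselves come from the list above.

```python
label = "//src/core:core_lib"

# partition splits around the first ":" into (before, sep, after).
package, sep, name = label.lstrip("/").partition(":")
# package is "src/core", sep is ":", name is "core_lib"

# split and join convert between a string and a list of components.
flat = "_".join(package.split("/"))
# flat is "src_core"

is_absolute = label.startswith("//")   # True
colon_at = label.find(":")             # 10
slashes = label.count("/")             # 3
shouty = name.upper()                  # "CORE_LIB"
stem = name.replace("_lib", "")        # "core"
```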

The following are available as member functions of dictionaries:

  • get(key[, default]) - returns the item with the given key, or the default (None if that is not given).
  • setdefault(key[, default]) - If the given key is in the dict, return its value, otherwise insert it with the given value (None if that is not given).
  • keys() - returns an iterable sequence of the keys of this dictionary, in a consistent order.
  • values() - returns an iterable sequence of the values of this dictionary, in a consistent order.
  • items() - returns an iterable sequence of pairs of the keys and values of this dictionary, in a consistent order.
  • copy() - deprecated, use a comprehension if needed. Returns a shallow copy of this dictionary.
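
The difference between get and setdefault is easiest to see side by side. The following is only a sketch with a made-up config dict; it also shows the comprehension suggested as a replacement for the deprecated copy().

```python
config = {"opt": "-O2"}

level = config.get("opt", "-O0")             # key present: "-O2"
debug = config.get("debug")                  # key absent, no default: None
arch = config.setdefault("arch", "amd64")    # inserts and returns "amd64"
again = config.setdefault("arch", "arm64")   # already set: still "amd64"

# A comprehension over items() produces a shallow copy.
snapshot = {k: v for k, v in config.items()}
# snapshot is {"opt": "-O2", "arch": "amd64"}
```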

Finally, messages can be logged to Please's usual logging mechanism. Whether they are displayed depends on the -v flag; by default only messages at warning level and above are visible.

  • log.debug(msg[, args...])
  • log.info(msg[, args...])
  • log.notice(msg[, args...])
  • log.warning(msg[, args...])
  • log.error(msg[, args...])
  • log.fatal(msg[, args...]) - this will cause the process to exit immediately and unsuccessfully.

Style

We normally write BUILD files in an idiom which doesn't quite match standard Python style. The conventions are mostly inherited from working on Blaze; a brief explanation follows the example:


      # Taken from //src/core/BUILD in the Please repo
      go_library(
          name = "core",
          srcs = glob(["*.go"], exclude=["*_test.go", "version.go"]) + [":version"],
          visibility = ["PUBLIC"],
          deps = [
              "//third_party/go:gcfg",
              "//third_party/go:logging",
              "//third_party/go:queue",
          ]
      )
    

All arguments to build rules are passed as keywords. This is pretty important since (1) nobody will be able to read your BUILD file otherwise and (2) we don't guarantee not to change the order of arguments when we insert new ones. Fortunately Please will check this for you at runtime.

Arguments to functions like glob() and subinclude() are not necessarily passed as keywords.

We put spaces around the = for each argument to the build rule - we think it's easier to read this way.

Either single or double quotes work, as usual, but don't mix both in one file. We usually prefer double because that's what Buildifier (see below) prefers.

Lists either go all on one line:

          ["*_test.go", "version.go"]

or are broken across multiple lines like so:

          [
              "//third_party/go:gcfg",
              "//third_party/go:logging",
              "//third_party/go:queue",
          ]

Indentation is normally four spaces. Tabs will be rejected by the parser: dealing with indentation in a whitespace-significant language is tricky enough without introducing tabs to complicate the situation as well.

We generally try to order lists lexicographically where it does not matter (for example deps or visibility).

If you'd like an autoformatter for BUILD files, Google's Buildifier is very good & fast. We use it both internally & on the Please repo.

Grammar

The grammar is defined as (more or less) the following in EBNF, where Ident, String, Int and EOL are token types emitted by the lexer.

# Start symbol for the grammar, representing the top-level structure of a file.
file_input = { statement };

# Any single statement. Must occur on its own line.
statement = ( "pass" | "continue" | func_def | for | if | return |
          raise | assert | ident_statement | literal ) EOL;
return = "return" [ expression { "," expression } ];
raise = "raise" expression;
assert = "assert" expression [ "," String ];
for = "for" Ident { "," Ident } "in" expression ":" EOL { statement };
if = "if" expression ":" EOL { statement }
     [ "elif" expression ":" EOL { statement } ]
     [ "else" ":" EOL { statement } ];
func_def = "def" Ident "(" [ argument { "," argument } ] ")" ":" EOL
           [ String EOL ]
           { statement };
argument = Ident [ ":" String { "|" String } ] { "&" Ident } [ "=" expression ];
ident_statement = Ident
                  ( { "," Ident } "=" expression
                  | ( "[" expression "]" ( "=" | "+=" ) expression)
                  | ( "." ident | call | "=" expression | "+=" expression ) );

# Any generalised expression, with all the trimmings.
expression = [ "-" | "not" ] value [ operator expression ]
             [ "if" expression "else" expression ];
string = [ "f" | "r" ] String;
value = ( string | Int | "True" | "False" | "None" | list | dict | parens | lambda | ident )
        [ slice ] [ ( "." ident | call ) ];
ident = Ident { "." ident | call };
call = "(" [ arg { "," arg } ] ")";
arg = expression | ident "=" expression;
list = "[" expression [ { "," expression } | comprehension ] "]";
parens = "(" expression { "," expression } ")";
dict = "{" expression ":" expression [ { "," expression ":" expression } | comprehension ] "}";
comprehension = "for" Ident { "," Ident } "in" expression
                [ "for" Ident { "," Ident } "in" expression ]
                [ "if" expression ];
slice = "[" [ expression ] [ ":" expression ] "]";
lambda = "lambda" [ lambda_arg { "," lambda_arg } ] ":" expression;
lambda_arg = Ident [ "=" expression ];
operator = ("+" | "-" | "%" | "<" | ">" | "and" | "or" | "is" |
            "in" | "not" "in" | "==" | "!=" | ">=" | "<=" );

As mentioned above, this is similar to Python but lacks the import, try, except, finally, class, global, nonlocal, while and async keywords. The implementation nonetheless disallows using these as identifiers, since some tools might attempt to operate on the file using Python's ast module for convenience, which would not be possible if those keywords were used as names.
As a result, while raise and assert are supported, it's not possible to catch and handle the resulting exceptions. These hence function only to signal an error condition which results in immediate termination.
Note that assert is never optimised out, as it can be in Python.
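
Since errors cannot be caught, assert and raise are mainly useful for validating arguments early. The sketch below uses a hypothetical check_name function. Under plain Python a failure surfaces as an AssertionError, whereas in a BUILD file it results in immediate termination as described above.

```python
# Hypothetical validation helper, not part of Please.
def check_name(name):
    assert name, "name must not be empty"
    assert "/" not in name, "name must not contain a slash"
    return name

ok = check_name("core")
# ok is "core"; check_name("src/core") would fail the second assert
```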

A more limited set of operators than in Python is available. The provided set is considered sufficient for use in BUILD files.
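
To give a feel for what the available operators and expression forms cover, here is a small sketch using comparison and boolean operators, a comprehension with two for clauses, slicing and an inline if. All names and values are illustrative.

```python
platforms = ["linux", "darwin"]
archs = ["amd64", "arm64"]

# Two "for" clauses plus a filter, as the comprehension rule allows.
targets = [p + "_" + a for p in platforms for a in archs if (p == "linux") or (a == "arm64")]
# targets is ["linux_amd64", "linux_arm64", "darwin_arm64"]

first = targets[0]                            # "linux_amd64"
rest = targets[1:]                            # ["linux_arm64", "darwin_arm64"]
mode = "dbg" if len(targets) > 2 else "opt"   # "dbg"
has_darwin = "darwin_arm64" in targets        # True
no_windows = "windows_amd64" not in targets   # True
```

The filter is parenthesised to make the grouping explicit rather than relying on operator precedence.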

Function annotations similar to PEP-3107 / PEP-484 are available, although here they have first-class meaning as type hints. Arguments are annotated with the expected type or types (separated by |), and when the function is called the type of each argument is verified to match. This makes it easier to give useful feedback to users if they make mistakes in their BUILD files (e.g. passing a string where a list is required).
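
A sketch of an annotated function follows. The function filegroup_name and its arguments are hypothetical. Plain Python ignores annotations at runtime, so an explicit isinstance assertion is added here purely to mimic the verification that Please performs by itself in a BUILD file.

```python
# In a BUILD file the annotations alone are verified on each call; several
# accepted types would be written with |, for example srcs:str|list.
def filegroup_name(name:str, srcs:list, suffix:str="_files"):
    # Only needed under plain Python, which does not enforce annotations.
    assert isinstance(srcs, list), "srcs must be a list"
    return name + suffix

result = filegroup_name(name = "core", srcs = ["a.go", "b.go"])
# result is "core_files"; passing srcs = "a.go" would be rejected
```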

User-defined varargs and kwargs functions are not supported.

PEP-498 style "f-string" interpolation is available, but it is deliberately much more limited than in Python; it can only interpolate variable names rather than arbitrary expressions.
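
In practice this restriction means that anything beyond a bare variable has to be assigned to a name first. A minimal sketch, with invented variable names:

```python
name = "core"
version = "1.9"

# Only plain variable names may appear inside the braces.
archive = f"{name}-{version}.tar.gz"
# archive is "core-1.9.tar.gz"

# An expression such as name.upper() is assigned to a variable first.
upper_name = name.upper()
banner = f"Building {upper_name}"
# banner is "Building CORE"
```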