Yoshi

Reputation: 415

Is the order of `load` part of Bazel's specification, or just a consequence of the implementation?

I ran into a problem building a Go project with Bazel and found that the root cause was the order of the load statements that import @io_bazel_rules_go.

After receiving the answer, I checked Bazel's official docs to see whether this behavior is defined in the spec or is just an implicit consequence of the implementation. I haven't been able to check all of the official documentation yet, but the following docs seem relevant to this question, and it is still unclear how the order of load affects builds; in the case I experienced, the earlier declaration seems to win over the later one.

Can anyone clarify if this is spec or not?

Upvotes: 2

Views: 1492

Answers (1)

Jay Conrod

Reputation: 29701

The documentation on overriding external repositories is vague, but https://docs.bazel.build/versions/2.2.0/external.html is probably the best reference.

My understanding is that:

  • Repositories declared in WORKSPACE or in functions called by WORKSPACE are evaluated lazily. This is important because evaluating repository rules can be very expensive (they tend to download large files) and may not be necessary for a lot of builds.
  • It's not an error to declare a repository more than once with the same name. If a repository rule has not been evaluated, the last declaration will be used.
  • A repository rule is evaluated when any label within the repository is resolved. This includes:
    • Loading a .bzl file with a load statement.
    • Passing a label in an attribute to another repository rule which is evaluated.
    • Using ctx.path on a label in another repository rule which is evaluated.
    • (After WORKSPACE is fully resolved) building a target that loads something or depends on something in a repository.
  • After a repository rule is evaluated, it can no longer be overridden. Later declarations of repositories with the same name are silently ignored.
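
As a rough illustration of the rules above, here is a minimal WORKSPACE sketch. The two example.com URLs are placeholders invented for this example, not real rules_go releases, and the comments describe the behavior as I understand it rather than a documented guarantee.

```starlark
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# First declaration. Nothing is downloaded yet; repositories are lazy.
http_archive(
    name = "io_bazel_rules_go",
    urls = ["https://example.com/rules_go-A.tar.gz"],  # placeholder URL
)

# Resolving a label inside the repository forces it to be evaluated,
# so archive "A" is fetched here.
load("@io_bazel_rules_go//go:deps.bzl", "go_rules_dependencies")

# Declared after evaluation: silently ignored, "A" stays in effect.
http_archive(
    name = "io_bazel_rules_go",
    urls = ["https://example.com/rules_go-B.tar.gz"],  # placeholder URL
)
```

If the load statement were moved below the second http_archive, neither declaration would have been evaluated yet when the second one appears, so the last declaration ("B") would be used instead.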

This logic is confusing, and it can be hard to understand what version of something is used when a rule is declared multiple times.

  • Ideally, each rule should be declared once in WORKSPACE to minimize confusion. Of course, that may be difficult if your dependencies provide functions to declare transitive dependencies. You may end up manually inlining those functions in some cases.
  • Dependency functions like go_rules_dependencies should avoid overriding anything that's already been declared, by using a _maybe function like this:
def _maybe(repo_rule, name, **kwargs):
    # Declare the repository only if nothing with this name exists yet.
    if name not in native.existing_rules():
        repo_rule(name = name, **kwargs)
  • Arrange your WORKSPACE file to ensure direct dependencies are declared and possibly resolved earlier. This may make it harder to group related declarations, but the evaluation semantics will be clearer. A reasonable order is:
    • http_archive and git_repository repositories for rule sets and direct dependencies.
    • load statements for repository rules and dependency functions.
    • Other direct dependencies (e.g., go_repository).
    • Calls to dependency functions for transitive dependencies.
    • Toolchain registration.
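
The ordering above might look something like the following WORKSPACE skeleton. This is only a sketch: the example.com URLs and the com_github_example_lib module (with its version) are hypothetical, checksums are omitted, and the exact arguments that go_register_toolchains accepts differ between rules_go releases.

```starlark
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# 1. Rule sets and direct dependencies, each declared exactly once.
http_archive(
    name = "io_bazel_rules_go",
    urls = ["https://example.com/rules_go.tar.gz"],  # placeholder URL
)

http_archive(
    name = "bazel_gazelle",
    urls = ["https://example.com/bazel-gazelle.tar.gz"],  # placeholder URL
)

# 2. load statements for repository rules and dependency functions.
#    These force the two repositories above to be evaluated.
load("@io_bazel_rules_go//go:deps.bzl", "go_register_toolchains", "go_rules_dependencies")
load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies", "go_repository")

# 3. Other direct dependencies, declared before any dependency function runs.
go_repository(
    name = "com_github_example_lib",  # hypothetical module
    importpath = "github.com/example/lib",
    version = "v1.0.0",
)

# 4. Dependency functions fill in whatever transitive dependencies are missing.
go_rules_dependencies()

gazelle_dependencies()

# 5. Toolchain registration.
go_register_toolchains()
```

With this layout, a dependency function that uses the _maybe pattern leaves the declarations from steps 1 and 3 alone, because they already exist by the time it runs.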

For debugging, the native.existing_rules function can be quite handy. It returns a dictionary of all repositories declared so far, keyed by name, with the attributes each was declared with. Define a function that calls it and prints the result, then call that function from anywhere in WORKSPACE.
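
A minimal sketch of such a debugging helper, assuming it lives in a hypothetical debug.bzl file at the workspace root (the file name and function name are made up for this example, and the root directory needs a BUILD file for the load label to resolve). It treats the result of native.existing_rules as a dictionary keyed by repository name.

```starlark
# debug.bzl (hypothetical file name)

def print_existing_rules():
    """Prints every repository declared so far, with its attributes."""
    rules = native.existing_rules()
    for name in sorted(rules.keys()):
        print(name, rules[name])
```

It can then be called from WORKSPACE at whichever point you want to inspect:

```starlark
load("//:debug.bzl", "print_existing_rules")

print_existing_rules()
```

Calling it before and after a dependency function such as go_rules_dependencies shows which repositories that function added.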

Upvotes: 6
