Reputation: 2359
We are seeing duplicate builds of the same target in Bazel and wondering what could cause this.
Here is a sample output:
[52,715 / 55,135] 12 action running
Bazel package: some-pkg - Target: a_target - Generating files at bazel-out/host/bin/some-pkg/a_target_generate [for host]; 264s remote-cache, processwrapper-sandbox
Bazel package: some-pkg - Target: a_target - Generating files at bazel-out/k8-fastbuild/bin/some-pkg/a_target_generate; 264s remote-cache, processwrapper-sandbox
...
We have not been able to identify the issue. It looks like this is only happening on Linux but not on Macs.
The target a_target
is a custom_rule
target. It should be platform independent.
custom_rule = rule(
attrs = dict(
...
_custom_rule_java_binary = attr.label(
cfg = "host",
default = Label("//tools/bazel/build/rules/custom-rule:custom_rule_bin"),
executable = True,
),
_singlejar = attr.label(
cfg = "host",
default = Label("@bazel_tools//tools/jdk:singlejar"),
executable = True,
allow_files = True,
),
),
implementation = ...,
)
custom_rule_bin
is defined as follow:
java_library(
name = "custom_rule",
srcs = glob(["src/main/java/**/*.java"]),
deps = [
...,
],
)
java_binary(
name = "custom_rule_bin",
visibility = ["//visibility:public"],
main_class = "...",
runtime_deps = [
"@org_slf4j_simple",
":custom_rule",
"//some-pkg:some_pkg", # same some-pkg where a_target is built twice
],
)
The difference is that one says "for host" and the other doesn't. Anyone knows what the extra "for host" build is?
I do have a feeling that it's somehow related to the cfg
attribute on the custom rule. This is likely coming from some example code. We use the same value on all our rules which generate code. This custom rule is special because it requires code from the application being built by Bazel to run and generate additional code.
Any insights appreciated why host
would be wrong and what would be the correct value.
Any ideas/tips how to debug this?
Upvotes: 0
Views: 1640
Reputation: 5006
First, one note is that the host configuration is being mostly deprecated, and "exec" is usually preferred. Some info about that is here: https://bazel.build/rules/rules#configurations.
What's happening is that that target is being depended upon in multiple configurations, and so bazel will build that target in each configuration. You can use cquery to figure out what's going on
As a very simple example:
genrule(
name = "gen_bin",
outs = ["bin"],
srcs = [":gen_lib"],
exec_tools = [":gen_tool"],
cmd = "touch $@",
)
genrule(
name = "gen_tool",
outs = ["tool"],
srcs = [":gen_lib"],
cmd = "touch $@",
)
genrule(
name = "gen_lib",
outs = ["lib"],
cmd = "touch $@; sleep 10",
)
Building bin
, bazel runs the gen_lib genrule twice (in parallel):
$ bazel build bin
INFO: Analyzed target //:bin (5 packages loaded, 16 targets configured).
INFO: Found 1 target...
[1 / 5] 2 actions running
Executing genrule //:gen_lib; 1s linux-sandbox
Executing genrule //:gen_lib; 1s linux-sandbox
bazel config
gives the configurations that are currently in the in-memory build graph:
$ bazel config
Available configurations:
5b39bc31deb1f1d37f1f858e7eec3964394eacce5bede4456dd59d417af4a6e9 (exec)
723da02ae6d0c5577e98242c8f06ca1bd1c6d7b295c97345ac31b844bfe8f79c
8960923b9e7dc13418be101268efd8e57d80283213d18174705345598b699c6b
fd805cc1de357c04c7abac1b40bae600e3d9ee56a8d17af0c28b5031ca09bfb9 (host)
then cquery
:
$ bazel cquery "rdeps(//..., gen_lib)"
INFO: Analyzed 3 targets (0 packages loaded, 1 target configured).
INFO: Found 3 targets...
//:gen_lib (5b39bc3)
//:gen_lib (8960923)
//:gen_tool (5b39bc3)
//:gen_bin (8960923)
//:gen_tool (8960923)
INFO: Elapsed time: 0.052s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions
(cquery gives the first 7 digits of the configuration hash)
--output=graph
gives a dot graph which is a little more useful:
$ bazel cquery "rdeps(//..., gen_lib)" --output=graph > /tmp/graph
$ xdot /tmp/graph
So gen_bin is in the target configuration (8960923), and it depends on gen_lib, so gen_lib will also be built in the target configuration.
gen_bin also depends on gen_tool via the exec_tools attribute, and exec_tools builds everything in the exec configuration (5b39bc3).
gen_tool also depends on gen_lib, and since gen_tool is in the exec configuration, a version of gen_lib is built in the exec configuration.
(There's also another version of gen_tool in the target configuration in the output, and that's an artifact of using //...
in the "universe" argument of rdeps(), since //...
will capture every target. Similarly, doing bazel build //...
would cause gen_tool to be built twice.)
Upvotes: 1