Douglas Parker

Reputation: 1204

Building LLVM with Bazel

I've got a project currently using CMake, which I would like to switch over to Bazel. The primary dependency is LLVM, which I use to generate LLVM IR. Looking around, there doesn't seem to be a whole lot of guidance on this as only TensorFlow seems to use LLVM from Bazel (and auto-generates its config as far as I can tell). There was also a thread on bazel-discuss I found which discussed a similar issue, though my attempts to replicate it have failed.

My best attempt so far is this (fetcher.bzl):

def _impl(ctx):
    # Download LLVM master
    ctx.download_and_extract(url = "https://github.com/llvm-mirror/llvm/archive/master.zip")

    # Run `cmake llvm-master` to generate configuration.
    ctx.execute(["cmake", "llvm-master"])

    # The bazel-discuss thread says to delete llvm-master, but I've
    # found that only generated files are pulled out of master, so all
    # the non-generated ones get dropped if I delete this.
    # ctx.execute(["rm", "-r", "llvm-master"])

    # Generate a BUILD file for the LLVM dependency.
    ctx.file('BUILD', """
# Build a library with all the LLVM code in it.
cc_library(
    name = "lib",
    srcs = glob(["**/*.cpp"]),
    hdrs = glob(["**/*.h"]),

    # Include the x86 target and all include files.
    # Add those under llvm-master/... as well because only built files
    # seem to appear under include/...
    copts = [
        "-Ilib/Target/X86",
        "-Iinclude",
        "-Illvm-master/lib/Target/X86",
        "-Illvm-master/include",
    ],

    # Include here as well, not sure whether this or copts is
    # actually doing the work.
    includes = [
        "include",
        "llvm-master/include",
    ],
    visibility = ["//visibility:public"],
    # Currently picking up some gtest targets, I have that dependency
    # already, so just link it here until I filter those out.
    deps = [
        "@gtest//:gtest_main",
    ],
)
""")

    # Generate an empty workspace file
    ctx.file('WORKSPACE', '')

get_llvm = repository_rule(implementation = _impl)

And then my WORKSPACE file looks like the following:

load(":fetcher.bzl", "get_llvm")

git_repository(
    name = "gflags",
    commit = "46f73f88b18aee341538c0dfc22b1710a6abedef", # 2.2.1
    remote = "https://github.com/gflags/gflags.git",
)

new_http_archive(
    name = "gtest",
    url = "https://github.com/google/googletest/archive/release-1.8.0.zip",
    sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
    build_file = "gtest.BUILD",
    strip_prefix = "googletest-release-1.8.0",
)

get_llvm(name = "llvm")

I would then run this with bazel build @llvm//:lib --verbose_failures.

I would consistently get errors from missing header files. Eventually I found that running cmake llvm-master generated many header files into the current directory, but seemed to leave the non-generated ones in llvm-master/. I added the same include directories under llvm-master/ and that seems to catch a lot of the files. However, currently it seems that tblgen is not running and I am still missing critical headers required for the compilation. My current error is:

In file included from external/llvm/llvm-master/include/llvm/CodeGen/MachineOperand.h:18:0,
                 from external/llvm/llvm-master/include/llvm/CodeGen/MachineInstr.h:24,
                 from external/llvm/llvm-master/include/llvm/CodeGen/MachineBasicBlock.h:22,
                 from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:20,
                 from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h:13,
                 from external/llvm/llvm-master/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp:10:
external/llvm/llvm-master/include/llvm/IR/Intrinsics.h:42:38: fatal error: llvm/IR/IntrinsicEnums.inc: No such file or directory

Searching for this file in particular, I don't see any IntrinsicEnums.inc, IntrinsicEnums.h, or IntrinsicEnums.td. I do see a lot of Intrinsics*.td files, so maybe one of them generates this particular file?

It seems like tblgen is supposed to convert the *.td files to *.h and *.cpp files (please correct me if I am misunderstanding). However, this doesn't seem to be running. I saw that TensorFlow's project has a gentbl() BUILD macro, though it is not practical for me to copy it as it has way too many dependencies on the rest of TensorFlow's build infrastructure.
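To make the tblgen step concrete, it could in principle be written as a plain genrule inside the generated BUILD file. This is only a sketch under assumptions: `:llvm-tblgen` is a hypothetical label for a tblgen binary (built from source or prebuilt) that does not exist in the BUILD file above, the paths assume the llvm-master/ layout produced by fetcher.bzl, and `-gen-intrinsic-enums` is the backend flag that LLVM's own CMake files appear to use for IntrinsicEnums.inc.

```starlark
# Sketch only: ":llvm-tblgen" is a hypothetical target, not defined above.
genrule(
    name = "intrinsic_enums_gen",
    # Intrinsics.td pulls in other .td files, so make all of them available.
    srcs = glob(["llvm-master/include/llvm/**/*.td"]),
    # Written under include/ so that `#include "llvm/IR/IntrinsicEnums.inc"`
    # resolves once "include" is on the include path.
    outs = ["include/llvm/IR/IntrinsicEnums.inc"],
    tools = [":llvm-tblgen"],
    # The -I directory is derived from the location of Intrinsics.td
    # (.../include/llvm/IR/Intrinsics.td -> .../include).
    cmd = "$(location :llvm-tblgen) -gen-intrinsic-enums " +
          "-I $$(dirname $(location llvm-master/include/llvm/IR/Intrinsics.td))/../.. " +
          "$(location llvm-master/include/llvm/IR/Intrinsics.td) " +
          "-o $@",
)
```

The generated file would then need to be listed in the cc_library (for example under textual_hdrs) so that Bazel makes it visible to the compile actions. Every other generated *.inc file would need a similar rule, which is presumably what a macro like gentbl() automates.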

Is there any way to do this without something as big and complex as TensorFlow's system?

Upvotes: 2

Views: 2718

Answers (1)

Douglas Parker

Reputation: 1204

I posted to the llvm-dev mailing list and got some interesting responses. LLVM definitely wasn't designed to support Bazel and doesn't do so particularly well. It appears to be theoretically possible to have Ninja output all of its compile commands and then consume them from Bazel. This is likely to be pretty difficult, and it would require a separate tool that turns those commands into Skylark code for Bazel to run.

This seemed pretty complex for the scale of project I was working on, so my workaround was to download the pre-built binaries from releases.llvm.org. This included all the necessary headers, libraries, and tooling binaries. I was able to make a simple but powerful toolchain based around this in Bazel for my custom programming language.
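As a rough illustration of the pre-built approach, a WORKSPACE entry could look something like the sketch below. It is not copied from the example repositories: the repository name `llvm_prebuilt`, the target name, the placeholder URL and prefix, and the linkopts are all assumptions that need to be replaced with values matching the archive you actually download from releases.llvm.org.

```starlark
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "llvm_prebuilt",  # hypothetical repository name
    # Placeholder URL and prefix: substitute the release archive for your platform.
    urls = ["https://releases.llvm.org/<version>/<archive>.tar.xz"],
    strip_prefix = "<archive>",
    # Add sha256 = "..." once you have chosen a concrete archive.
    build_file_content = """
cc_library(
    name = "llvm",
    # Static libraries shipped in the release archive.
    srcs = glob(["lib/libLLVM*.a"]),
    hdrs = glob(["include/llvm/**/*.h", "include/llvm-c/**/*.h"]),
    # .def and .inc files are included textually by the headers.
    textual_hdrs = glob(["include/llvm/**/*.def", "include/llvm/**/*.inc"]),
    includes = ["include"],
    # Assumed system libraries; adjust for your platform.
    linkopts = ["-lpthread", "-ldl"],
    visibility = ["//visibility:public"],
)
""",
)
```

A target in the main workspace would then depend on `@llvm_prebuilt//:llvm`. The exact set of libraries, their link order, and the system libraries may need tuning; running the bundled `llvm-config --libs --system-libs` is one way to see what a given release expects.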

Simple example (limited but focused): https://github.com/dgp1130/llvm-bazel-foolang

Full example (more complex and less focused): https://github.com/dgp1130/sanity-lang

Upvotes: 2
