drobert

Reputation: 1310

sbt always recompiles full project in CI, even with caching?

I'm struggling to use SBT for a CI process with this basic workflow:

  1. compile tests
  2. cache ~/.sbt and ~/.ivy2/cache
  3. cache all target directories in my project

In a subsequent step:

  1. restore ~/.sbt and ~/.ivy2/cache
  2. restore full project, including previously-generated target directories with contained .class files and identical source code (it should be the same checkout)
  3. run tests via sbt test

100% of the time, sbt test recompiles the full project. I'd like to understand or debug why that's the case, given nothing has changed since the last compilation (well, nothing should have changed, so what's causing it to believe something has?)

I'm currently using circleci with a docker executor. This means there is a new docker instance, from the same image, running each step, though I would expect caching to address this.

Relevant sections of .circleci/config.yml (if you don't use circle, this should still be grok-able; I've annotated what I can):

---
version: 2

jobs:
  # compile and cache compilation
  test-compile:
    working_directory: /home/circleci/myteam/myproj
    docker:
      - image: myorg/myimage:sbt-1.2.8
    steps:
      # the directory to be persisted (cached/restored) to the next step
      - attach_workspace:
          at: /home/circleci/myteam
      # git pull to /home/circleci/myteam/myproj
      - checkout
      - restore_cache:
          # look for a pre-existing set of ~/.ivy2/cache, ~/.sbt dirs 
          # from a prior build
          keys:
            - sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - restore_cache:
          # look for pre-existing set of 'target' dirs from a prior build
          keys:
            - build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - run:
          # the compile step
          working_directory: /home/circleci/myteam/myproj
          command: sbt test:compile
      # per: https://www.scala-sbt.org/1.0/docs/Travis-CI-with-sbt.html
      # Cleanup the cached directories to avoid unnecessary cache updates
      - run:
          working_directory: /home/circleci
          command: |
            rm -rf /home/circleci/.ivy2/.sbt.ivy.lock
            find /home/circleci/.ivy2/cache -name "ivydata-*.properties" -print -delete
            find /home/circleci/.sbt -name "*.lock" -print -delete
      - save_cache:
          # cache ~/.ivy2/cache and ~/.sbt for subsequent builds
          key: sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/.ivy2/cache
            - /home/circleci/.sbt
      - save_cache:
          # cache the `target` dirs for subsequent builds
          key: build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/myteam/myproj/target
            - /home/circleci/myteam/myproj/project/target
            - /home/circleci/myteam/myproj/project/project/target
      # in circle, a 'workflow' undergoes several jobs, this first one 
      # is 'compile', the next will run the tests (see next 'job' section
      # 'test-run' below). 
      # 'persist to workspace' takes any files from this job and ensures 
      # they 'come with' the workspace to the next job in the workflow
      - persist_to_workspace:
          root: /home/circleci/myteam
          # bring the git checkout, including all target dirs
          paths:
            - myproj
      - persist_to_workspace:
          root: /home/circleci
          # bring the big stuff
          paths:
            - .ivy2/cache
            - .sbt

  # actually runs the tests compiled in the previous job
  test-run:
    environment:
      SBT_OPTS: -XX:+UseConcMarkSweepGC -XX:+UnlockDiagnosticVMOptions  -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Duser.timezone=Etc/UTC -Duser.language=en -Duser.country=US
    docker:
      # run tests in the same image as before, but technically 
      # a different instance
      - image: myorg/myimage:sbt-1.2.8
    steps:
      # bring over all files 'persist_to_workspace' in the last job
      - attach_workspace:
          at: /home/circleci/myteam
      # restore ~/.sbt and ~/.ivy2/cache via `mv` from the workspace 
      # back to the home dir
      - run:
          working_directory: /home/circleci/myteam
          command: |
            [[ ! -d /home/circleci/.ivy2 ]] && mkdir /home/circleci/.ivy2

            for d in .ivy2/cache .sbt; do
              [[ -d "/home/circleci/$d" ]] && rm -rf "/home/circleci/$d"
              if [ -d "$d"  ]; then
                mv -v "$d" "/home/circleci/$d"
              else
                echo "$d does not exist" >&2
                ls -la . >&2
                exit 1
              fi
            done
      - run:
          # run the tests, already compiled
          # note: recompiles everything every time!
          working_directory: /home/circleci/myteam/myproj
          command: sbt test
          no_output_timeout: 3900s

workflows:
  version: 2
  build-and-test:
    jobs:
      - test-compile
      - test-run:
          requires:
            - test-compile

Output from the second phase typically looks like:

#!/bin/bash -eo pipefail
sbt test

[info] Loading settings for project myproj-build from native-packager.sbt,plugins.sbt ...
[info] Loading project definition from /home/circleci/myorg/myproj/project
[info] Updating ProjectRef(uri("file:/home/circleci/myorg/myproj/project/"), "myproj-build")...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Compiling 1 Scala source to /home/circleci/myorg/myproj/project/target/scala-2.12/sbt-1.0/classes ...
[info] Done compiling.
[info] Loading settings for project root from build.sbt ...
[info] Set current project to Piranha (in build file:/home/circleci/myorg/myproj/)
[info] Compiling 1026 Scala sources to /home/circleci/myorg/myproj/target/scala-2.12/classes ...

What can I do to determine why this is re-compiling all sources this second time and alleviate it?
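One way to start narrowing this down is to check whether the files really are identical in both jobs, including their modification times. The following is only a sketch, not a known fix: `snapshot` is a hypothetical helper name, and it assumes GNU find, sort and sha256sum are available in the image. It prints a content hash and a nanosecond-precision mtime for every file, so the output from the compile job can be diffed against the output from the test job.

```shell
#!/usr/bin/env bash
# Sketch: record content hash and mtime (with fractional seconds) for every
# file under a directory, so two CI jobs can be compared with diff.
# Assumes GNU find (for -printf), sort and sha256sum.

snapshot() {
  # $1: directory to scan. Prints one line per file:
  #   <sha256>  <mtime as seconds.fraction>  <relative path>
  local dir="$1"
  (
    cd "$dir" || exit 1
    find . -type f -printf '%T@ %p\n' | LC_ALL=C sort -k2 |
      while read -r mtime path; do
        printf '%s  %s  %s\n' "$(sha256sum < "$path" | cut -d' ' -f1)" "$mtime" "$path"
      done
  )
}

# Possible usage (file names are placeholders):
#   compile job:  snapshot . > /tmp/snapshot-compile.txt   # persist this file
#   test job:     snapshot . > /tmp/snapshot-test.txt
#                 diff /tmp/snapshot-compile.txt /tmp/snapshot-test.txt
```

If the hashes match but the mtime column differs (for example, the fractional part is zeroed after the restore), that would point at the cache or workspace transfer rather than at the sources.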

I'm running sbt 1.2.8 with scala 2.12.8 in a linux container.


Update

I haven't solved the problem but I figured I'd share a workaround for the worst of my problem.

Primary problem: separating 'test compile' from 'test run'.
Secondary problem: faster builds without having to recompile everything on every push.

I have no solution to the secondary. For the primary:

I can run the scalatest runner from the CLI via scala -cp ... org.scalatest.tools.Runner rather than via sbt test to avoid any attempt at recompilation. The runner can operate against a directory of .class files.

Summary of changes:

  1. Update the docker container to include a scala cli install. (Unfortunate as I now need to keep these versions in sync)
  2. build phase: sbt test:compile 'inspect run' 'export test:fullClasspath' | tee >(grep -F '.jar' > test-classpath.txt)
    • compiles but also records a copy-pasteable classpath string, suitable for passing into scala -cp VALUE_HERE to run tests
  3. test phase: scala -cp "$(cat test-classpath.txt)" org.scalatest.tools.Runner -R target/scala-2.12/test-classes/ -u target/test-reports -oD
    • runs scalatest via the runner, using compiled .class files in target/scala-2.12/test-classes, using the classpath recorded in the compile phase, and printing to stdout as well as to a reports directory
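Because the recorded classpath contains absolute paths, it only works if the test job has the same jars in the same locations. A small guard like the one below can fail fast when that is not the case. This is a sketch under assumptions: `missing_classpath_entries` is a hypothetical helper, and it assumes a Unix-style classpath with entries separated by `:`.

```shell
#!/usr/bin/env bash
# Sketch: list classpath entries that do not exist on this machine.
# Assumes entries are separated by ':' (Unix-style classpath).

missing_classpath_entries() {
  # $1: file holding a single classpath string, e.g. test-classpath.txt
  local cp_file="$1" entry
  tr ':' '\n' < "$cp_file" |
    while IFS= read -r entry || [ -n "$entry" ]; do
      [ -n "$entry" ] || continue              # skip empty segments
      [ -e "$entry" ] || printf '%s\n' "$entry"  # print only missing entries
    done
}

# Possible usage before invoking the scalatest runner:
#   missing="$(missing_classpath_entries test-classpath.txt)"
#   if [ -n "$missing" ]; then
#     echo "classpath entries missing in this job:" >&2
#     echo "$missing" >&2
#     exit 1
#   fi
```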

I don't love this and it has some problems, but figured I'd share this workaround.

Upvotes: 13

Views: 2908

Answers (5)

steinybot

Reputation: 6134

I have the same problem. I gave up trying to get all the timestamps to match up and eventually found that I could use:

sbt 'set  Compile / compile / skip := true' 'test'

It still isn't perfect, sourceGenerators and probably some other things may still run but it is certainly a lot better than without it.

Upvotes: 0

Andrius Versockas

Reputation: 46

If you are using an sbt version newer than 1.0.4, the caching won't work for you, as the compiler will always invalidate everything. This zinc compiler issue has already been reported here: https://github.com/sbt/sbt/issues/4168

My suggestion would be to downgrade the sbt version for CI. Also check whether CI is changing the timestamps of files under .sbt or .ivy2. If it is, cache them separately by zipping and unzipping them.

I had the same issue for Bitbucket Pipelines CI and managed to successfully make it work here

Upvotes: 1

Pieter Bos

Reputation: 1574

I ran into a similar issue with a Travis build, and I suspect this solution will work for CircleCI as well. The root cause was that the cache is stored as a tar file, in which file modification times have only one-second resolution by default. You can specify an archive format that has sufficient resolution. The solution for me was to create a small script, travis_tar.sh:

#!/bin/bash
/bin/tar-orig --format=posix "$@"

And then replace the system tar with this script:

sudo mv /bin/tar /bin/tar-orig
sudo mv .travis/travis_tar.sh /bin/tar
sudo chmod +x /bin/tar

This can happen after the cache is loaded, because the vanilla system tar unpacks the posix-format tar file just fine.
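To see the resolution difference for yourself, a round trip through tar with each format can be compared. This is a minimal sketch, assuming GNU tar, GNU touch and GNU date on a filesystem that stores sub-second timestamps; `roundtrip_mtime` and the file name `Demo.scala` are made up for the illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the mtime a file has after being archived and extracted
# with a given tar format. Assumes GNU tar, touch and date.

roundtrip_mtime() {
  # $1: tar archive format, e.g. gnu or posix
  local format="$1" work
  work="$(mktemp -d)"
  mkdir "$work/in" "$work/out"
  printf 'object Demo\n' > "$work/in/Demo.scala"
  # Give the file an mtime with a non-zero fractional part.
  touch -d '@1559390400.123456789' "$work/in/Demo.scala"
  tar --format="$format" -cf "$work/a.tar" -C "$work/in" Demo.scala
  tar -xf "$work/a.tar" -C "$work/out"
  # Print the extracted file's mtime as seconds.nanoseconds.
  date -u -r "$work/out/Demo.scala" '+%s.%N'
  rm -rf "$work"
}

roundtrip_mtime gnu
roundtrip_mtime posix
```

With the gnu format I would expect the fractional part to come back as zeros, while the posix format should keep it where the filesystem supports that. A file whose timestamp moved this way can look modified to a tool that compares timestamps exactly.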

Upvotes: 0

Rich

Reputation: 15457

SBT is very finicky about recompiling, and Docker gives it particular trouble.

Take a look at:

Upvotes: 0

wbertelsen

Reputation: 1

I'm also running into this with sbt 1.2.8 in a GitLab job. Previously (with sbt 0.13), caching the target directories worked fine.

Right now I'm trying to debug manually by setting:

logLevel := Level.Debug,
incOptions := incOptions.value.withApiDebug(true).withRelationsDebug(true),

in my builds. This should print the reasons for invalidation. It produces way too much output to run in CI though, so I'm having trouble reproducing the exact conditions where I'm seeing the problem.
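One way to keep the CI log readable might be to send the full debug output to a file and print only a capped number of matching lines. The sketch below does that with plain grep. `summarize_log` is a hypothetical helper, and the search pattern is an assumption on my part: I have not confirmed the exact wording of the debug lines, so adjust it after looking at a real log.

```shell
#!/usr/bin/env bash
# Sketch: print at most N lines of a log that match a case-insensitive pattern.
# The pattern to use is an assumption; inspect a real log first.

summarize_log() {
  # $1: log file, $2: pattern, $3: maximum number of matching lines (default 50)
  local log="$1" pattern="$2" max="${3:-50}"
  # -n prefixes each match with its line number; -m stops after $max matches.
  grep -i -n -m "$max" -e "$pattern" "$log"
}

# Possible usage in the CI step (file name is a placeholder):
#   sbt test > sbt-debug.log 2>&1
#   summarize_log sbt-debug.log 'invalidat' 50
```

Note that grep exits with a non-zero status when nothing matches, which can fail a step that runs with `-e`. Keeping the full log file as a build artifact lets you dig further when the summary is not enough.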

Upvotes: 0
