Christoph Walesch

Reputation: 2427

How can I Extract a File From a TAR containing TGZs using Gradle?

I have a tar file that contains multiple tar.gz files (a Docker image), and I want to extract a single file from it. Doing this in a shell script is trivial, but it seems a bit tricky when using Gradle.

This is what I have so far:

task extractOuter(type: Copy) {
    dependsOn jibBuildTar
    from tarTree(file("${buildDir}/jib-image.tar"))
    include "*.tar.gz"
    into "${buildDir}/tgz"
}

task extractInner(type: Copy) {
    dependsOn extractOuter
    from (fileTree(dir: "${buildDir}/tgz").collect { tarTree(it) }) {
        include "**/filename"
        includeEmptyDirs = false
    }
    into "${buildDir}/files"
}

It seemed to work at first, but it turned out that it fails occasionally: the extractInner task does not find the file. Maybe I don't use Gradle's lazy evaluation correctly.

How can I make it work? Or is there a totally different, more elegant way?

Upvotes: 0

Views: 1160

Answers (1)

Cisco

Reputation: 22952

Doing this in a shell script is trivial

You can continue using the shell script by using the Exec task type.
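
As a minimal sketch of that option, assuming the shell logic lives in a script called extract.sh in the project directory (a hypothetical name, since the question does not show the script) that takes the tar path and a target directory as arguments:

```groovy
task extractWithShell(type: Exec) {
    // Build the image tar first, as in the question
    dependsOn jibBuildTar
    // extract.sh and its two arguments are assumptions for illustration
    commandLine 'sh', "${projectDir}/extract.sh", "${buildDir}/jib-image.tar", "${buildDir}/files"
}
```

The trade-off is that the build then depends on a shell being available on the machine that runs it.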

but it seems a bit tricky when using Gradle.

What you have so far is how you do it with Gradle. The advantage with Gradle is that it won't perform work that has already happened. See Build Cache for more details.

It seemed to work at first, but it turned out that it fails occasionally: the extractInner task does not find the file. Maybe I don't use Gradle's lazy evaluation correctly.

This is called out in the above linked docs (emphasis mine):

Some tasks, like Copy or Jar, usually do not make sense to make cacheable because Gradle is only copying files from one location to another. It also doesn’t make sense to make tasks cacheable that do not produce outputs or have no task actions.

So you've declared your tasks, but you haven't configured them to produce any outputs. That may or may not contribute to the problem, since extractInner expects the output of extractOuter to be present.

Since Copy extends DefaultTask, you can use the outputs to set the task output.

task extractOuter(type: Copy) {
    dependsOn jibBuildTar
    outputs.dir(file("$buildDir/tgz"))
    from tarTree(file("${buildDir}/jib-image.tar"))
    include "*.tar.gz"
    into "${buildDir}/tgz"
}

task extractInner(type: Copy) {
    dependsOn extractOuter
    outputs.dir(file("$buildDir/files"))
    from (fileTree(dir: "${buildDir}/tgz").collect { tarTree(it) }) {
        include "**/filename"
        includeEmptyDirs = false
    }
    into "${buildDir}/files"
}
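
Regarding lazy evaluation: this is an assumption about the cause, not something confirmed by the question, but fileTree(...).collect { tarTree(it) } is evaluated while the build script is being configured. If the tgz directory is still empty at that point (for example after a clean), extractInner ends up with no sources. A sketch that wraps the expression in a closure, which from resolves lazily:

```groovy
task extractInner(type: Copy) {
    dependsOn extractOuter
    // Passing a closure to from() defers listing the tgz directory
    // until Gradle actually needs the source files, instead of
    // listing it while the build script is configured.
    from({ fileTree(dir: "${buildDir}/tgz").collect { tarTree(it) } }) {
        include "**/filename"
        includeEmptyDirs = false
    }
    into "${buildDir}/files"
}
```

This is the same pattern commonly used to unpack dependencies lazily with zipTree; whether it resolves the intermittent failure here would need to be checked against the actual build.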

Upvotes: 1
