Reputation: 825
We're currently deploying our Flink applications as a fat jar built with the maven-shade-plugin. The problem is that each application jar ends up at approximately 130-140 MB, which is a pain to build and deploy every time. Is there a way to exclude the dependencies and deploy just a thin jar of about 50 kB to the cluster?
Upvotes: 2
Views: 3259
Reputation: 9102
Here's how we do it with Gradle!
We have two sub-projects:
- job: the stream job that we want to run
- runtime: additional runtime dependencies (e.g. a custom FileSystem implementation)

We create a new Gradle configuration for dependencies that are provided at runtime:
configurations {
    provided
    compile.extendsFrom provided
}
and then mark the provided dependencies as:
provided("org.apache.flink:flink-java:1.6.0") // flink java v1.6.0
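For context, here is a minimal sketch of how the full dependencies block might look with both kinds of dependency side by side. The second Flink artifact and the Guava entry are assumptions added for illustration; they are not from the original build.

```groovy
dependencies {
    // Supplied by the Flink cluster at runtime, so kept out of the job jar
    provided "org.apache.flink:flink-java:1.6.0"
    provided "org.apache.flink:flink-streaming-java_2.11:1.6.0" // assumed example

    // Regular dependency that gets bundled into the job jar
    // (hypothetical example; substitute whatever the job really needs)
    compile "com.google.guava:guava:26.0-jre"
}
```

Because compile extends from provided, the provided artifacts are still on the compile classpath; they are only left out when the jar is assembled.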
Then, we modify the jar task to build a jar without any provided dependencies:
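jar {
    dependsOn configurations.runtime
    from {
        // Bundle everything on the runtime classpath except the provided dependencies
        (configurations.runtime - configurations.provided).collect {
            it.isDirectory() ? it : zipTree(it)
        }
    } {
        // Drop signature files, which would no longer match the repackaged jar
        exclude 'META-INF/*.RSA'
        exclude 'META-INF/*.SF'
        exclude 'META-INF/*.DSA'
    }
    manifest {
        attributes 'Main-Class': 'com.example.Entrypoint'
    }
}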
The result is a jar with only the required (compile) dependencies bundled, which we then deploy using the Web UI.
As for the custom runtime dependencies, we build a custom Docker image and push the built artifact (runtime.jar, built using the same configuration as above) to the lib/ directory in Flink. You can do that manually too if you are not using Docker.
And, lastly, in our particular case there is no direct dependency defined between our job and the runtime dependency (which is discovered using reflection).
Upvotes: 2
Reputation: 957
You can place the dependency JARs on the cluster beforehand in Flink's lib directory (see Avoid Dynamic Classloading) and just upload the thin JAR on each job submission.
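If the build happens to use Gradle, one way to collect those JARs is a small Copy task. This is only a sketch: the flinkLib configuration, the stageFlinkLib task name, the output directory, and the Kafka connector entry are all hypothetical names chosen for illustration.

```groovy
configurations {
    // Hypothetical configuration: libraries to pre-install on the cluster
    flinkLib
}

dependencies {
    // Hypothetical example; list whatever the jobs need but Flink does not ship
    flinkLib "org.apache.flink:flink-connector-kafka-0.11_2.11:1.6.0"
}

// Copies the resolved JARs (including transitive ones) into build/flink-lib
task stageFlinkLib(type: Copy) {
    from configurations.flinkLib
    into "$buildDir/flink-lib"
}
```

Running gradle stageFlinkLib would fill build/flink-lib, whose contents can then be copied into Flink's lib directory on each node. JARs in lib are loaded when the Flink processes start, so the cluster typically needs a restart to pick up new ones.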
Upvotes: 3