Jason

Reputation: 421

Using Hive in a maven project

I have a project that I am migrating from Ant to Maven. The project uses a lightly customized Hive build. I figured I would import this build into our internal Maven repo and list it as a dependency in the project's POM. The problem I'm running into is that the Hive build just generates a bunch of jars in build/dist/lib. Some of these are the core Hive jars themselves and some are jars that Hive depends on. What's the best way to deal with these? Should I put all the core Hive jars into our internal repo and declare Hive's undocumented dependencies by hand in the new project's POM? Or should I bundle everything into a single jar of jars and deploy that to the repo? Would that approach even work? I'm still kind of a Maven newbie, so thanks for any help.

Upvotes: 0

Views: 898

Answers (1)

Zac Thompson

Reputation: 12665

You should create a POM for your modified Hive build and deploy it to your internal artifact repo along with the jar. This POM should specify any dependencies (i.e., those other jars). If some of those are also custom versions, create POMs for them as well; otherwise just reference them by their standard public coordinates (groupId/artifactId/version). This is the Maven way. Note that you don't need to use the POM to build Hive; it is only needed at deployment.
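As a rough sketch, a hand-written POM for one of the custom Hive jars might look like the following. The groupId, artifactId, and version of the custom artifact are hypothetical names chosen for illustration, and the two dependencies are only examples of publicly available libraries; the real list has to come from whatever actually sits in build/dist/lib for your build.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hand-written POM describing a custom Hive jar; used only for deployment,
     not for building Hive. All coordinates of the custom artifact below are
     hypothetical placeholders. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example.hive</groupId>
  <artifactId>hive-exec-custom</artifactId>
  <version>0.1-custom</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- Unmodified third-party jars: reference their public coordinates.
         These two entries are examples only; match the jars and versions
         found in build/dist/lib. -->
    <dependency>
      <groupId>commons-lang</groupId>
      <artifactId>commons-lang</artifactId>
      <version>2.4</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.15</version>
    </dependency>
  </dependencies>

  <!-- One way to deploy the jar together with this POM, assuming a repository
       id of "internal" configured in settings.xml and REPO_URL as a
       placeholder for your repository URL:
       mvn deploy:deploy-file -Dfile=hive-exec-custom.jar -DpomFile=pom.xml -DrepositoryId=internal -Durl=REPO_URL
  -->
</project>
```

Each core Hive jar would get its own small POM along these lines, so that each one carries its own dependency list into the repo.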

Why you should do this:

  • If you don't specify the dependencies correctly, you might run into issues when someone forgets to include the full set of dependencies in their project, or specifies the wrong version for one of them.
  • If you create a jar of jars, you might run into issues when someone uses the custom Hive "uber jar" together with a different version of one of those dependencies. You'll end up with multiple versions of the overlapping classes on the classpath.
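To illustrate the first point: once the custom Hive artifact is deployed with a POM that lists its dependencies, a consuming project only has to declare Hive itself, and Maven resolves the rest transitively. The fragment below is a sketch of the consuming project's POM, assuming a custom artifact deployed under the hypothetical coordinates com.example.hive:hive-exec-custom:0.1-custom.

```xml
<!-- Fragment of the consuming project's pom.xml. The coordinates are
     hypothetical and must match whatever was deployed to the internal repo. -->
<dependencies>
  <dependency>
    <groupId>com.example.hive</groupId>
    <artifactId>hive-exec-custom</artifactId>
    <version>0.1-custom</version>
    <!-- No need to list Hive's own dependencies here; they are pulled in
         from the POM deployed alongside the jar. -->
  </dependency>
</dependencies>
```

Running mvn dependency:tree in the consuming project shows what was actually resolved, which makes version conflicts visible instead of hiding them inside a bundled jar.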

Maven always works best when you tell it everything that is going on. Don't try to tell it what you think it wants to hear.

Upvotes: 1
