jtitusj

Reputation: 3086

Spark: Avoiding namespace conflict when building a modified Spark

I am building a customized version of Spark into a jar file, and I want to use it alongside the default Spark build.

How do I change the namespace from org.apache.spark.allOfSpark to org.another.spark.allOfSpark without going through every file?

I want to do this in order to avoid conflict when importing modules. Thanks in advance.

Upvotes: 0

Views: 356

Answers (1)

Jonathan Taws

Reputation: 1188

Depending on the build tool you are using, you could use Maven's relocation feature to move your custom Spark build into a new package at build time. sbt and other build tools offer similar features (a sketch with sbt-assembly follows below).
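For instance, with sbt you could rely on the sbt-assembly plugin's shade rules. This is only a minimal sketch: the plugin version and the shaded.org.apache.spark prefix are examples, not something from your question.

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// build.sbt -- rename all Spark classes into a shaded prefix when assembling the jar
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.apache.spark.**" -> "shaded.org.apache.spark.@1").inAll
)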

If you specify what you are using to build your project, I can help further with your issue.

-- UPDATE

Here is some sample code for your pom.xml that should help you get started:

<project>
  <!-- Your project definition here, with the groupId, artifactId, and its dependencies -->
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <relocations>
                <relocation>
                  <pattern>org.apache.spark</pattern>
                  <shadedPattern>shaded.org.apache.spark</shadedPattern>
                </relocation>
              </relocations>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

</project>

This will effectively move all of Spark into a new package called shaded.org.apache.spark when you package your application (when you ask Maven to produce a jar).
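For illustration, here is a hedged sketch of how downstream Scala code could then reference both builds side by side, assuming the relocation above was applied to your custom jar and both jars are on the classpath (the object name is made up):

import org.apache.spark.SparkContext                       // stock Spark build
import shaded.org.apache.spark.{SparkContext => CustomSC}  // your relocated custom build

object RelocationDemo {
  def main(args: Array[String]): Unit = {
    // The two types live in different packages, so the imports no longer conflict.
    println(classOf[SparkContext].getName)  // org.apache.spark.SparkContext
    println(classOf[CustomSC].getName)      // shaded.org.apache.spark.SparkContext
  }
}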

If you need to exclude certain packages, you can use the <exclude> tag as shown in the Maven relocation documentation.

If what you are trying to achieve is simply to customize some parts of Spark, I would advise you to either fork Spark's code, directly rewrite parts of MLlib, and then build it just for yourself (or contribute it to the community if it can be useful).

Or you could simply pull Spark in as a dependency from Maven and just overwrite the classes you are modifying; Maven should then use your own classes instead of the ones in the original Spark package.
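As a rough sketch of that second approach (the file path and package are hypothetical, and the exact behaviour depends on classpath ordering, so test it carefully):

// Hypothetical file: src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala
// Your project declares spark-mllib as a regular dependency, and this locally
// modified copy lives in the same package as the original class.
package org.apache.spark.mllib.clustering

// ... your modified implementation goes here; classes compiled from your own
// sources are typically found before the dependency jar on the classpath,
// so this copy shadows the one shipped inside spark-mllib.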

Upvotes: 1
