bcxuezhe39

Reputation: 33

Apache Spark build on Amazon EC2 fails with scala-maven-plugin error

I am currently building Apache Spark on Amazon EC2 Linux VMs, following these instructions.

The tools I am using for the build:

apache-maven: 3.2.5

scala: 2.10.4

zinc: 0.3.5.3

Java: jdk1.7.0_79

Linux: 32-bit

This error message is raised:

Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:testCompile (scala-test-compile-first) on project spark-core_2.10: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.0:testCompile failed. CompileFailed -> [Help 1]

[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 
[ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles: 
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException

The linked Help page suggests that the error is caused by a plugin failure, but provides no details. What is the problem, and how can I resolve it?

Upvotes: 1

Views: 148

Answers (1)

Yayati Sule

Reputation: 1631

You can use the following pom.xml to build your project:

<properties>
  <spark.version>2.3.2</spark.version>
  <scala.version>2.11.12</scala.version>
  <scala.compat.version>2.11</scala.compat.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
<build>
  <sourceDirectory>src/main/scala</sourceDirectory>
  <testSourceDirectory>src/test/scala</testSourceDirectory>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>package.name.of.main.object</mainClass> <!-- fully qualified name of the object containing the main method, e.g. com.company.code.ObjectName -->
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.1.0</version>
      <executions>
        <execution>
          <!-- <phase>compile</phase> -->
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

    <plugin>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.3</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
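
For reference, a minimal Scala entry point matching the <mainClass> placeholder above might look like the sketch below. The package and object name (com.company.code.ObjectName) are assumptions for illustration and must match whatever you configure in the assembly plugin:

package com.company.code

import org.apache.spark.sql.SparkSession

// Minimal sketch of a main object; the package and object name are placeholders
// and must match the <mainClass> value in the maven-assembly-plugin configuration.
object ObjectName {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("example-app")
      .getOrCreate()

    // Trivial job, just enough to verify that the packaged JAR runs.
    val count = spark.range(1, 1000).count()
    println(s"Rows counted: $count")

    spark.stop()
  }
}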

In the directory containing the pom.xml, run mvn clean install. Your project will be packaged as an uber/fat JAR in the target directory, which you can then pass to spark-submit as usual.
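
For example (the JAR name, main class, and master URL below are placeholders; adjust them to your artifactId, version, main object, and cluster):

mvn clean install
spark-submit \
  --class com.company.code.ObjectName \
  --master local[*] \
  target/your-artifact-1.0-jar-with-dependencies.jar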

Please keep the following points in mind:

  1. Spark 2.x and recent Scala versions do not support Java 1.7; note that the compiler plugin above targets Java 1.8. If you wish to use the latest Spark 2.x series, you have to use Scala 2.11 or 2.12 in your dependencies.
  2. If you are using Spark 1.6, prefer Scala 2.11, since support for Scala 2.10 is not readily available.

Upvotes: 1
