Mohamed Shakeel

Reputation: 360

Exception in Spark (Java)

I am reading a directory of text files from my local machine in Spark. I get the following exception when I run the job with spark-submit:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/03/30 01:15:22 INFO SparkContext: Running Spark version 2.1.0
17/03/30 01:15:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/30 01:15:23 WARN Utils: Your hostname, Inspiron-N4050 resolves to a loopback address: 127.0.1.1; using 192.168.43.249 instead (on interface wlp9s0)
17/03/30 01:15:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/03/30 01:15:23 INFO SecurityManager: Changing view acls to: shakeel
17/03/30 01:15:23 INFO SecurityManager: Changing modify acls to: shakeel
17/03/30 01:15:23 INFO SecurityManager: Changing view acls groups to: 
17/03/30 01:15:23 INFO SecurityManager: Changing modify acls groups to: 
17/03/30 01:15:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(shakeel); groups with view permissions: Set(); users  with modify permissions: Set(shakeel); groups with modify permissions: Set()
17/03/30 01:15:23 INFO Utils: Successfully started service 'sparkDriver' on port 35160.
17/03/30 01:15:23 INFO SparkEnv: Registering MapOutputTracker
17/03/30 01:15:23 INFO SparkEnv: Registering BlockManagerMaster
17/03/30 01:15:23 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/03/30 01:15:23 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/03/30 01:15:23 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-ea876e3a-fd03-47df-b492-b6deccffe77d
17/03/30 01:15:23 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/03/30 01:15:23 INFO SparkEnv: Registering OutputCommitCoordinator
17/03/30 01:15:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/03/30 01:15:24 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.43.249:4040
17/03/30 01:15:24 INFO SparkContext: Added JAR file:/home/shakeel/workspace/geneselection/target/geneselection-0.0.1-SNAPSHOT.jar at spark://192.168.43.249:35160/jars/geneselection-0.0.1-SNAPSHOT.jar with timestamp 1490816724265
17/03/30 01:15:24 INFO Executor: Starting executor ID driver on host localhost
17/03/30 01:15:24 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40585.
17/03/30 01:15:24 INFO NettyBlockTransferService: Server created on 192.168.43.249:40585
17/03/30 01:15:24 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/03/30 01:15:24 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.43.249:40585 with 366.3 MB RAM, BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.43.249, 40585, None)
17/03/30 01:15:24 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.43.249, 40585, None)
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
    at org.apache.spark.SparkContext.wholeTextFiles(SparkContext.scala:858)
    at org.apache.spark.api.java.JavaSparkContext.wholeTextFiles(JavaSparkContext.scala:224)
    at geneselection.AttributeSelector.run(AttributeSelector.java:229)
    at geneselection.AttributeSelector.main(AttributeSelector.java:213)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.7.5
    at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
    at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
    at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:730)
    at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
    at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
    ... 14 more
17/03/30 01:15:24 INFO SparkContext: Invoking stop() from shutdown hook
17/03/30 01:15:24 INFO SparkUI: Stopped Spark web UI at http://192.168.43.249:4040
17/03/30 01:15:24 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/03/30 01:15:24 INFO MemoryStore: MemoryStore cleared
17/03/30 01:15:24 INFO BlockManager: BlockManager stopped
17/03/30 01:15:24 INFO BlockManagerMaster: BlockManagerMaster stopped
17/03/30 01:15:24 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/03/30 01:15:24 INFO SparkContext: Successfully stopped SparkContext
17/03/30 01:15:24 INFO ShutdownHookManager: Shutdown hook called
17/03/30 01:15:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-966721ae-388b-476b-972e-8e108c1454d9

I have no idea why this occurs. The directory on my machine contains some CSV files. The code that produces this exception is:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public void run(String path) {
    String master = "local[*]";
    SparkConf conf = new SparkConf()
            .setAppName(AttributeSelector.class.getName())
            .setMaster(master);
    JavaSparkContext context = new JavaSparkContext(conf);
    try {
        // Read every file in the directory as a (path, content) pair
        context.wholeTextFiles("/home/shakeel/Parts/");
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Loaded files");
    context.close();
}

I want to read the CSV files, perform feature selection on each file, and store each file's result in a queue for further processing. Why am I getting this exception?

I tried running a sample word-count application the same way and it works perfectly. Does this have something to do with the fact that the files are CSV files rather than plain text files?

Any help is appreciated.

Upvotes: 0

Views: 650

Answers (1)

Paul Back

Reputation: 1319

You're hitting a Jackson version collision. To see where the incompatible version is coming from, run the following from the top-level directory of your Maven project:

mvn dependency:tree -Dverbose -Dincludes=com.fasterxml.jackson.module
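
The verbose tree marks losing versions as omitted, which is what points you at the conflict. Purely as an illustration (the artifacts and versions below are hypothetical, not taken from this project), conflicting entries might look like:

[INFO] +- com.example:some-library:jar:1.0:compile
[INFO] |  \- com.fasterxml.jackson.module:jackson-module-scala_2.11:jar:2.7.5:compile
[INFO] \- org.apache.spark:spark-core_2.11:jar:2.1.0:compile
[INFO]    \- (com.fasterxml.jackson.module:jackson-module-scala_2.11:jar:2.6.5:compile - omitted for conflict with 2.7.5)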

Once you've found the dependency causing the issue, put the following into your pom inside the <dependency> element of the offending artifact, replacing (YOUR SCALA VERSION) with your Scala version (either 2.10 or 2.11):

<exclusions>
    <exclusion>
        <groupId>com.fasterxml.jackson.module</groupId>
        <artifactId>jackson-module-scala_(YOUR SCALA VERSION)</artifactId>
    </exclusion>
</exclusions>
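
For context, here is a minimal sketch of the resulting pom entry, assuming (purely for illustration) that the offending artifact is a hypothetical com.example:some-library and the build uses Scala 2.11:

<dependency>
    <!-- hypothetical offending artifact; substitute the one dependency:tree reports -->
    <groupId>com.example</groupId>
    <artifactId>some-library</artifactId>
    <version>1.0</version>
    <exclusions>
        <exclusion>
            <!-- drop this copy so Spark's own jackson-module-scala wins -->
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_2.11</artifactId>
        </exclusion>
    </exclusions>
</dependency>

An alternative, if the exclusions get unwieldy, is to pin the Jackson artifacts in <dependencyManagement> to the version your Spark release was built against (for Spark 2.1.x that should be 2.6.5):

<dependencyManagement>
    <dependencies>
        <!-- force a single Jackson version across all transitive dependencies -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.6.5</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_2.11</artifactId>
            <version>2.6.5</version>
        </dependency>
    </dependencies>
</dependencyManagement>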

Upvotes: 1
