Reputation: 381
I have followed the documentation for the Stratio Java spark-mongodb project here, but my code just gets stuck after printing Check1. I just cannot figure out what I am doing wrong.
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

JavaSparkContext sc = new JavaSparkContext("local[*]", "test spark-mongodb java");
SQLContext sqlContext = new SQLContext(sc);

// Connection options for the Stratio MongoDB datasource
Map<String, String> options = new HashMap<String, String>();
options.put("host", "host:port");
options.put("database", "database");
options.put("collection", "collectionName");
options.put("credentials", "username,database,password");

System.out.println("Check1");
DataFrame df = sqlContext.read().format("com.stratio.datasource.mongodb").options(options).load();
df.count();
df.show();
My pom file is as follows:
<dependencies>
<dependency>
<groupId>com.stratio.datasource</groupId>
<artifactId>spark-mongodb_2.11</artifactId>
<version>0.11.1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>1.5.2</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>1.5.2</version>
<type>jar</type>
</dependency>
</dependencies>
I have checked the dependency tree and everything seems fine there.
Upvotes: 1
Views: 192
Reputation: 381
I am going to answer my own question in the hope that it might help someone.
I finally managed to figure this out. The worst thing about these Spark libraries is that there are huge differences between library versions, and many of them are incompatible with each other:
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class TestClass {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "test spark-mongodb java");
        SQLContext sqlContext = new SQLContext(sc);

        // Connection options for the Stratio MongoDB datasource
        Map<String, String> options = new HashMap<String, String>();
        options.put("host", "localhost:27017");
        options.put("database", "test");
        options.put("collection", "mycol");

        DataFrame df = sqlContext.read().format("com.stratio.datasource.mongodb").options(options).load();
        df.registerTempTable("mycol");
        DataFrame result = sqlContext.sql("SELECT * FROM mycol");
        result.show();
    }
}
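Once the collection is registered as a temp table, it can also be filtered with ordinary Spark SQL. A minimal sketch continuing from the code above, assuming the collection has a numeric field called age (the field name is only illustrative):

// "age" is an assumed field name; replace it with a real field from your collection
DataFrame adults = sqlContext.sql("SELECT * FROM mycol WHERE age >= 18");
adults.show();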
and my pom file is as follows:
<dependency>
<groupId>com.stratio.datasource</groupId>
<artifactId>spark-mongodb_2.10</artifactId>
<version>0.11.2</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.1</version>
<type>jar</type>
</dependency>
Upvotes: 0
Reputation: 27692
What you are reporting is a known bug which was solved in version 0.11.2.
This bug caused some internal Akka actors not to be terminated. That, combined with the fact that the default Akka settings make the actor system non-daemonic, prevented the application from closing.
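To see why this hangs the process: a JVM only exits once all non-daemon threads have finished, so a lingering non-daemon actor thread keeps the application alive. A minimal sketch of that JVM behaviour (plain Java, unrelated to Spark or Akka):

// This program never exits on its own: the worker thread is non-daemon and
// never finishes, so the JVM stays alive after main() returns.
public class NonDaemonDemo {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            while (true) {
                try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            }
        });
        // worker.setDaemon(true);  // uncomment to let the JVM exit normally
        worker.start();
    }
}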
You have two options to fix your problem: either upgrade to version 0.11.2, where the bug is fixed, or include an application.conf file under the project resources folder (typically src/main/resources in a Maven project) that sets the actor system to daemonic, that is, with the following content:
akka {
daemonic = on
}
Upvotes: 2