Reputation: 93
I have a Spark cluster in Docker with 1 master and 2 workers. Apache Ignite is running on every worker.
If I open spark-shell and execute the following commands (which open the cache, store some values, and read the data back from the cache):
import org.apache.ignite.spark._
import org.apache.ignite.configuration._
val ic = new IgniteContext(sc, () => new IgniteConfiguration())
val sharedRDD: IgniteRDD[Integer, Integer] = ic.fromCache[Integer, Integer]("partitioned")
sharedRDD.savePairs(sc.parallelize(1 to 100, 10).map(i => (i, i)))
sharedRDD.count
then I receive:
res3: Long = 100
If I execute sharedRDD.collect().foreach(println), every number pair up to 100 is in the list:
(1,1)
(2,2)
(3,3)
(4,4)
(5,5)
(6,6)
(7,7)
(8,8)
(9,9)
(10,10)
...
(100,100)
It's perfect.
BUT when I quit with sys.exit, reopen spark-shell, and execute the following code (which only reads the data back from the cache):
import org.apache.ignite.spark._
import org.apache.ignite.configuration._
val ic = new IgniteContext(sc, () => new IgniteConfiguration())
val sharedRDD: IgniteRDD[Integer, Integer] = ic.fromCache[Integer, Integer]("partitioned")
sharedRDD.count
sharedRDD.collect().foreach(println)
then the result is
res0: Long = 60
and some number pairs are missing (for example 4, 9, and 10):
(1,1)
(2,2)
(3,3)
(5,5)
(6,6)
(7,7)
(8,8)
(11,11)
(12,12)
(13,13)
(14,14)
(15,15)
...
Does anybody have an idea why this happens?
Upvotes: 1
Views: 198
Reputation: 8390
There is an issue that causes Ignite nodes embedded into Spark executors to be started in server mode [1]; most likely this is the reason. As a workaround, you can force IgniteContext
to start everything in client mode:
val ic = new IgniteContext(sc, () => new IgniteConfiguration().setClientMode(true))
Of course, this assumes that you run in standalone mode with a separately running Ignite cluster.
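If you want to be explicit about how the client nodes find that standalone cluster, you can also wire the discovery by hand. Below is a minimal sketch assuming static IP discovery; the host names worker1/worker2 and the 47500..47509 port range are placeholders for whatever your Docker setup actually uses:
import org.apache.ignite.spark._
import org.apache.ignite.configuration._
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder

// Hypothetical worker host names; 47500..47509 is the default discovery port range.
def igniteCfg(): IgniteConfiguration = {
  val ipFinder = new TcpDiscoveryVmIpFinder()
  ipFinder.setAddresses(java.util.Arrays.asList("worker1:47500..47509", "worker2:47500..47509"))

  val discoSpi = new TcpDiscoverySpi()
  discoSpi.setIpFinder(ipFinder)

  // Client mode: the nodes started inside the executors only connect to the
  // standalone cluster and never own cache partitions themselves.
  new IgniteConfiguration().setClientMode(true).setDiscoverySpi(discoSpi)
}

val ic = new IgniteContext(sc, () => igniteCfg())
val sharedRDD: IgniteRDD[Integer, Integer] = ic.fromCache[Integer, Integer]("partitioned")
With all data owned exclusively by the standalone server nodes, the cache contents survive quitting and reopening spark-shell.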
[1] https://issues.apache.org/jira/browse/IGNITE-5981
Upvotes: 3