Reputation: 173
I am new to Spark. I am trying to process an RDD by sending each element of the RDD to the executors for further processing.
I am creating an RDD in driver code as below:
ArrayList<String> test = new ArrayList<String>();
test.add("conf1");
test.add("conf12");
JavaRDD<String> result = sc.parallelize(test);
I am not sure how to process this so that conf1 and conf12 are processed simultaneously on the executors. I have tried flatMap and map, but they did not work.
What is the best way to do this? Thanks in advance.
Upvotes: 0
Views: 367
Reputation: 1398
You have two elements in your collection, so you will most likely end up with two partitions, which can be processed in parallel on the executors. You can verify that by calling:
result.getNumPartitions();
What do you mean that map or flatMap doesn't work? Most likely you need to add an action to your transformations: Spark does not evaluate transformations until you call an action.
For example:
result.map(x -> x + " processed").collect();
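As a plain-Java analogy (no Spark cluster needed), `java.util.stream` behaves the same way: intermediate operations like `map` are lazy and nothing runs until a terminal operation is invoked. This is only a sketch to illustrate the transformation-vs-action distinction, not Spark itself:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyDemo {
    public static void main(String[] args) {
        // Like a Spark transformation: declaring map runs nothing yet.
        Stream<String> mapped = Stream.of("conf1", "conf12")
                .map(x -> x + " processed");

        // Like a Spark action: collect triggers evaluation of the pipeline.
        List<String> out = mapped.collect(Collectors.toList());
        System.out.println(out); // [conf1 processed, conf12 processed]
    }
}
```

In Spark, `collect()` similarly pulls the results back to the driver; for large RDDs you would prefer an action like `count()` or `saveAsTextFile(...)` instead of collecting everything into driver memory.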
Upvotes: 1