Neeraj Gupta

Reputation: 173

How to process RDD on executor

I am new to Spark. I am trying to process an RDD by sending each of its elements to the executors for further processing.

I am creating an RDD in driver code as below:

ArrayList<String> test = new ArrayList<String>();
test.add("conf1");
test.add("conf12");
JavaRDD<String> result = sc.parallelize(test);
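For context, a snippet like this presumably runs inside a full driver program that owns a `JavaSparkContext`. A minimal self-contained sketch, where the `local[*]` master and the app name are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RddDemo {
    public static void main(String[] args) {
        // local[*] runs the "executors" as threads inside this JVM (illustration only)
        SparkConf conf = new SparkConf().setAppName("rdd-demo").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        ArrayList<String> test = new ArrayList<>();
        test.add("conf1");
        test.add("conf12");
        JavaRDD<String> result = sc.parallelize(test);

        // collect() is an action: it triggers the job and returns the elements to the driver
        List<String> elems = result.collect();
        System.out.println(elems);

        sc.stop();
    }
}
```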

I am not sure how to process this RDD so that both conf1 and conf12 are processed simultaneously on the executors. I have tried flatMap and map, but they did not work.

What would be the best way to do this? Thanks in advance.

Upvotes: 0

Views: 367

Answers (1)

addmeaning

Reputation: 1398

You have two elements in your collection, so most likely you end up with two partitions. You can verify that by calling:

result.getNumPartitions();

What do you mean by "map or flatMap doesn't work"? Most likely you need to add an action after your transformations: Spark does not evaluate transformations until you call an action.

For example:

result.map(x -> x + " processed").collect();
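Since the question also mentions flatMap: it follows the same rule, and only the action triggers execution. A minimal sketch, assuming the Spark 2.x `flatMap` signature (which takes a function returning an `Iterator`) and a `local[*]` master:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class FlatMapDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("flatmap-demo").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> result = sc.parallelize(Arrays.asList("conf1", "conf12"));

        // map: exactly one output element per input element; the lambda runs on the executors
        List<String> mapped = result.map(x -> x + " processed").collect();

        // flatMap: zero or more output elements per input element;
        // here each string is split into its individual characters
        List<String> chars = result
                .flatMap(x -> Arrays.asList(x.split("")).iterator())
                .collect();

        System.out.println(mapped); // without an action like collect(), nothing executes
        System.out.println(chars);

        sc.stop();
    }
}
```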

Upvotes: 1
