Sachin Sukumaran

Reputation: 715

How to get specific values from RDD in SPARK with PySpark

The following is my RDD; each record has 5 fields:

[('sachin', 200, 10, 4, True), ('Raju', 400, 40, 4, True), ('Mike', 100, 50, 4, False)]

Here I need to fetch only the 1st, 3rd and 5th fields. How can I do this in PySpark? The expected result is below. I tried reduceByKey in several ways but couldn't achieve it.

Sachin,10,True
Raju,40,True
Mike,50,False

Upvotes: 0

Views: 3488

Answers (1)

phi

Reputation: 11694

With a simple map:

rdd.map(lambda x: (x[0], x[2], x[4]))
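
For context, here is a minimal end-to-end sketch, assuming a local SparkSession and the sample data from the question (the variable names are illustrative):

    from pyspark.sql import SparkSession

    # Build a local SparkSession and grab its SparkContext.
    spark = SparkSession.builder.appName("select-fields").getOrCreate()
    sc = spark.sparkContext

    # Sample data from the question.
    rdd = sc.parallelize([
        ('sachin', 200, 10, 4, True),
        ('Raju', 400, 40, 4, True),
        ('Mike', 100, 50, 4, False),
    ])

    # Keep only the 1st, 3rd and 5th fields of each tuple.
    result = rdd.map(lambda x: (x[0], x[2], x[4]))

    print(result.collect())
    # [('sachin', 10, True), ('Raju', 40, True), ('Mike', 50, False)]

No shuffle is needed here: map is a narrow transformation applied to each record independently, whereas reduceByKey is for aggregating values that share a key.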

Upvotes: 2
