Reputation: 1101
I'm trying to print or take the elements of a particular partition. In this question, I found an elegant way to do it in Scala using this code:
distData.mapPartitionsWithIndex((index: Int, it: Iterator[Int]) => it.toList.map(x => if (index == 5) { println(x) }).iterator).collect
I'm struggling to convert this to Python. Can someone help me out?
P.S.: Unlike the solution above, I just want to take the first 5 elements of the partition instead of printing them all.
Upvotes: 0
Views: 1387
You can take the first 5 elements of every partition with itertools.islice:
from itertools import islice
rdd.mapPartitions(lambda it: islice(it, 0, 5))
or, to take the first 5 elements only from the partition with index x:
rdd.mapPartitionsWithIndex(lambda i, it: islice(it, 0, 5) if i == x else [])
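The second variant can also be written as a named function, which makes the partition index and the element count explicit. This is only a minimal sketch: take_from_partition is a hypothetical helper name, not part of PySpark, and the partitions list below is made-up local data that stands in for an RDD so the logic can be checked without a Spark cluster.

```python
from itertools import islice


def take_from_partition(target_index, n):
    """Return a function with the (index, iterator) signature that
    mapPartitionsWithIndex expects. The name is hypothetical."""
    def _take(index, iterator):
        if index == target_index:
            # Lazily yield at most n elements of the chosen partition.
            return islice(iterator, n)
        # Every other partition contributes nothing.
        return iter(())
    return _take


# Local check without Spark: three fake "partitions" (assumed sample data).
partitions = [range(0, 10), range(10, 20), range(20, 30)]
take = take_from_partition(target_index=1, n=5)
result = [x for i, part in enumerate(partitions) for x in take(i, iter(part))]
print(result)  # [10, 11, 12, 13, 14]
```

With a real RDD, the same function would presumably be passed as rdd.mapPartitionsWithIndex(take_from_partition(5, 5)).collect(), which matches the partition index 5 used in the Scala snippet from the question.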
Upvotes: 1