Reputation: 1173
I tried googling, but couldn't find an answer.
Taken from Apache Spark: map vs mapPartitions?
What's the difference between an RDD's map and mapPartitions?
map works the function being utilized at a per element level while mapPartitions exercises the function at the partition level.
In this context, what is element level? Is it just an individual row?
Upvotes: 2
Views: 189
Reputation: 29155
In layman's terms: you have a shelf with 10 racks and 100 balls, as shown in the picture. You put 10 balls in each rack, so the 100 balls end up spread across the 10 racks. That is what balldata.repartition(10) does:
the data becomes uniformly distributed (rather than all 100 balls sitting in one or two racks).
Now, instead of applying your logic once per ball (element or row), you apply it once per rack (partition). That is the difference.
In this case an element is a ball (a single row) and a partition is a rack.
I advise you to go through the examples given there to understand it better.
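To make the ball-and-rack analogy concrete, here is a minimal sketch in plain Python that only mimics the two operations; it does not use Spark. Plain lists stand in for partitions, and the names `paint_ball`, `paint_rack`, `simulate_map` and `simulate_map_partitions` are hypothetical helpers made up for this illustration. In PySpark the corresponding calls would be `rdd.map(f)` and `rdd.mapPartitions(f)`.

```python
from typing import Callable, Iterator, List

# 100 balls spread evenly over 10 racks, mimicking balldata.repartition(10).
# Plain lists stand in for Spark partitions (an assumption for illustration).
balls = list(range(100))
racks: List[List[int]] = [balls[i::10] for i in range(10)]

# Counts how often each kind of function is invoked.
calls = {"per_ball": 0, "per_rack": 0}


def paint_ball(ball: int) -> int:
    """Element-level function: invoked once for every ball (row)."""
    calls["per_ball"] += 1
    return ball * 2


def paint_rack(rack: Iterator[int]) -> Iterator[int]:
    """Partition-level function: invoked once per rack (partition).

    It receives an iterator over the rack's balls and yields the results.
    Any one-off setup would go before the loop and run once per rack.
    """
    calls["per_rack"] += 1
    for ball in rack:
        yield ball * 2


def simulate_map(partitions: List[List[int]],
                 f: Callable[[int], int]) -> List[List[int]]:
    # Hypothetical stand-in for rdd.map(f): f sees one element at a time.
    return [[f(x) for x in part] for part in partitions]


def simulate_map_partitions(
        partitions: List[List[int]],
        f: Callable[[Iterator[int]], Iterator[int]]) -> List[List[int]]:
    # Hypothetical stand-in for rdd.mapPartitions(f): f sees a whole
    # partition as an iterator and is called once per partition.
    return [list(f(iter(part))) for part in partitions]


mapped = simulate_map(racks, paint_ball)
mapped_by_rack = simulate_map_partitions(racks, paint_rack)

print(calls)                     # {'per_ball': 100, 'per_rack': 10}
print(mapped == mapped_by_rack)  # True
```

Both versions produce the same values; what changes is how often the function is called, which matters when the function has a costly setup step that you would rather run once per rack than once per ball.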
courtesy/credits for image here
Upvotes: 3