Reputation:
I saw this question on Stack Overflow, asked by another user, and tried to write the code myself since I am practicing Scala and Spark.
The question was to find the per-key average from a list:
Assuming the list is: ( (1,1), (1,3), (2,4), (2,3), (3,1) )
The code was:
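val result = input.combineByKey(
  // createCombiner: the first value seen for a key becomes (sum, count)
  (v) => (v, 1),
  // mergeValue: fold another value into a partition-local (sum, count)
  (acc: (Int, Int), v) => (acc._1 + v, acc._2 + 1),
  // mergeCombiners: combine the (sum, count) pairs from different partitions
  (acc1: (Int, Int), acc2: (Int, Int)) => (acc1._1 + acc2._1, acc1._2 + acc2._2)).
  map { case (key, value) => (key, value._1 / value._2.toFloat) }
result.collectAsMap().foreach(println)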
So basically the combineByKey call creates an RDD of type [Int, (Int, Int)], where the first Int is the key and the value is an (Int, Int) pair: its first Int is the sum of all the values with the same key, and its second Int is the number of times the key appeared. The final map then divides the sum by the count to get the average for each key.
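To make the (sum, count) bookkeeping concrete, here is a minimal sketch of the same accumulation without Spark. It assumes a plain Scala List in place of the RDD, and the object name PerKeyAverage is made up for illustration. Since there is only one "partition" here, nothing corresponds to the mergeCombiners step.

```scala
object PerKeyAverage {
  // A plain List stands in for the RDD (assumption for illustration; no Spark needed).
  val input: List[(Int, Int)] = List((1, 1), (1, 3), (2, 4), (2, 3), (3, 1))

  // Fold every (key, value) pair into a Map from key to (sum, count).
  val sumsAndCounts: Map[Int, (Int, Int)] =
    input.foldLeft(Map.empty[Int, (Int, Int)]) { case (acc, (key, v)) =>
      acc.get(key) match {
        // First value for this key: plays the role of createCombiner.
        case None => acc.updated(key, (v, 1))
        // Key already seen: plays the role of mergeValue.
        case Some((sum, count)) => acc.updated(key, (sum + v, count + 1))
      }
    }

  // Divide sum by count; toFloat avoids integer division.
  val averages: Map[Int, Float] =
    sumsAndCounts.map { case (key, (sum, count)) => (key, sum / count.toFloat) }

  def main(args: Array[String]): Unit =
    averages.toSeq.sortBy(_._1).foreach(println)
}
```

For the list above, key 1 ends up with (4, 2), key 2 with (7, 2), and key 3 with (1, 1) before the division.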
I understand what is going on, but it breaks when I rewrite the code like this:
val result = input.combineByKey(
  (v) => (v, 1),
  (acc: (Int, Int), v) => (acc._1 + v, acc._2 + 1),
  (acc1: (Int, Int), acc2: (Int, Int)) => (acc1._1 + acc2._1, acc1._2 + acc2._2)).
  mapValues(value: (Int, Int) => (value._1 / value._2))
result.collectAsMap().foreach(println)
When I use mapValues instead of map with the case keyword, the code doesn't compile. It fails with: error: not found: type /
What is the difference between using map with case and using mapValues? I thought mapValues would just take the value (which in this case is an (Int, Int)) and return a new value, while the key stays the same for the key-value pair.
Upvotes: 0
Views: 1112
Reputation: 3890
Try dropping the type annotation on the mapValues parameter:
val result = input.combineByKey(
  (v) => (v, 1),
  (acc: (Int, Int), v) => (acc._1 + v, acc._2 + 1),
  (acc1: (Int, Int), acc2: (Int, Int)) => (acc1._1 + acc2._1, acc1._2 + acc2._2)).
  mapValues(value => (value._1 / value._2))
result.collectAsMap().foreach(println)
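For what it's worth, my understanding of why the annotated version fails: inside ordinary parentheses, a lambda parameter with a type annotation needs its own parentheses. Without them, the compiler seems to read value: (Int, Int) => (value._1 / value._2) as a type ascription on an expression named value, so everything after the colon is parsed as a type and / lands in type position, hence "not found: type /". The sketch below compares three forms that should compile. It assumes a plain Map in place of the combineByKey output, and mapVals is a made-up stand-in for mapValues so it runs without Spark. Also note that value._1 / value._2 on two Ints is integer division, so the sketch keeps the .toFloat from the map version.

```scala
object LambdaForms {
  // A plain Map stands in for the combineByKey output (assumption; no Spark needed).
  val sumsAndCounts: Map[Int, (Int, Int)] =
    Map((1, (4, 2)), (2, (7, 2)), (3, (1, 1)))

  // Hypothetical stand-in for mapValues: transforms values, leaves keys untouched.
  def mapVals[A, B](m: Map[Int, A])(f: A => B): Map[Int, B] =
    m.map { case (k, v) => (k, f(v)) }

  // 1. Annotated parameter wrapped in its own parentheses.
  val annotated: Map[Int, Float] =
    mapVals(sumsAndCounts)((value: (Int, Int)) => value._1 / value._2.toFloat)

  // 2. No annotation; the parameter type is inferred.
  val inferred: Map[Int, Float] =
    mapVals(sumsAndCounts)(value => value._1 / value._2.toFloat)

  // 3. Pattern-matching anonymous function that destructures the tuple.
  val destructured: Map[Int, Float] =
    mapVals(sumsAndCounts) { case (sum, count) => sum / count.toFloat }

  def main(args: Array[String]): Unit =
    destructured.toSeq.sortBy(_._1).foreach(println)
}
```

All three produce the same map; the case form is the one the original map version used, and it also names the two tuple fields instead of going through _1 and _2.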
Upvotes: 0
Reputation:
Never mind, I found a good article that covers my problem: http://danielwestheide.com/blog/2012/12/12/the-neophytes-guide-to-scala-part-4-pattern-matching-anonymous-functions.html
If anyone else has the same problem, it explains it well!
Upvotes: 0