KayV

Reputation: 13855

How to resolve error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(Int, Int)]?

I am learning Apache Spark and trying to execute a small program in the Scala shell.

I have started HDFS, YARN and the history server using the following commands:

start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver

Then, in the Scala shell, I ran the following commands:

 val lines = sc.textFile("/Users/****/Documents/backups/h/*****/input/ncdc/micro-tab/sample.txt")
 val records = lines.map(_.split("\t"))
 val filters = records.filter(rec => rec(1) != "9999" && rec(2).matches("[01459]"))
 val tuples = filters.map(rec => (rec(0).toInt, rec(1).toInt))
 val maxTemps = tuples.reduceByKey((a, b) => Math.max(a, b))

All commands execute successfully, except the last one, which throws the following error:

error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(Int, Int)]

I found some explanations like this one:

This comes from using a pair RDD function generically. The reduceByKey method is actually defined on the PairRDDFunctions class, which is reached through an implicit conversion from RDD, so it requires several implicit type classes. Normally, when working with simple concrete types, those are already in scope, but you should be able to amend your method to also require those same implicits.

But I am not sure how to achieve this. If I read the quoted advice correctly, it is about a generic helper method that declares those implicits itself; my rough guess at what that means is the sketch below (the method name and signature are my own), but I don't see how it applies to commands typed directly in the shell.
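
import org.apache.spark.SparkContext._   // implicit conversion RDD[(K, V)] -> PairRDDFunctions
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical generic helper: the ClassTag bounds supply the implicit
// evidence that the conversion to PairRDDFunctions needs, and the
// Ordering on the value type lets us take the maximum per key.
def maxByKey[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)])(implicit ord: Ordering[V]): RDD[(K, V)] =
  rdd.reduceByKey((a, b) => ord.max(a, b))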

Any help on how to resolve this issue?

Upvotes: 0

Views: 3924

Answers (1)

Gilad Ber

Reputation: 126

It seems you are missing an import. Try writing this in the console:

import org.apache.spark.SparkContext._

Then re-run the above commands. This import brings into scope the implicit conversion that lets you call reduceByKey on an RDD of pairs.
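
For example, the full sequence from the question would then look roughly like this in the spark-shell (same file path as in your question):

import org.apache.spark.SparkContext._   // brings the RDD -> PairRDDFunctions conversion into scope

val lines = sc.textFile("/Users/****/Documents/backups/h/*****/input/ncdc/micro-tab/sample.txt")
val records = lines.map(_.split("\t"))
val filters = records.filter(rec => rec(1) != "9999" && rec(2).matches("[01459]"))
val tuples = filters.map(rec => (rec(0).toInt, rec(1).toInt))
val maxTemps = tuples.reduceByKey((a, b) => Math.max(a, b))   // compiles now

Note that from Spark 1.3 onwards these implicits live on the RDD companion object and are picked up automatically, so the explicit import is only needed on older versions (it does no harm on newer ones).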

Upvotes: 2
