KayV

Reputation: 13855

How to resolve error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(Int, Int)]?

I am learning Apache Spark and trying to execute a small program in the Scala shell.

I have started HDFS, YARN and the history server using the following commands:

start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver

Then, in the Scala shell, I ran the following commands:

 val lines = sc.textFile("/Users/****/Documents/backups/h/*****/input/ncdc/micro-tab/sample.txt")
 val records = lines.map(_.split("\t"))
 val filters = records.filter(rec => rec(1) != "9999" && rec(2).matches("[01459]"))
 val tuples = filters.map(rec => (rec(0).toInt, rec(1).toInt))
 val maxTemps = tuples.reduceByKey((a, b) => Math.max(a, b))

All commands execute successfully, except the last one, which throws the following error:

error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(Int, Int)]

I found some explanations like this one:

This comes from using a pair RDD function generically. The reduceByKey method is actually defined on the PairRDDFunctions class, which is reached through an implicit conversion from RDD, so it requires several implicit type classes. Normally, when working with simple concrete types, those are already in scope, but you should be able to amend your method to also require those same implicits.

But I am not sure how to achieve this. If I read the quoted advice correctly, it is about a generic helper method that declares those implicits itself; my rough guess at what that means is the sketch below (the method name and signature are my own), but I don't see how it applies to commands typed directly in the shell.
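
import org.apache.spark.SparkContext._   // implicit conversion RDD[(K, V)] -> PairRDDFunctions
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical generic helper: the ClassTag bounds supply the implicit
// evidence that the conversion to PairRDDFunctions needs, and the
// Ordering on the value type lets us take the maximum per key.
def maxByKey[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)])(implicit ord: Ordering[V]): RDD[(K, V)] =
  rdd.reduceByKey((a, b) => ord.max(a, b))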

Any help on how to resolve this issue?

Upvotes: 0

Views: 3924

Answers (1)

Gilad Ber

Reputation: 126

It seems you are missing an import. Try writing this in the console:

import org.apache.spark.SparkContext._

Then re-run the above commands. This import brings into scope the implicit conversion that lets you call reduceByKey on an RDD of pairs.
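
For example, the full sequence from the question would then look roughly like this in the spark-shell (same file path as in your question):

import org.apache.spark.SparkContext._   // brings the RDD -> PairRDDFunctions conversion into scope

val lines = sc.textFile("/Users/****/Documents/backups/h/*****/input/ncdc/micro-tab/sample.txt")
val records = lines.map(_.split("\t"))
val filters = records.filter(rec => rec(1) != "9999" && rec(2).matches("[01459]"))
val tuples = filters.map(rec => (rec(0).toInt, rec(1).toInt))
val maxTemps = tuples.reduceByKey((a, b) => Math.max(a, b))   // compiles now

Note that from Spark 1.3 onwards these implicits live on the RDD companion object and are picked up automatically, so the explicit import is only needed on older versions (it does no harm on newer ones).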

Upvotes: 2
