Akshat Kumar

Reputation: 69

Spark NullPointerException

My Spark code looks like this:

val logData = sc.textFile("hdfs://localhost:9000/home/akshat/recipes/recipes/simplyrecipes/*/*/*/*")

def doSomething(line: String): (Long, Long) = {
  val numAs = logData.filter(line => line.contains("a")).count();
  val numBs = logData.filter(line => line.contains("b")).count();
  return (numAs, numBs)
}

val mapper = logData.map(doSomething _)

val save = mapper.saveAsTextFile("hdfs://localhost:9000/home/akshat/output3")

mapper is of type org.apache.spark.rdd.RDD[(Long, Long)] = MappedRDD. When I try to perform the saveAsTextFile action, it throws a java.lang.NullPointerException.

What am I doing wrong, and what changes should I make to fix this exception?
Thanks in advance!

Upvotes: 1

Views: 1080

Answers (1)

David Griffin

Reputation: 13927

You should not reference logData from within doSomething — that is the issue. doSomething runs on the executors, and an RDD can only be used from driver code; referencing it inside a function passed to map leaves it null on the workers, hence the NullPointerException. I can't tell exactly what you are trying to do, but if all you are trying to do is count the lines with "a" in them, you don't need the def; just do:

val numAs = logData.filter(line => line.contains("a")).count();
val numBs = logData.filter(line => line.contains("b")).count();

If on the other hand you are trying to count "a" and "b" in each line, and write out a line for every input, then try this:

def doSomething(line: String): (Int, Int) = {
  // Compare Char to Char — ch.equals("a") compares a Char to a String
  // and is always false, so it would count zero matches
  val numAs = line.count(ch => ch == 'a')
  val numBs = line.count(ch => ch == 'b')
  (numAs, numBs)
}
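Because the per-line version touches no RDD, you can sanity-check it locally before wiring it into the job. A minimal sketch (the sample string and the commented driver-side lines are illustrative, not from the question):

```scala
def doSomething(line: String): (Int, Int) = {
  // Count individual characters within this one line only
  val numAs = line.count(ch => ch == 'a')
  val numBs = line.count(ch => ch == 'b')
  (numAs, numBs)
}

// Local check — no SparkContext needed:
println(doSomething("banana bread"))  // (4,2)

// In the actual job it would then plug into the pipeline as before:
// val mapper = logData.map(doSomething _)
// mapper.saveAsTextFile("hdfs://localhost:9000/home/akshat/output3")
```

Keeping doSomething free of any reference to logData is what makes it safe to ship to the executors.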

Upvotes: 5

Related Questions