Parquet file in Spark SQL

Question

I am trying to use Spark SQL with Parquet files. When I try the basic example:

import org.apache.spark.{SparkConf, SparkContext}

object parquet {

  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {

    val sparkConf = new SparkConf().setMaster("local").setAppName("HdfsWordCount")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    // createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
    import sqlContext.createSchemaRDD

    // Parse each "name,age" line of the text file into a Person.
    val people = sc.textFile("C:/Users/pravesh.jain/Desktop/people/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
    // Write the RDD out as Parquet (via the implicit SchemaRDD conversion).
    people.saveAsParquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")

    // Read the Parquet file back in.
    val parquetFile = sqlContext.parquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
  }
}

I get a NullPointerException:

Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.parquet$.main(parquet.scala:16)

Line 16 is the saveAsParquetFile call. What's the issue here?

Pravesh Jain · Accepted Answer

This error occurred when I ran Spark from Eclipse on Windows. I tried the same code in spark-shell and it works fine. My guess is that Spark might not be fully compatible with Windows.
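
One thing that may be worth checking on Windows: Hadoop's local file system code, which Spark uses when writing files, looks for a winutils.exe under the bin folder of the Hadoop home directory, and a missing one is a commonly reported source of NullPointerExceptions on write. The post does not show the full stack trace, so whether that is the cause here is only an assumption. The sketch below is a plain Scala diagnostic with no Spark dependency; WinutilsCheck, hadoopHome and hasWinutils are hypothetical names chosen for illustration.

```scala
import java.io.File

// Hypothetical diagnostic helper; not part of Spark or Hadoop.
object WinutilsCheck {

  // Resolve the Hadoop home directory: the hadoop.home.dir system property
  // takes precedence, then the HADOOP_HOME environment variable.
  // Empty values are treated as unset.
  def hadoopHome(props: Map[String, String], env: Map[String, String]): Option[String] =
    props.get("hadoop.home.dir").filter(_.nonEmpty)
      .orElse(env.get("HADOOP_HOME").filter(_.nonEmpty))

  // True when <home>/bin/winutils.exe exists as a regular file.
  def hasWinutils(home: String): Boolean =
    new File(new File(home, "bin"), "winutils.exe").isFile

  def main(args: Array[String]): Unit = {
    val isWindows = sys.props.getOrElse("os.name", "").toLowerCase.contains("windows")
    if (!isWindows) println("Not running on Windows; winutils.exe is not expected here.")

    hadoopHome(sys.props.toMap, sys.env) match {
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set.")
      case Some(home) if hasWinutils(home) =>
        println(s"Found bin/winutils.exe under $home")
      case Some(home) =>
        println(s"No bin/winutils.exe under $home")
    }
  }
}
```

Running this from the same Eclipse launch configuration and from the shell that starts spark-shell would show whether the two environments see different settings, which could explain why one works and the other does not.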
