Reputation: 327
I am new to Spark development and am trying to build my first Spark 2 (Scala) application with sbt in a Red Hat Linux environment. Below are the environment details.
CDH Version: 5.11.0
Apache Spark2: 2.1.0.cloudera1
Scala Version: 2.11.11
Java Version: 1.7.0_101
Application Code:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object MySample {
  def main(args: Array[String]): Unit = {
    // Note: there is no string interpolator here, so "${system:user.dir}" is taken literally
    val warehouseLocation = "file:${system:user.dir}/spark-warehouse"

    val spark = SparkSession
      .builder()
      .appName("FirstApplication")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .getOrCreate()

    // Explicit schema for the headerless CSV input
    val schPer = StructType(Array(
      StructField("Column1", IntegerType, false),
      StructField("Column2", StringType, true),
      StructField("Column3", StringType, true),
      StructField("Column4", IntegerType, true)
    ))

    val dfPeriod = spark.read.format("csv").option("header", false).schema(schPer).load("/prakash/periodFiles/")
    dfPeriod.write.format("csv").save("/prakash/output/dfPeriod")
  }
}
I get the errors below when compiling with sbt.
$ sbt
[info] Loading project definition from /home/prakash/project
[info] Set current project to my sample (in build file:/home/prakash/)
> compile
[info] Compiling 2 Scala sources to /home/prakash/target/scala-2.11/classes...
[error] /home/prakash/src/main/scala/my_sample.scala:2: object SparkSession is not a member of package org.apache.spark.sql
[error] import org.apache.spark.sql.SparkSession
[error] ^
[error] /home/prakash/src/main/scala/my_sample.scala:3: object types is not a member of package org.apache.spark.sql
[error] import org.apache.spark.sql.types._
[error] ^
[error] /home/prakash/src/main/scala/my_sample.scala:10: not found: value SparkSession
[error] val spark = SparkSession
[error] ^
[error] /home/prakash/src/main/scala/my_sample.scala:16: not found: type StructType
[error] val schPer = new StructType(Array(
[error] ^
..
..
..
[error] 43 errors found
[error] (compile:compileIncremental) Compilation failed
Below is my sbt configuration for the project.
name := "my sample"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
Upvotes: 1
Views: 3113
Reputation: 128071
SparkSession is a part of the spark-sql artifact, so you need this in your build config:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.0"
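For completeness, here is a sketch of what the full build.sbt could look like after that change, assuming you compile against the stock Apache artifacts (spark-sql already depends on spark-core transitively, so the spark-core line is redundant but harmless):

name := "my sample"

version := "1.0"

scalaVersion := "2.11.8"

// spark-core alone does not provide org.apache.spark.sql.SparkSession or
// org.apache.spark.sql.types; those classes live in the spark-sql artifact.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.0"

After editing build.sbt, run sbt clean compile (or reload inside an open sbt session) so the new classpath is picked up. If you later submit the jar with spark-submit to a cluster that already ships Spark, you would typically mark both dependencies as % "provided" so that Spark's classes are not bundled into your artifact.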
Upvotes: 8