Reputation: 792
So far I have used this build.sbt in the local package directory:
name := "spark27_02"
version := "1.0"
scalaVersion := "2.10.4"
sbtVersion := "0.13.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.2.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.2.1"
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.5.0"
I wanted to try out the 1.3.0 version that just came out, so I used the 1.3.0 versions of all the packages. Spark compiles, but Spark SQL does not, so I checked Maven Central, which suggests using
libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "1.3.0"
but it still does not work. I run sbt update from the sbt shell. By the way, I am using Scala 2.10.4.
What silly thing am I doing wrong?
Any help is appreciated.
EDIT: Referring to the example on the Spark webpage, with this build.sbt:
name := "Marzia2"
version := "1.0"
scalaVersion := "2.10.4"
sbtVersion := "0.13.7"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.3.0"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.3.0"
libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "1.3.0"
Running
sbt package
I get
[info] Compiling 1 Scala source to /home/cloudera/IdeaProjects/Marzia2/target/scala-2.10/classes...
[error] /home/cloudera/IdeaProjects/Marzia2/src/main/scala/prova_sql.scala:35: value createSchemaRDD is not a member of org.apache.spark.sql.SQLContext
[error] import sqlContext.createSchemaRDD
[error] ^
[error] /home/cloudera/IdeaProjects/Marzia2/src/main/scala/prova_sql.scala:38: value registerTempTable is not a member of org.apache.spark.rdd.RDD[prova_sql.Person]
[error] people.registerTempTable("people")
[error] ^
[error] two errors found
[error] (compile:compile) Compilation failed
And if I use the new features, such as the implicits, when defining the SQL context, I still get an error about the value not being a member of the Spark SQL context.
There must be some stupid error somewhere.
Upvotes: 3
Views: 3705
Reputation: 67075
One part of the problem is that SchemaRDD became DataFrame. Realistically, you should use
import sqlContext._
instead of the specific import, as it will future-proof you against implicit changes, but if you really want then you can use
import sqlContext.implicits._
BUT, the second part is that 1.3.0 broke compatibility and is now locked in from an API perspective, so you now need to do the following:
rdd.toDF().registerTempTable("xyz")
Note the toDF() call.
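To tie it together, here is a minimal sketch of what a prova_sql.scala like the asker's could look like against the 1.3.0 API. The Person case class, the people.txt path, and the query are assumptions modelled on the Spark SQL programming-guide example, not taken from the original post; it assumes the build.sbt from the question's EDIT.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ProvaSql {
  // Assumed schema, modelled on the Spark SQL programming-guide example
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Marzia2"))
    val sqlContext = new SQLContext(sc)

    // Replaces the old `import sqlContext.createSchemaRDD`
    import sqlContext.implicits._

    val people = sc.textFile("people.txt")   // hypothetical input path
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))

    // In 1.3.0 the conversion is explicit: registerTempTable lives on
    // DataFrame, not on the RDD itself.
    people.toDF().registerTempTable("people")

    val teenagers = sqlContext.sql(
      "SELECT name FROM people WHERE age >= 13 AND age <= 19")
    teenagers.collect().foreach(println)
  }
}

The only changes from a 1.2.x version are the implicits import and the explicit toDF() before registerTempTable.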
Now that the API is locked in, I cannot think of a way to add the more intuitive implicit back in. You would end up with conflicting implicit definitions for the case of import sqlContext._, and nested implicits are not supported in Scala.
From the migration guide:
Additionally, the implicit conversions now only augment RDDs that are composed of Products (i.e., case classes or tuples) with a method toDF, instead of applying automatically.
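For example (a hypothetical snippet; Person and the sample data are illustrative, and sc is assumed to be an existing SparkContext, e.g. in spark-shell):

case class Person(name: String, age: Int)

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

// RDDs of Products (case classes or tuples) gain a toDF method...
val fromCaseClass = sc.parallelize(Seq(Person("Ann", 30))).toDF()
val fromTuple     = sc.parallelize(Seq(("Ann", 30))).toDF()   // columns _1, _2

// ...but nothing happens automatically any more: you must call toDF()
// yourself before using DataFrame methods such as registerTempTable.
fromCaseClass.registerTempTable("people")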
Upvotes: 6