Reputation: 3324
I have three Zeppelin (0.6) paragraphs:
para1:
val hc = new org.apache.spark.sql.hive.HiveContext(sc)
val df = hc.sql("SELECT * FROM tweetsORC")
z.put("wds", df)
para2:
import org.apache.spark.sql.DataFrame
import sqlContext.implicits._
import org.apache.spark.sql.functions._
val df = z.get("wds").asInstanceOf[DataFrame]
df.select(explode($"filtered").as("value")).groupBy("value").count().sort(desc("count")).show(20, false)
df.registerTempTable("top20")
para3:
%sql
select * from top20
This gives the following error:
Table not found: top20
I assume this is because the table is registered in my HiveContext and %sql cannot see it. Some answers to similar problems say that creating a new SQLContext is the cause, but I have not done that. So how can the %sql paragraph access the temp table? Any pointers are greatly appreciated. (I want to use %sql for the nice built-in graphs.)
Upvotes: 0
Views: 1695
Reputation: 36
Interoperability between interpreters is provided only when you use the contexts that Zeppelin provides for you (such as sqlContext). Once you create your own context here:
val hc = new org.apache.spark.sql.hive.HiveContext(sc)
it is not connected in any way to the context used by %sql, and Table not found is the expected error.
Solution: use sqlContext to create and register tables.
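As a minimal sketch of that approach, the first two paragraphs from the question could be merged into one that uses only the sqlContext Zeppelin injects. This assumes the provided sqlContext is Hive-enabled (in Zeppelin this depends on the zeppelin.spark.useHiveContext interpreter property), so that tweetsORC is visible to it. The name top20 for the intermediate value is just illustrative.

```scala
// Zeppelin Spark paragraph. sqlContext is the instance Zeppelin provides;
// it is not created here. Assumption: it is Hive-enabled, so tweetsORC resolves.
import org.apache.spark.sql.functions._
import sqlContext.implicits._

val df = sqlContext.sql("SELECT * FROM tweetsORC")

// Build the word counts and keep the 20 most frequent values.
val top20 = df
  .select(explode($"filtered").as("value"))
  .groupBy("value")
  .count()
  .sort(desc("count"))
  .limit(20)

// Register with the shared context so a %sql paragraph can query it.
top20.registerTempTable("top20")
```

Note that this registers the aggregated result, whereas the code in the question registers the original df under the name top20. With the table registered this way, the %sql paragraph with select * from top20 should be able to find it.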
Upvotes: 2