Reputation: 101
I'm using pyspark (1.6.1) in local mode. I have a DataFrame loaded from a CSV file, and I need to add a dense_rank() column. I understand that SQLContext doesn't support window functions, but HiveContext does.
hiveContext = HiveContext(sc)
df.registerTempTable("visits")
visit_number = hiveContext.sql(
    "select store_number, "
    "dense_rank() over(partition by store_number order by visit_date) visit_number "
    "from visits")
I'm getting the error: AnalysisException: u'Table not found: visits;'
after the warning: WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
After reading previous questions, I tried changing the ConnectionURL in conf/hive_defaults.xml to point to the exact location of the Hive directory, with no success.
Any ideas on this issue?
Thanks!
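For reference, dense_rank() assigns consecutive ranks within each partition, ordered by the ordering key, with ties sharing a rank and no gaps afterwards. A minimal pure-Python sketch of what the query above computes (the visit rows are hypothetical sample data):

```python
from itertools import groupby
from operator import itemgetter

def dense_rank(rows, partition_key, order_key):
    """Emulate dense_rank() over (partition by partition_key order by order_key)."""
    ranked = []
    rows = sorted(rows, key=itemgetter(partition_key, order_key))
    for _, group in groupby(rows, key=itemgetter(partition_key)):
        rank, last = 0, object()
        for row in group:
            if row[order_key] != last:  # ties keep the same rank; no gaps after ties
                rank += 1
                last = row[order_key]
            ranked.append(dict(row, visit_number=rank))
    return ranked

visits = [
    {"store_number": 1, "visit_date": "2016-01-03"},
    {"store_number": 1, "visit_date": "2016-01-01"},
    {"store_number": 2, "visit_date": "2016-01-02"},
]
result = dense_rank(visits, "store_number", "visit_date")
# each store's visits are numbered 1, 2, ... in date order
```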
Upvotes: 2
Views: 2550
Reputation: 101
Result: after deleting the SQLContext and working only with the HiveContext, everything works fine. A temp table registered on one context isn't visible from another, so the DataFrame has to be registered on the same HiveContext that runs the window query.
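A minimal sketch of that fix, assuming Spark 1.6 in local mode with the spark-csv package available (the file path is hypothetical): load and register the table through the HiveContext itself, so the window query can find it.

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext("local", "visits-rank")
    hiveContext = HiveContext(sc)  # use only this context; no separate SQLContext

    # Load the CSV via the HiveContext (spark-csv data source assumed for 1.6)
    df = (hiveContext.read
          .format("com.databricks.spark.csv")
          .options(header="true", inferSchema="true")
          .load("visits.csv"))  # hypothetical path

    # Register on the same context that will run the SQL
    df.registerTempTable("visits")

    visit_number = hiveContext.sql(
        "select store_number, "
        "dense_rank() over(partition by store_number order by visit_date) visit_number "
        "from visits")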
Upvotes: 1
Reputation: 3108
You should create the DataFrame before calling registerTempTable:
# in SparkR, source is the data source format, not the file name
MyDataFrame <- read.df(sqlContext, CsvPath, source = "com.databricks.spark.csv", header = "true")
and after that:
registerTempTable(MyDataFrame, "visits")
Upvotes: 0