user3318421

Reputation: 101

hive context doesn't recognize temp table in pyspark - AnalysisException: 'Table not found'

I'm using pyspark (1.6.1) in local mode. I have a dataframe loaded from a CSV file, and I need to add a dense_rank() column to it. I understand that SQLContext doesn't support window functions, but HiveContext does.

hiveContext = HiveContext(sc)
df.registerTempTable("visits")
visit_number = hiveContext.sql("select store_number, "
                               "dense_rank() over(partition by store_number order by visit_date) visit_number "
                               "from visits")
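For reference, `dense_rank() over (partition by store_number order by visit_date)` assigns consecutive ranks (no gaps; ties share a rank) within each store. A plain-Python sketch of those semantics, using hypothetical sample rows in place of the CSV data:

```python
from itertools import groupby

# Hypothetical sample rows standing in for the CSV file.
rows = [
    {"store_number": 1, "visit_date": "2016-01-05"},
    {"store_number": 1, "visit_date": "2016-01-05"},  # tie: shares a rank
    {"store_number": 1, "visit_date": "2016-02-01"},
    {"store_number": 2, "visit_date": "2016-01-10"},
]

def dense_rank(rows, partition_key, order_key):
    """Mimic SQL dense_rank(): consecutive ranks within each partition."""
    ranked = []
    rows = sorted(rows, key=lambda r: (r[partition_key], r[order_key]))
    for _, group in groupby(rows, key=lambda r: r[partition_key]):
        rank, prev = 0, object()
        for row in group:
            if row[order_key] != prev:  # only advance rank on a new value
                rank += 1
                prev = row[order_key]
            ranked.append({**row, "visit_number": rank})
    return ranked

for r in dense_rank(rows, "store_number", "visit_date"):
    print(r["store_number"], r["visit_date"], r["visit_number"])
# store 1 gets ranks 1, 1, 2; store 2 restarts at 1
```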

I'm getting the error: AnalysisException: u'Table not found: visits;

after the warning: WARN ObjectStore: Failed to get database default, returning NoSuchObjectException

After reading previous questions, I've tried changing the ConnectionURL in conf/hive_defaults.xml to point to the exact location of the Hive directory, with no success.

Any ideas on this issue?

Thanks!

Upvotes: 2

Views: 2550

Answers (2)

user3318421

Reputation: 101

Result: after deleting the SQLContext and working only with the HiveContext, everything works fine. Temporary tables are registered in the catalog of the context that created the DataFrame, so a table registered through one context isn't visible from another.
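A toy illustration of that scoping (not Spark code; the dict-based "contexts" are hypothetical) showing why a table registered via one context raises a lookup error when queried through another:

```python
# Toy model of Spark 1.6 behavior: each context keeps its own
# temp-table registry, so temp tables don't cross context boundaries.
class ToyContext:
    def __init__(self):
        self.temp_tables = {}  # per-context catalog

    def register(self, name, df):
        self.temp_tables[name] = df

    def lookup(self, name):
        if name not in self.temp_tables:
            # stands in for AnalysisException: 'Table not found'
            raise KeyError("Table not found: %s" % name)
        return self.temp_tables[name]

sql_ctx, hive_ctx = ToyContext(), ToyContext()
sql_ctx.register("visits", ["row1", "row2"])   # df.registerTempTable("visits")

try:
    hive_ctx.lookup("visits")                  # hiveContext.sql(...) in the question
except KeyError as e:
    print(e)                                   # the 'Table not found' situation

hive_ctx.register("visits", ["row1", "row2"])  # fix: use only the HiveContext
print(hive_ctx.lookup("visits"))
```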

Upvotes: 1

Adam Silenko

Reputation: 3108

You should create the DataFrame before calling registerTempTable:

MyDataFrame <- read.df(sqlContext, CsvPath, source = "com.databricks.spark.csv", header = "true")

and after that:

registerTempTable(MyDataFrame, "visits")

Upvotes: 0
