Reputation: 7951
I'm using Spark 1.4.0 (PySpark). I have a DataFrame loaded from a Hive table using this query:
sqlContext = HiveContext(sc)
table1_contents = sqlContext.sql("SELECT * FROM my_db.table1")
When I attempt to insert data from table1_contents, after some transformations, into table2 using the DataFrameWriter#insertInto function:
sqlContext.createDataFrame(transformed_data_from_table1).write.insertInto('my_db.table2')
I encounter this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o364.insertInto.
: org.apache.spark.sql.AnalysisException: no such table my_db.table2;
I know my table exists because when I type:
print sqlContext.tableNames('my_db')
table1 and table2 are displayed. Can anyone help with this issue?
Upvotes: 0
Views: 6511
Reputation: 11
Hi, I don't know if you have solved the problem yet. At work I ran into a similar issue and solved it. My Spark version is 1.4.0, so I don't think this is a bug in the program @Ton Torres. The problem is that you used sqlContext instead of hiveContext. When you need to work with Hive, you should use hiveContext to create the DataFrame, like this:
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val dfResult = hiveContext.createDataFrame(temp, structType)
hiveContext.sql("use default")
dfResult.write.insertInto("tablename")
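Since the question is in PySpark, here is a rough PySpark sketch of the same idea, reusing the my_db / table2 names from the question; temp and struct_type are placeholders for an RDD of Rows and its matching schema:
from pyspark.sql import HiveContext
# HiveContext is needed so Spark SQL can see tables in the Hive metastore.
hive_context = HiveContext(sc)
# temp and struct_type are placeholders: an RDD of Rows and the matching schema.
df_result = hive_context.createDataFrame(temp, struct_type)
# Switch to the database that owns the target table, then insert by unqualified name.
hive_context.sql("USE my_db")
df_result.write.insertInto("table2")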
Good luck!
Upvotes: 1
Reputation: 1519
This is a reported bug. Apparently, the issue is resolved only in the upcoming version 1.6.0.
As a workaround you can do what you said, or go with the default database as mentioned by @guoxian. You could also try out version 1.6.0-SNAPSHOT.
EDIT: The JIRA issue I linked is for the Spark Scala version, so I can't say whether this issue is fixed in PySpark 1.6.0. Sorry for the confusion.
Upvotes: 1
Reputation: 31
I had a similar issue. It looks like the insertInto function might have a bug when writing to a non-default database. After I changed the target table to the default database, it worked fine.
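For illustration, a minimal PySpark sketch of that workaround, assuming (hypothetically) that a table2 also exists in the default database:
# An unqualified table name in the default database sidesteps the
# database-qualified lookup that raises the AnalysisException.
sqlContext.createDataFrame(transformed_data_from_table1).write.insertInto('table2')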
Upvotes: 3
Reputation: 7951
I was not able to get
sqlContext.createDataFrame(transformed_data_from_table1).write.insertInto('my_db.table2')
working. However, it seems Spark SQL supports INSERT statements as strings:
sqlContext.sql("INSERT INTO TABLE my_db.table2...")
and this one works.
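For completeness, one way to spell out the full statement is to register the transformed DataFrame as a temporary table and insert from it; the name temp_table2 below is just an illustration:
df = sqlContext.createDataFrame(transformed_data_from_table1)
# Register the DataFrame so the SQL string can reference it.
df.registerTempTable("temp_table2")
# With a HiveContext, INSERT accepts a database-qualified target table.
sqlContext.sql("INSERT INTO TABLE my_db.table2 SELECT * FROM temp_table2")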
I still look forward to my original approach working as asked in the question (hopefully in a future version of Spark, if this is indeed a bug).
Upvotes: 0