oikonomiyaki

Reputation: 7951

Why does insertInto fail when working with tables in non-default database?

I'm using Spark 1.4.0 (PySpark). I have a DataFrame loaded from a Hive table using this query:

sqlContext = HiveContext(sc)
table1_contents = sqlContext.sql("SELECT * FROM my_db.table1")

When I attempt to insert the data from table1_contents, after some transformations, into table2 using the DataFrameWriter#insertInto function:

sqlContext.createDataFrame(transformed_data_from_table1).write.insertInto('my_db.table2')

I encounter this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o364.insertInto.
: org.apache.spark.sql.AnalysisException: no such table my_db.table2;

I know the table exists, because when I run:

print sqlContext.tableNames('my_db')

both table1 and table2 are displayed. Can anyone help with this issue?

Upvotes: 0

Views: 6511

Answers (4)

YingLong Sun

Reputation: 11

Hi, I don't know whether you have solved the problem yet. I ran into a similar issue in my work and solved it. My Spark version is 1.4.0, so I don't think this is a bug in Spark, @Ton Torres. The problem is that you used sqlContext instead of hiveContext. When you need to work with Hive, you should use a HiveContext to create the DataFrame, like this:

    // Use a HiveContext so Hive databases and tables are visible
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val dfResult = hiveContext.createDataFrame(temp, structType)
    // Switch to the target database before inserting
    hiveContext.sql("use default")
    dfResult.write.insertInto("tablename")
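For the PySpark case in the question, the equivalent would be something like this (a minimal sketch; it reuses the names from the question, and the key step is the USE statement that switches the current database before inserting by unqualified table name):

    from pyspark.sql import HiveContext

    # Create a HiveContext so Hive databases and tables are visible
    hiveContext = HiveContext(sc)
    df = hiveContext.createDataFrame(transformed_data_from_table1)
    # Switch to the target database, then insert using the unqualified table name
    hiveContext.sql("USE my_db")
    df.write.insertInto("table2")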

Good luck!

Upvotes: 1

Ton Torres

Reputation: 1519

This is a reported bug. Apparently, the issue is resolved only in the upcoming version 1.6.0.

As a workaround you can do what you said, or go with the default database as mentioned by @guoxian. You could also try out version 1.6.0-SNAPSHOT.

EDIT: The JIRA issue I linked is for the Scala version of Spark, so I can't say whether this issue is fixed in PySpark 1.6.0. Sorry for the confusion.

Upvotes: 1

guoxian

Reputation: 31

I had a similar issue. It looks like the insertInto function might have a bug when writing to a non-default database. After I changed the target table to the default database, it worked fine (see the sketch below).
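Concretely, something like this (a sketch using the names from the question; it assumes table2 has been created in the default database):

    # Writing into a table in the default database sidesteps the issue
    df = sqlContext.createDataFrame(transformed_data_from_table1)
    df.write.insertInto("table2")  # table2 lives in the default database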

Upvotes: 3

oikonomiyaki

Reputation: 7951

I was not able to get

sqlContext.createDataFrame(transformed_data_from_table1).write.insertInto('my_db.table2')

to work. However, it seems Spark SQL supports INSERT statements passed as plain SQL strings:

sqlContext.sql("INSERT INTO TABLE my_db.table2...");

and this one works.
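For completeness, the full workaround might look like this (a sketch; tmp_transformed is a hypothetical temporary table name, and the schema of table2 is assumed to match the SELECT):

    # Register the transformed data as a temporary table, then insert via SQL
    df = sqlContext.createDataFrame(transformed_data_from_table1)
    df.registerTempTable("tmp_transformed")
    sqlContext.sql("INSERT INTO TABLE my_db.table2 SELECT * FROM tmp_transformed")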

That said, I still look forward to the time when my original approach works (hopefully in a future version of Spark, if this is indeed a bug).

Upvotes: 0
