Abhishek Shenoy

Reputation: 95

Spark SQL 1.6: "key not found" when there is a case mismatch between field names in a DataFrame and a Hive table

Consider a Hive table that is partitioned:

create table hivetbl(a int, b int) partitioned by(c int);

Now, if we try to insert into the table through a Spark DataFrame:

Seq((1, 2, 3)).toDF("A", "B","C").write.partitionBy("C").insertInto("hivetbl");

it throws:

Caused by: java.util.NoSuchElementException: key not found: c

Whereas, if I change the column names of the DataFrame to

Seq((1, 2, 3)).toDF("a", "b", "c").write.partitionBy("c").insertInto("hivetbl");

Data gets loaded into the table.

Shouldn't Spark handle this case mismatch between the DataFrame and the Hive table, given that Hive is not case sensitive?

Upvotes: 0

Views: 725

Answers (2)

eliasah

Reputation: 40370

Actually, there has been a lot of discussion about case sensitivity, but since Spark 1.5 (if I'm not mistaken) this is configurable.

You can change the Spark SQL configuration for case sensitivity using:

sqlContext.sql("set spark.sql.caseSensitive=false")

The reason it works this way is that SQLContext deals with many types of data sources; for some of them case sensitivity makes sense, while for others it doesn't.
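
Applied to the scenario from the question, this would look something like the sketch below. It assumes a Spark 1.6 spark-shell session where sqlContext is a HiveContext (so the implicits for toDF are already imported) and the hivetbl table from the question exists:

// Disable case sensitivity in the Spark SQL analyzer, then retry the
// insert from the question with the original upper-case column names.
sqlContext.sql("set spark.sql.caseSensitive=false")

Seq((1, 2, 3)).toDF("A", "B", "C")
  .write
  .partitionBy("C")
  .insertInto("hivetbl")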

Upvotes: 1

Idan Fischman

Reputation: 57

I think the error is happening inside Spark. Spark is case-sensitive, so in its own interpreter you have to take care of that yourself, even if case is irrelevant to other applications in your Hadoop ecosystem.
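
For instance, one way to take care of it on the Spark side is to normalize the DataFrame's column names to lower case before writing. This is a minimal sketch, not the only approach; df and lowered are hypothetical names:

// Rename every column to its lower-case form so the names match the
// Hive table's metadata, then perform the partitioned insert.
val df = Seq((1, 2, 3)).toDF("A", "B", "C")
val lowered = df.toDF(df.columns.map(_.toLowerCase): _*)

lowered.write.partitionBy("c").insertInto("hivetbl")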

Upvotes: 0
