Reputation: 95
Consider a Hive table that is partitioned:
create table hivetbl(a int, b int) partitioned by (c int);
Now, if we try to insert into the table through a Spark DataFrame:
Seq((1, 2, 3)).toDF("A", "B", "C").write.partitionBy("C").insertInto("hivetbl");
It throws:
Caused by: java.util.NoSuchElementException: key not found: c
Whereas, if I change the structure of the DataFrame to
Seq((1, 2, 3)).toDF("a", "b", "c").write.partitionBy("c").insertInto("hivetbl");
Data gets loaded into the table.
Shouldn't Spark handle this case mismatch between the DataFrame and the Hive table, given that Hive is not case sensitive?
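A generic way to avoid the mismatch is to lowercase the DataFrame column names before calling insertInto, so they match the names the Hive metastore stores; a minimal sketch, assuming the same implicits import and the hivetbl table from above:

import org.apache.spark.sql.DataFrame

// Rename every column to its lowercase form so it matches the
// lowercase names registered in the Hive metastore.
def lowerCaseColumns(df: DataFrame): DataFrame =
  df.toDF(df.columns.map(_.toLowerCase): _*)

val df = Seq((1, 2, 3)).toDF("A", "B", "C")
lowerCaseColumns(df).write.partitionBy("c").insertInto("hivetbl")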
Upvotes: 0
Views: 725
Reputation: 40370
There has actually been a lot of discussion about case sensitivity, but since Spark 1.5 (if I'm not mistaken) it is configurable.
You can change the Spark SQL configuration for case sensitivity using:
sqlContext.sql("set spark.sql.caseSensitive=false")
The reason it behaves this way is that SQLContext deals with many types of data sources, and while case sensitivity makes sense for some of them, for others it doesn't.
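For Spark 2.x and later, the same flag can be set on the SparkSession instead of through SQLContext; a minimal sketch, assuming a Hive-enabled session:

import org.apache.spark.sql.SparkSession

// Set the flag when the session is created
val spark = SparkSession.builder()
  .enableHiveSupport()
  .config("spark.sql.caseSensitive", "false")
  .getOrCreate()

// It can also be changed on a running session
spark.conf.set("spark.sql.caseSensitive", "false")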
Upvotes: 1
Reputation: 57
I think the error is happening inside Spark. Spark is case-sensitive, so within its own interpreter you have to take care of that yourself, even if it is irrelevant to other applications in your Hadoop system.
Upvotes: 0