Reputation: 53
I am trying to create Hive tables as the output of my Spark (version 1.5.1) job on a Hadoop cluster (BigInsights 4.1 distribution) and am running into permission issues. My guess is that Spark uses a default user (in this case 'yarn', not the job submitter's username) to create the tables and therefore fails.
I tried customizing the hive-site.xml file to set an authenticated user that has permission to create Hive tables, but that didn't work.
I also tried setting the Hadoop user variable to an authenticated user, but that didn't work either.
I want to avoid saving text files and then creating Hive tables from them, both to improve performance and to reduce the size of the output through ORC compression.
My questions are:
Thanks. Hatak!
Upvotes: 1
Views: 1500
Reputation: 151
Assuming df is a DataFrame holding your data, you can write:
In Java:
df.write().saveAsTable("tableName");
You can choose a different SaveMode, such as Overwrite or Append:
df.write().mode(SaveMode.Append).saveAsTable("tableName");
In Scala:
df.write.mode(SaveMode.Append).saveAsTable("tableName")
Many other options can be specified depending on the output format you want to save, for example text, ORC (with buckets), or JSON.
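Since the question asks for ORC output, here is a minimal Scala sketch of how the format could be selected on the writer. It assumes Spark 1.x built with Hive support, where Hive tables and the ORC data source go through a HiveContext rather than a plain SQLContext. The object name, the sample rows, and the column names are hypothetical placeholders, not taken from the original job, and this sketch does not address the permission problem itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

// Hypothetical driver object, for illustration only.
object SaveAsOrcTable {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SaveAsOrcTable"))

    // Assumption: Spark 1.x, where a HiveContext is needed to reach the
    // Hive metastore and to use the ORC data source.
    val sqlContext = new HiveContext(sc)

    // Hypothetical sample data standing in for the job's real output.
    val df = sqlContext
      .createDataFrame(Seq((1, "a"), (2, "b")))
      .toDF("id", "label")

    df.write
      .format("orc")            // store the table data as ORC files
      .mode(SaveMode.Append)    // append if the table already exists
      .saveAsTable("tableName") // register the table in the metastore

    sc.stop()
  }
}
```

The table is created by whichever user the Spark application runs as, so that user still needs write access to the Hive warehouse directory.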
Upvotes: 0