Hatak

Reputation: 53

Create hive table through spark job

I am trying to create Hive tables as outputs of my Spark (version 1.5.1) job on a Hadoop cluster (BigInsights 4.1 distribution) and am facing permission issues. My guess is that Spark uses a default user (in this case 'yarn', not the job submitter's username) to create the tables and therefore fails to do so.

I tried to customize the hive-site.xml file to set an authenticated user that has permissions to create hive tables, but that didn't work.

I also tried to set the Hadoop user variable to an authenticated user, but that didn't work either.
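For reference, one common way to set that variable is to define it as a system property before the SparkContext is created; the variable name `HADOOP_USER_NAME` is the usual one, and the user `"hive"` here is only an assumed example of an authenticated account:

```scala
// Hedged sketch: force the Hadoop client user before any Spark/Hadoop code runs.
// "hive" is an assumed user name with permission to create Hive tables.
object SetHadoopUser {
  def main(args: Array[String]): Unit = {
    // Must be set before the SparkContext / HiveContext is instantiated,
    // otherwise the default submitting user (e.g. 'yarn') is picked up.
    System.setProperty("HADOOP_USER_NAME", "hive")
    println(System.getProperty("HADOOP_USER_NAME"))
  }
}
```

Equivalently, `export HADOOP_USER_NAME=hive` in the shell that launches `spark-submit` has the same effect.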

I want to avoid saving text files and then creating Hive tables from them, both to optimize performance and to reduce the size of the outputs through ORC compression.

My questions are:

Thanks. Hatak!

Upvotes: 1

Views: 1500

Answers (1)

Vinay Limbare

Reputation: 151

Assuming df holds your data, you can write:

In Java:

df.write().saveAsTable("tableName");

You can use different SaveModes, such as Overwrite and Append:

df.write().mode(SaveMode.Append).saveAsTable("tableName");

In Scala:

df.write.mode(SaveMode.Append).saveAsTable(tableName)

Many other options can be specified depending on the format you would like to save to: text, ORC (with buckets), JSON, etc.
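For the ORC case the asker mentions, a hedged sketch in Scala (the table name is an assumption, and this fragment presumes a DataFrame `df` created from a HiveContext, as required for `saveAsTable` in Spark 1.x):

```scala
// Sketch: persist df as a Hive table stored in ORC format.
// In Spark 1.x the storage format is selected with .format(...) before saveAsTable.
df.write
  .format("orc")
  .mode(SaveMode.Overwrite)
  .saveAsTable("tableName")
```

Because ORC is a columnar, compressed format, this typically yields much smaller outputs than writing text files and loading them into Hive afterwards.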

Upvotes: 0
