How to save data to HDFS from Spark as a specific user

Question

I would like to save a file to HDFS from Spark. I tried the following line:

df.write.format("com.databricks.spark.csv").save(s"hdfs://hdp.asier.es:8020/assetgroup/$index/1-20170131")

But it throws the following error:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=agomez, access=WRITE, inode="/assetgroup/1/1-20170131/_temporary/0":hdfs:hdfs:drwxr-xr-x

The write evidently fails because Spark connects to HDFS as the user agomez, who has no write permission on a directory owned by hdfs:hdfs with mode drwxr-xr-x. How can I configure Spark to use another user that has the required permission?

Asier Gomez · Accepted Answer

I solved it by defining the Hadoop user name in an environment variable, set in the environment that launches the Spark job:

HADOOP_USER_NAME=sparkload
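
If changing the launch environment is not convenient, the same name can probably be supplied from inside the application. The Hadoop client, when using simple (non-Kerberos) authentication, falls back to a JVM system property called HADOOP_USER_NAME if the environment variable is not set. The following is only a sketch under some assumptions: Spark 2.x or later (for SparkSession), simple authentication on the cluster, and a placeholder DataFrame and index value that are not from the original question. The object name SaveAsUser is hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver program; only the user name, format and HDFS path
// come from the question and answer above.
object SaveAsUser {
  def main(args: Array[String]): Unit = {
    // Must run before anything touches the Hadoop FileSystem API, because
    // the Hadoop login user is resolved once and then cached.
    // Only honoured with simple authentication, not with Kerberos.
    System.setProperty("HADOOP_USER_NAME", "sparkload")

    val spark = SparkSession.builder().appName("save-as-user").getOrCreate()
    import spark.implicits._

    // Placeholder data and index, assumed for illustration only.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
    val index = 1

    // Same write as in the question, now issued as user sparkload.
    df.write
      .format("com.databricks.spark.csv")
      .save(s"hdfs://hdp.asier.es:8020/assetgroup/$index/1-20170131")

    spark.stop()
  }
}
```

Either way, the user sparkload still needs write permission on the target directory. On a Kerberos-secured cluster neither the environment variable nor the system property changes the effective user.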
