Asier Gomez

Reputation: 6588

How to save Data in HDFS from Spark specifying a user

I would like to save a file to HDFS from Spark. I tried the following line:

df.write.format("com.databricks.spark.csv").save(s"hdfs://hdp.asier.es:8020/assetgroup/$index/1-20170131")

But it throws the following error:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=agomez, access=WRITE, inode="/assetgroup/1/1-20170131/_temporary/0":hdfs:hdfs:drwxr-xr-x

The problem is evidently that it connects as the user agomez. How can I configure Spark to use another user that has the required permissions?

Upvotes: 1

Views: 3113

Answers (2)

Asier Gomez

Reputation: 6588

I solved it by setting the Hadoop user name in an environment variable:

HADOOP_USER_NAME=sparkload
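
As a minimal sketch of how this could be applied, the variable can be set for a single command instead of the whole session. The helper name run_as_hadoop_user and the class and jar names in the usage comment are hypothetical, not part of the original setup. This assumes the cluster uses simple authentication; with Kerberos enabled, HADOOP_USER_NAME is ignored.

```shell
#!/bin/sh
# Run a single command with HADOOP_USER_NAME set only for that command,
# so the variable does not leak into the rest of the shell session.
# Note: this only takes effect with simple (non-Kerberos) authentication.
run_as_hadoop_user() {
  hadoop_user="$1"   # HDFS user to act as, e.g. sparkload
  shift              # the remaining arguments are the command to run
  env HADOOP_USER_NAME="$hadoop_user" "$@"
}

# Hypothetical usage (class and jar names are placeholders):
#   run_as_hadoop_user sparkload spark-submit --class com.example.SaveCsv save-csv.jar
```

Scoping the variable to one command keeps other Hadoop tools in the same shell running as the normal login user.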

Upvotes: 2

Sahil Desai

Reputation: 3696

You need to change the access privileges on the HDFS directory /assetgroup, after logging in as the user hdfs, from the command line:

hdfs dfs -chmod -R 755 /assetgroup

or you can give permission to your user:

hadoop fs -chown -R user:agomez /assetgroup

Upvotes: 0
