Reputation: 3
I have installed Hadoop/YARN in a Linux VM on my local Windows machine. On the same Windows machine (not in the VM) I have installed Spark. When running Spark on Windows, I can read files stored in HDFS (in the Linux VM):
val lines = sc.textFile("hdfs://MyIP:9000/Data/sample.txt")
But while saving a file to HDFS with saveAsTextFile("hdfs://MyIP:9000/Data/Output"), I get the error below:
org.apache.hadoop.security.AccessControlException: Permission denied: user=LocalWindowsUser, access=WRITE, inode="/Data":hadoop:supergroup:drwxr-xr-x.
I guess this is because the Windows and Linux users are different, and the Windows user doesn't have permission to write files in HDFS.
What is the correct way to store files from Windows to HDFS (Linux VM) using Spark?
Upvotes: 0
Views: 448
Reputation: 81454
Your problem is that the username you are using to access HDFS does not have write permission. The directory /Data has permissions rwxr-xr-x, i.e. mode 755. Your username LocalWindowsUser is neither the owner (hadoop) nor in the group (supergroup), so it falls into the "other" class, which has only read and execute permissions.
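If you want to confirm this from the Windows side, here is a minimal sketch using the Hadoop FileSystem API (the hdfs://MyIP:9000 address is taken from your question; the object name is just for illustration):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import java.net.URI

object CheckPermissions {
  def main(args: Array[String]): Unit = {
    // Connect to the namenode from the question; replace MyIP with the VM's address.
    val fs = FileSystem.get(new URI("hdfs://MyIP:9000"), new Configuration())
    val status = fs.getFileStatus(new Path("/Data"))
    // Prints something like: hadoop supergroup rwxr-xr-x
    println(s"${status.getOwner} ${status.getGroup} ${status.getPermission}")
  }
}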
Possible solutions:
Solution 1: Since this is a local system under your full control, change the permissions to allow everyone access. Execute this command inside the VM as the user hadoop:
hdfs dfs -chmod -R 777 /Data
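The same change can be made programmatically with the FileSystem API, but only the directory owner (hadoop) or the HDFS superuser may change permissions, so this sketch assumes you are connecting as one of those (for example via Solution 2 below). Also note that, unlike -chmod -R, setPermission is not recursive:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.FsPermission
import java.net.URI

object OpenUpData {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new URI("hdfs://MyIP:9000"), new Configuration())
    // 777 in octal; equivalent to `hdfs dfs -chmod 777 /Data`.
    // This call only succeeds for the owner or the HDFS superuser.
    fs.setPermission(new Path("/Data"), new FsPermission(Integer.parseInt("777", 8).toShort))
  }
}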
Solution 2: Create an environment variable in Windows and set the username:
set HADOOP_USER_NAME=hadoop
Depending on your setup, the username may really need to be hdfs; try that as well if necessary.
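As far as I know, the Hadoop client also reads HADOOP_USER_NAME as a JVM system property, so instead of a machine-wide environment variable you can set it in code before the first HDFS access. A sketch (the app name, master, and output path are illustrative; MyIP comes from your question):

import org.apache.spark.{SparkConf, SparkContext}

object WriteAsHadoopUser {
  def main(args: Array[String]): Unit = {
    // Must run before any HDFS access, or the Windows login name
    // will already have been picked up by the Hadoop client.
    System.setProperty("HADOOP_USER_NAME", "hadoop")

    val sc = new SparkContext(new SparkConf().setAppName("hdfs-write").setMaster("local[*]"))
    val lines = sc.textFile("hdfs://MyIP:9000/Data/sample.txt")
    // saveAsTextFile fails if /Data/Output already exists, so pick a fresh directory.
    lines.saveAsTextFile("hdfs://MyIP:9000/Data/Output")
    sc.stop()
  }
}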
Upvotes: 1