Hello lad

Reputation: 18790

Permission denied when starting the Spark command line on an AWS EMR cluster

I launched a cluster with 2 machines (1 master, 1 core) on the AWS EMR service with one key pair.

I then logged into the master instance over SSH, providing the created .pem file.

That succeeded!

Then I tried to run spark-shell or pyspark on the master instance and got the following error:

Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied:   user=ec2-user, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6446)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4248)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4218)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:635)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

Upvotes: 9

Views: 6180

Answers (2)

James k

Reputation: 342

To solve this issue you don't always have to SSH in as the hadoop user. The shell is trying to access the current user's home directory on HDFS.

Running the following terminal commands as the hadoop user (e.g. via su) then allowed me to use spark-shell as my normal user:

hdfs dfs -mkdir /user/myuser
hdfs dfs -chown myuser:hadoop /user/myuser

(Replace myuser with the user you want to run the shell as.)
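For example, you can run them through sudo instead of a full su session. A minimal sketch, assuming your login user is ec2-user (hdfs, the owner of /user in the error above, is the HDFS superuser, so commands run as hdfs bypass the permission check):

sudo -u hdfs hdfs dfs -mkdir /user/ec2-user
sudo -u hdfs hdfs dfs -chown ec2-user:hadoop /user/ec2-user
hdfs dfs -ls /user    # verify: /user/ec2-user should now be owned by ec2-user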

Upvotes: 1

Hello lad

Reputation: 18790

Solved it myself.

Logging in over SSH as ec2-user succeeds, but it causes the permission error when starting Spark.

Logging in over SSH as the hadoop user solves the problem.
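For example (the .pem file name and master public DNS below are placeholders; substitute your own):

ssh -i mykey.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com
spark-shell    # now starts without the AccessControlException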

Upvotes: 20
