Reputation: 51
We are running the latest versions of Hive and Impala. Impala authenticates via LDAP and is authorized through Sentry; Hive access is not yet authorized via Sentry. We create tables from Impala, and /user/hive/warehouse is group-owned by the "hive" group, so the warehouse directories end up with ownership impala:hive.
drwxrwx--T - impala hive 0 2015-08-24 21:16 /user/hive/warehouse/test1.db
drwxrwx--T - impala hive 0 2015-08-11 17:12 /user/hive/warehouse/test1.db/events_test_venus
As shown above, these directories are owned by user "impala" and group "hive", and are group-writable. The "hive" group also contains a user named "hive":
[root@server ~]# groups hive
hive : hive impala data
[root@server ~]# grep hive /etc/group
hive:x:486:impala,hive,flasun,testuser,fastlane
But when I query the table backed by that directory as the "hive" user, I get an access error:
[root@jupiter fastlane]# sudo -u hive hive
hive> select * from test1.events_test limit 1;
FAILED: SemanticException Unable to determine if hdfs://mycluster/user/hive/warehouse/test1.db/events_test_venus is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/user/hive/warehouse/test1.db":impala:hive:drwxrwx--T
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6599)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6581)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6506)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:9141)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1582)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEZForPath(AuthorizationProviderProxyClientProtocol.java:926)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1343)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
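For context, the denial above is the EXECUTE check the NameNode performs on each ancestor directory while traversing the path. A minimal sketch of that POSIX-style check (`check_access` is a hypothetical helper for illustration, not Hadoop source):

```python
# Sketch (hypothetical helper, not the Hadoop implementation) of the
# POSIX-style check the NameNode applies per path component. EXECUTE on
# every ancestor directory is required just to traverse to the table data.

def check_access(user, user_groups, owner, group, mode, want):
    """mode is an octal int like 0o770; want is one of 'r', 'w', 'x'."""
    bit = {'r': 4, 'w': 2, 'x': 1}[want]
    if user == owner:
        perms = (mode >> 6) & 7   # owner bits
    elif group in user_groups:
        perms = (mode >> 3) & 7   # group bits
    else:
        perms = mode & 7          # other bits
    return bool(perms & bit)

# user=hive in groups {hive, impala, data}, dir impala:hive drwxrwx--T:
print(check_access('hive', {'hive', 'impala', 'data'},
                   'impala', 'hive', 0o770, 'x'))  # True
```

If membership matched, this check would pass; note that the NameNode resolves group membership on its own side, so what matters is the mapping the NameNode sees, not the client's /etc/group.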
Any ideas how to resolve this? Essentially, we expected that granting group-level read and write permissions would let any user in the group create and use tables created by the directory owner, but that does not seem to work. Is it because Impala alone has Sentry authorization (which uses user impersonation) while stand-alone Hive does not?
Can someone please guide or confirm?
Thanks
Upvotes: 0
Views: 4093
Reputation: 2536
You can set the HDFS umask to 000 and restart the cluster. All directories and files created after this change will then be created with permissions 777. Afterwards, apply proper ownership and permissions to the relevant directories so that other paths are not left open. Setting the umask to 000 does not change the permissions of existing directories; only newly created files and directories are affected. If you are using Cloudera Manager, this change is very easy to make.
NB: a umask of 000 gives all new files/directories a default permission of 777, i.e. fully open. Mitigate this by applying permissions and ACLs at the parent directory level.
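The arithmetic behind the umask suggestion can be sketched as follows (a minimal Python illustration; `hdfs_mode` is a hypothetical helper, not a Hadoop API — it assumes new directories start from 0o777 and new files from 0o666 before the umask is applied, which is the usual default):

```python
# Sketch: how a umask determines the mode of newly created inodes.
# The umask bits are simply masked off the base permission.

def hdfs_mode(base: int, umask: int) -> str:
    """Return the resulting octal mode string for a new inode."""
    return oct(base & ~umask)

# Default umask 022: group/other lose the write bit on new directories.
print(hdfs_mode(0o777, 0o022))  # 0o755
# Umask 000: nothing is masked, so new directories come out as 777.
print(hdfs_mode(0o777, 0o000))  # 0o777
```

This is why, with umask 000, every new warehouse directory is world-accessible until you tighten ownership, permissions, or ACLs at the parent level.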
Upvotes: 1