taransaini43

Reputation: 184

Permission Error | Apache Drill Query | HDFS

I am trying to query Parquet files stored on an HDFS cluster via Apache Drill (distributed mode). I have created a new storage plugin named 'hdfs' with the following configuration:

{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://<my-name-node-host>:8020",
  "config": null,
  "workspaces": {
    "root": {
      "location": "/",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "parquet": {
      "type": "parquet"
    }
  }
}

In HDFS, I have the sample file region.parquet in the /user/tj/ folder. Its owner and group are hdfs:hdfs by default, and I would like to keep it that way.

But when I try to query it from the Apache Drill UI with the following SQL query:

SELECT * FROM hdfs.`/user/tj/region.parquet`

it throws the exception below:

org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: RemoteException: Permission denied: user=, access=EXECUTE, inode="/user/tj/region.parquet/.drill.parquet_metadata":hdfs:hdfs:-rw-r--r--
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
    at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3972)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1130)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:851)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
[Error Id: 24bf0cf0-0181-4c72-97ee-4b4eb98771bf on :31010]

How do I fix this permission issue so that I can query files on the Hadoop cluster using Apache Drill?

How can I execute the query as the hdfs user?

Upvotes: 0

Views: 553

Answers (1)

charan tej

Reputation: 1054

I think you have to configure user impersonation. You can follow the link below to give Apache Drill permission to view the files. I haven't actually used Apache Drill myself, so please leave a comment to say whether it works.

Configure user impersonation link
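
For orientation, here is a rough sketch of the kind of change that page describes, as I understand it. Impersonation is switched on in drill-override.conf on every Drillbit node. The cluster-id and zk.connect values below are placeholders, not taken from the question, and the hop limit of 3 is only an example value:

```
drill.exec: {
  # Placeholder values: keep whatever your cluster already uses here.
  cluster-id: "<your-cluster-id>",
  zk.connect: "<zk-host>:<zk-port>",

  # Run queries as the connected user instead of the Drillbit process user.
  impersonation: {
    enabled: true,
    max_chained_user_hops: 3
  }
}
```

On the Hadoop side, the account that runs the Drillbit process usually also has to be allowed to act as a proxy user, via the hadoop.proxyuser.<drillbit_user>.hosts and hadoop.proxyuser.<drillbit_user>.groups properties in core-site.xml, where <drillbit_user> stands for whatever account starts Drill on your nodes. Both Drill and HDFS would need a restart to pick up these settings. I have not tested this against the setup in the question.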

Upvotes: 0
