Reputation: 2221
I'm trying to explore Apache Drill. I'm not a data analyst, just an infra support guy. The documentation on Apache Drill seems too limited.
I need some details about custom data storage that can be used with Apache Drill.
Thanks in advance
Update:
My HDFS storage definition gives an error (Invalid JSON mapping):
{
  "type":"file",
  "enabled":true,
  "connection":"hdfs:///",
  "workspaces":{
    "root":{
      "location":"/",
      "writable":true,
      "storageformat":"null"
    }
  }
}
If I replace hdfs:/// with file:///, it seems to accept it.
I copied all the library files from the folder <drill-path>/jars/3rdparty to <drill-path>/jars/.
I still cannot make it work. Please help. I'm not a dev at all, I'm an infra guy.
Thanks in advance
Upvotes: 1
Views: 1655
Reputation: 438
Yes, Drill can communicate with both the Hadoop system and RDBMS systems together. In fact, you can run queries that join both systems.
The HDFS storage plugin can be defined as follows:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://xxx.xxx.xxx.xxx:8020/",
  "workspaces": {
    "root": {
      "location": "/user/cloudera",
      "writable": true,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    },
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "json": {
      "type": "json"
    }
  }
}
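One way to compare a definition against the example above before pasting it into Drill is a small check like the following. This is only a sketch: it assumes the workspace keys used in the example (location, writable, defaultInputFormat) are the ones Drill accepts, and the function name check_workspaces is made up for illustration.

```python
import json

# Assumption: the workspace keys from the example above are the accepted ones.
KNOWN_WORKSPACE_KEYS = {"location", "writable", "defaultInputFormat"}


def check_workspaces(plugin_json):
    """Return a list of problems found in the "workspaces" section."""
    config = json.loads(plugin_json)  # raises ValueError on malformed JSON
    problems = []
    for name, workspace in config.get("workspaces", {}).items():
        for key, value in workspace.items():
            if key not in KNOWN_WORKSPACE_KEYS:
                problems.append(f"{name}: unknown key '{key}'")
            elif key == "defaultInputFormat" and value == "null":
                # JSON null and the string "null" are different values.
                problems.append(f'{name}: use null, not the string "null"')
    return problems


# The definition from the question, unchanged.
question_config = """
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs:///",
  "workspaces": {
    "root": {"location": "/", "writable": true, "storageformat": "null"}
  }
}
"""

print(check_workspaces(question_config))
```

Run against the definition from the question, this flags the storageformat key, which does not appear in the example above.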
The connection URL will be your MapR/Cloudera URL, with port 8020 by default. You should be able to find it in the Hadoop configuration on your system under the key "fs_defaultfs".
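If you would rather read the value from a file than from a management UI, something like this could pull it out. It is a sketch under assumptions: that your cluster keeps the setting in Hadoop's core-site.xml under the property name fs.defaultFS (the key is written as fs_defaultfs above), and that the sample XML and host name here are placeholders, not real values.

```python
import xml.etree.ElementTree as ET


def default_fs(core_site_xml):
    """Return the fs.defaultFS value from core-site.xml text, or None if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return prop.findtext("value")
    return None


# Hypothetical core-site.xml content; the host name is a placeholder.
sample = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(default_fs(sample))
```

The returned string, followed by a trailing slash, is what would go into the "connection" field of the storage plugin.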
Upvotes: 0
Reputation: 1704
- Yes.
Drill directly recognizes the schema of the file based on the metadata. See this link for more info:
https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources
- Not Yet.
There is a MapR driver that lets you achieve this, but it is not natively supported in Drill yet. There have been several discussions around this, and it might be added soon.
Upvotes: 1