Reputation: 65560
I am using Apache Drill 1.10 on OSX. I have an S3 bucket with about 150 files in, and I have set up a connection between the two as per the Drill documentation.
I can see all the files in the bucket, from Drill:
jdbc:drill:zk=local> USE `s3`.`root`;
+-------+--------------------------------------+
| ok | summary |
+-------+--------------------------------------+
| true | Default schema changed to [s3.root] |
+-------+--------------------------------------+
1 row selected (1.123 seconds)
jdbc:drill:zk=local> SHOW files;
<shows table of 15 files>
And I can query individual files successfully:
jdbc:drill:zk=local> select * from s3.`ocds-b5fd17-00ec7c92-54f3-4c50-8214-8b8c0cf9ff09-140281-qc54303.json`;
<returns results>
But how do I query across the whole directory? On the local filesystem I'd just supply a wildcard like dfs
./path/*.json
, but that doesn't seem to work with S3:
jdbc:drill:zk=local> select * from s3.`*.json`;
Error: VALIDATION ERROR: Can not create a Path from an empty string
SQL Query null
Upvotes: 1
Views: 1847
Reputation: 512
The following query will run over the root directory:
select * from s3.`.`;
Upvotes: 3
Reputation: 65560
I figured this out, recording it here for the benefit of anyone else with the same problem.
Create a directory in the root of the bucket, called e.g. releases
, and put the files in there.
Then you can just query all files in the directory with:
select * from s3.`releases`;
Upvotes: -1