Reputation: 587
My results are stored in Amazon S3 in Parquet format.
My requirements are as follows:
Options I looked into:
ListObjectsV2Request: I can't use this yet because we have not upgraded to AWS Java SDK 2.0.
S3 Select: since S3 Select needs the exact key of the object I want to read, I would first have to list all the parts in S3 and then run S3 Select on each part to get the results. I am also not sure how to paginate the input stream that S3 returns.
I am also looking into "Read parquet data from AWS s3 bucket", but I am not clear on how to paginate the results.
Any input/help will be highly appreciated.
Upvotes: 1
Views: 1876
Reputation: 269360
This sounds like an excellent use case for Amazon Athena. It can:
See:
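On the pagination part: as far as I know, Athena's GetQueryResults call returns rows one page at a time together with a NextToken, and you pass that token back to get the following page. Below is a minimal sketch of that token loop in plain Java. It does not call the AWS SDK: Page, TokenPager, and fakeSource are hypothetical names made up for illustration, and the in-memory source only stands in for whatever call actually fetches a page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** One page of rows plus the token for the next page (null when there are no more pages). */
final class Page<T> {
    final List<T> rows;
    final String nextToken;

    Page(List<T> rows, String nextToken) {
        this.rows = rows;
        this.nextToken = nextToken;
    }
}

final class TokenPager {
    /**
     * Collects every row by feeding each returned token into the next fetch.
     * fetchPage is a placeholder for the real paged call (assumption: a null
     * token asks for the first page, a null nextToken means the last page).
     */
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String token = null; // null requests the first page
        do {
            Page<T> page = fetchPage.apply(token);
            all.addAll(page.rows);
            token = page.nextToken;
        } while (token != null);
        return all;
    }

    /** In-memory stand-in for a paged result set; the token is just the next start index. */
    static Function<String, Page<Integer>> fakeSource(List<Integer> data, int pageSize) {
        return token -> {
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, data.size());
            String next = (end < data.size()) ? String.valueOf(end) : null;
            return new Page<>(new ArrayList<>(data.subList(start, end)), next);
        };
    }
}
```

If your own API has to be paginated for its callers, you would probably not loop like fetchAll does. Instead, fetch a single page per request and hand its nextToken back to the caller so they can ask for the next page.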
Upvotes: 2