Reputation: 11315
I have an Athena/PrestoDB query that returns up to 300 million device IDs. When I execute it in the AWS UI, the results are displayed in under 1 minute, and I was able to download the full results (319MB) in a few minutes from the link provided in the UI.
When I execute the same query over the JDBC connection, I receive a "method not implemented" error. It appears that AthenaJDBC41-1.0.0.jar from the AWS docs does not implement ResultSet.getCharacterStream yet.
ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: com.amazonaws.athena.jdbc.NotImplementedException: Method ResultSet.getCharacterStream is not yet implemented: SELECT distinct(device_id) FROM presales.sightings_v3 WHERE DATE(date) BETWEEN DATE('2016-03-01') AND DATE('2016-03-02') AND ( contains(audiences, 1133) OR contains(audiences, 1149) OR contains(audiences, 1184) );
I'm using driver AthenaJDBC41-1.0.0.jar from the AWS docs and my example connection can be seen here.
My guess is that the method ResultSet.getCharacterStream is only used with large results since my other queries work fine.
Ideally, I would like the response to contain the query_id or S3 path rather than streaming the large result set. How does the Athena UI generate a link to the results on S3?
Upvotes: 0
Views: 787
Reputation: 71
You can get the query ID from the ResultSet:
((AthenaStatementClient)((AthenaResultSet)rs).getClient()).getQueryExecutionId()
With that, you can build the S3 path as:
<s3_staging_dir>/<query_id>.csv
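As a rough sketch of the path construction described above: the class name AthenaResultPath, the method build, and the bucket and query ID values are all hypothetical and used only for illustration. It assumes the staging dir is the same value you passed to the driver as s3_staging_dir and that the results are written as <query_id>.csv directly under it, as stated above.

```java
// Hypothetical helper, not part of the Athena JDBC driver.
final class AthenaResultPath {
    private AthenaResultPath() {}

    // Joins the staging dir and the query ID into "<s3_staging_dir>/<query_id>.csv".
    // Trailing slashes on the staging dir are dropped so the result has exactly one separator.
    public static String build(String stagingDir, String queryId) {
        String base = stagingDir.replaceAll("/+$", "");
        return base + "/" + queryId + ".csv";
    }

    public static void main(String[] args) {
        // Example values only; substitute your own staging dir and the ID
        // obtained from getQueryExecutionId().
        String path = build("s3://my-bucket/athena-staging/",
                            "11111111-2222-3333-4444-555555555555");
        System.out.println(path);
    }
}
```

You would pass in the ID returned by getQueryExecutionId() and then fetch the object at that path with whatever S3 client you already use, instead of streaming the rows over JDBC.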
Upvotes: 1