Reputation: 211
I'm trying to query a large dataset from Athena using AWS Data Wrangler, and the query fails when the dataset is large. This happens while setting up a Data Wrangler flow through the UI in SageMaker Studio and adding an Athena source.
Some observations:
Has anyone encountered a similar problem? Are there any timeout settings for Data Wrangler?
Upvotes: 2
Views: 545
Reputation: 171
I had the same issue with Snowflake as a source. I opened a support ticket, and according to support they are working on improving performance for large datasets.
As a workaround, export the flow to a SageMaker pipeline and run it as a Processing Job on multiple instances. The job runs in a distributed environment using Spark, so the work is spread across the instances.
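For illustration, here is a rough sketch of what the multi-instance part of that workaround could look like as a request for boto3's `create_processing_job`. It is not the code Data Wrangler exports. The helper name `build_processing_job_request`, the output name, the local paths, and the default instance type are all assumptions made for this example; the notebook exported from your own flow supplies the real container image URI and input/output names.

```python
def build_processing_job_request(
    job_name: str,
    role_arn: str,
    image_uri: str,
    flow_s3_uri: str,
    output_s3_uri: str,
    instance_count: int = 2,
    instance_type: str = "ml.m5.4xlarge",  # assumed default, pick what fits your data
    volume_size_gb: int = 30,
) -> dict:
    """Build keyword arguments for sagemaker_client.create_processing_job.

    Hypothetical helper: it only assembles the request dict and never calls AWS.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")

    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        # Placeholder: the Data Wrangler container image differs per region.
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {
            "ClusterConfig": {
                # More than one instance lets the job spread work across the cluster.
                "InstanceCount": instance_count,
                "InstanceType": instance_type,
                "VolumeSizeInGB": volume_size_gb,
            }
        },
        "ProcessingInputs": [
            {
                "InputName": "flow",
                "S3Input": {
                    "S3Uri": flow_s3_uri,
                    # Assumed container path for the exported .flow file.
                    "LocalPath": "/opt/ml/processing/flow",
                    "S3DataType": "S3Prefix",
                    "S3InputMode": "File",
                    "S3DataDistributionType": "FullyReplicated",
                },
            }
        ],
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    # Placeholder name: use the output name from your exported notebook.
                    "OutputName": "flow-output",
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ]
        },
    }
```

With credentials configured, the resulting dict could be passed along as `boto3.client("sagemaker").create_processing_job(**request)`. Raising `instance_count` is the part that corresponds to "run it on multiple instances" above.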
Upvotes: 0