Reputation: 416
I am trying to use AWS Glue to run an ETL job that fetches data from Redshift to S3.
When I run a crawler, it successfully connects to Redshift and fetches the schema information. The relevant logs are created under the log group /aws-glue/crawlers.
When I run the ETL job, it is supposed to create a log stream under the log groups /aws-glue/jobs/output and /aws-glue/jobs/error, but no such log streams are created, and the job eventually fails as well.
(I am using the AWS managed AWSGlueServiceRole policy for the Glue service.)
Since it does not produce any logs, it is difficult to identify the reason for ETL job failure. I would appreciate it if you could help me resolve this issue.
Upvotes: 4
Views: 6957
Reputation: 459
Most of the time this has to do with your AWS service not having the correct permissions (yes, even for just writing logs!).
Adding something like this to the Glue role might do the trick:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}
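If you would rather attach this from a script than from the console, here is a minimal sketch using boto3. The function names, the policy name GlueCloudWatchLogsAccess, and the role name in the usage note are placeholders I made up, not anything from your setup; `put_role_policy` is the standard IAM client call for adding an inline policy to a role.

```python
import json


def build_glue_logs_policy(resource="arn:aws:logs:*:*:*"):
    """Return the CloudWatch Logs policy shown above as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": resource,
            }
        ],
    }


def attach_logs_policy(role_name, policy_name="GlueCloudWatchLogsAccess"):
    """Attach the policy inline to an existing IAM role.

    policy_name is a hypothetical default; pick whatever fits your naming.
    """
    import boto3  # imported here so the policy builder works without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_glue_logs_policy()),
    )
```

Calling `attach_logs_policy("MyGlueRole")` with credentials that are allowed to modify the role (MyGlueRole is a placeholder) would add the statement as an inline policy. You could also pass a narrower `resource` to `build_glue_logs_policy` if you do not want to grant access to every log group.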
Upvotes: 7
Reputation: 57
I would make sure that your endpoint and VPC are set up correctly by following these instructions:
http://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-glue-access.html
I had my inbound rules set up correctly but had not set up the outbound rules on the security group, which I think was the cause of the issue.
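To check this quickly, here is a rough sketch that inspects one security group entry in the shape returned by boto3's `ec2.describe_security_groups`. It assumes, as I understand the linked guide, that Glue wants a self-referencing inbound rule covering all TCP ports, plus outbound rules that either allow all traffic or are self-referencing as well. The function names and the sample group ID are made up for illustration.

```python
def covers_all_tcp(permission):
    """True if the rule covers all protocols or the full TCP port range."""
    proto = permission.get("IpProtocol")
    if proto == "-1":  # "-1" means all protocols and ports
        return True
    return (
        proto == "tcp"
        and permission.get("FromPort") == 0
        and permission.get("ToPort") == 65535
    )


def has_self_reference(permissions, group_id):
    """True if some all-TCP rule names the group itself as source/target."""
    return any(
        covers_all_tcp(p)
        and any(
            pair.get("GroupId") == group_id
            for pair in p.get("UserIdGroupPairs", [])
        )
        for p in permissions
    )


def allows_all_outbound(permissions):
    """True if some rule allows all traffic to 0.0.0.0/0."""
    return any(
        p.get("IpProtocol") == "-1"
        and any(r.get("CidrIp") == "0.0.0.0/0" for r in p.get("IpRanges", []))
        for p in permissions
    )


def check_group(group):
    """Summarise whether one security group matches the assumed Glue rules."""
    gid = group["GroupId"]
    egress = group.get("IpPermissionsEgress", [])
    return {
        "inbound_self_reference": has_self_reference(
            group.get("IpPermissions", []), gid
        ),
        "outbound_ok": has_self_reference(egress, gid)
        or allows_all_outbound(egress),
    }
```

With real data you would pass in each element of `boto3.client("ec2").describe_security_groups(GroupIds=[...])["SecurityGroups"]`. In my case the result would have shown the inbound check passing and the outbound check failing.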
Upvotes: 0