Reputation: 21
I am trying to use the AWS Data Pipeline service in the following manner:
The script specified in step 1 (i.e. the script URI of the activity) has two lines: the first copies the S3 bucket data to the instance, and the second runs the python command that executes my program. The AMI I created is based on an Ubuntu EC2 instance and contains some Python software as well as the code I would like to run.
Now, on initiation of the pipeline, I notice that the EC2 instance is indeed created and the S3 data is copied and made available to the instance, but the python command is not run. The instance stays in the running state and the pipeline stays in the "waiting for runner" state for some time, and then the pipeline fails with the message: "Resource stalled".
Can someone please let me know whether I am doing something wrong, why my Python code is not being executed, or why I am getting the "Resource stalled" error? The code works fine if I run it manually without the pipeline.
Thanks in advance!!
Upvotes: 2
Views: 2098
Reputation: 410
"Resource stalled" almost always means there is a problem with the setup of your custom AMI. The requirements are documented here. The short bullets:
A custom AMI must meet the following requirements for AWS Data Pipeline to use it successfully for Task Runner:
- Create the AMI in the same region that the instances will run in.
- Ensure that the virtualization type of the AMI is supported by the instance type you plan to use. For example, the I2 and G2 instance types require an HVM AMI and the T1, C1, M1, and M2 instance types require a PV AMI.
- Install the following software:
  - Linux
  - Bash
  - wget
  - unzip
  - Java 1.6 or newer
  - cloud-init
- Create and configure a user account named ec2-user.
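One way to narrow this down is to launch an instance from the custom AMI and check the software and user requirements above by hand. The following is a minimal sketch of such a check, not an official AWS tool. The executable names (for example, that cloud-init is on the PATH as `cloud-init`) and the helper names `user_exists` and `find_missing` are my own assumptions. It only checks that `java` is present, not that the version is 1.6 or newer. On an Ubuntu-based AMI like the one in the question, the default account is typically `ubuntu`, so the `ec2-user` check is a likely one to fail.

```python
import pwd
import shutil

# Commands taken from the requirements list above. The executable names are
# assumptions (e.g. cloud-init is assumed to be on PATH as "cloud-init").
REQUIRED_COMMANDS = ["bash", "wget", "unzip", "java", "cloud-init"]
REQUIRED_USER = "ec2-user"


def user_exists(name):
    """Return True if a local account with this name exists."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def find_missing(commands=REQUIRED_COMMANDS, user=REQUIRED_USER,
                 which=shutil.which, has_user=user_exists):
    """Return a list of problems found; an empty list means every check passed.

    `which` and `has_user` can be swapped out, which makes the function easy
    to exercise without touching the real system.
    """
    problems = [f"missing command: {cmd}" for cmd in commands if which(cmd) is None]
    if not has_user(user):
        problems.append(f"missing user: {user}")
    return problems


if __name__ == "__main__":
    issues = find_missing()
    for issue in issues:
        print(issue)
    if not issues:
        print("All listed prerequisites were found.")
```

Run it on an instance started from the AMI (not on your workstation). Anything it reports as missing is worth fixing and re-baking into the AMI before retrying the pipeline.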
Upvotes: 1