Joe Monty

Reputation: 21

Trying to build an automation script on AWS Data Pipeline

I am trying to use the AWS Data Pipeline service in the following manner:

  1. Set the activity type to Shell Command Activity, with the script URI pointing to a script in an S3 bucket and Stage Input set to true.
  2. Set the resource type of the activity to EC2.
  3. Use S3 as the data node.
  4. For the EC2 resource, set the instance type to t2.medium and the image ID to a custom AMI that I created.
  5. Schedule the pipeline to run every day at 10pm.
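
For reference, the setup in the steps above could be written as a pipeline definition roughly like the following. This is only a sketch: the bucket, script path, AMI ID, object IDs, and start date are hypothetical placeholders, and the IAM role names are assumed to be the service defaults. The code builds the definition as a Python dictionary and serializes it to JSON.

```python
import json

# Hypothetical placeholders: bucket, script path, AMI ID, and start date
# are illustrative and do not come from the question.
BUCKET = "s3://example-bucket"
CUSTOM_AMI_ID = "ami-xxxxxxxx"

pipeline_objects = [
    {
        # Defaults inherited by every other object in the pipeline.
        "id": "Default",
        "name": "Default",
        "scheduleType": "cron",
        "schedule": {"ref": "NightlySchedule"},
        "failureAndRerunMode": "CASCADE",
        # Assumed to be the default Data Pipeline roles.
        "role": "DataPipelineDefaultRole",
        "resourceRole": "DataPipelineDefaultResourceRole",
        # Task Runner logs land here, which helps when a run stalls.
        "pipelineLogUri": f"{BUCKET}/logs/",
    },
    {
        # Step 5: run once a day; the start time is interpreted as UTC.
        "id": "NightlySchedule",
        "name": "NightlySchedule",
        "type": "Schedule",
        "period": "1 days",
        "startDateTime": "2016-01-01T22:00:00",
    },
    {
        # Step 3: S3 data node that is staged onto the instance.
        "id": "InputData",
        "name": "InputData",
        "type": "S3DataNode",
        "directoryPath": f"{BUCKET}/input/",
    },
    {
        # Steps 2 and 4: EC2 resource launched from the custom AMI.
        "id": "WorkerInstance",
        "name": "WorkerInstance",
        "type": "Ec2Resource",
        "instanceType": "t2.medium",
        "imageId": CUSTOM_AMI_ID,
        "terminateAfter": "2 Hours",
    },
    {
        # Step 1: shell command activity with staging enabled.
        "id": "RunScript",
        "name": "RunScript",
        "type": "ShellCommandActivity",
        "scriptUri": f"{BUCKET}/scripts/run.sh",
        "stage": "true",
        "input": {"ref": "InputData"},
        "runsOn": {"ref": "WorkerInstance"},
    },
]

definition = {"objects": pipeline_objects}
definition_json = json.dumps(definition, indent=2)

if __name__ == "__main__":
    print(definition_json)
```

Note that the custom AMI is referenced through the `imageId` field of the EC2 resource, so any problem with that image affects every run of the activity.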

The script specified in step 1 (the script URI in the activity) has two lines:

  1. Copy the S3 bucket data to the instance.
  2. Run the python command to execute my program.

The AMI I created is based on an Ubuntu EC2 instance and contains some Python software along with the code I would like to run.

Now, when the pipeline starts, I can see that the EC2 instance is indeed created and that the S3 data is copied and made available to the instance, but the python command is not run. The instance stays in the running state and the pipeline stays in the "waiting for runner" state for some time, and then the pipeline fails with the message: "Resource stalled".

Can someone please let me know whether I am doing something wrong, why my Python code is not being executed, or why I am getting the "Resource stalled" error? The code works fine if I run it manually without the pipeline.

Thanks in advance!!

Upvotes: 2

Views: 2098

Answers (1)

Brian R Armstrong

Reputation: 410

"Resource stalled" almost always means there is a problem with the setup of your custom AMI. The requirements are documented here. In short:

A custom AMI must meet the following requirements for AWS Data Pipeline to use it successfully for Task Runner:

  • Create the AMI in the same region that the instances will run in.
  • Ensure that the virtualization type of the AMI is supported by the instance type you plan to use. For example, the I2 and G2 instance types require an HVM AMI and the T1, C1, M1, and M2 instance types require a PV AMI.
  • Install the following software:
    • Linux
    • Bash
    • wget
    • unzip
    • Java 1.6 or newer
    • cloud-init
  • Create and configure a user account named ec2-user.
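
One way to check an image against this list is to launch an instance from the custom AMI and run a small preflight script on it. The sketch below is an assumption about how such a check could look, not an official AWS tool: the function names are made up for illustration, and it only tests that the commands are on the PATH and that the user exists (it does not verify the Java version). Ubuntu images typically ship with a default `ubuntu` user rather than `ec2-user`, so the user check is worth running on an Ubuntu-based AMI.

```python
import pwd
import shutil

# Software and user account taken from the requirements list above.
REQUIRED_COMMANDS = ["bash", "wget", "unzip", "java", "cloud-init"]
REQUIRED_USER = "ec2-user"


def user_exists(name):
    """Return True if a local user account with this name exists."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def missing_requirements(which=shutil.which, has_user=user_exists):
    """Return the required commands and user that are missing.

    `which` and `has_user` are injectable so the check can be tried
    without touching the real system.
    """
    missing = [cmd for cmd in REQUIRED_COMMANDS if which(cmd) is None]
    if not has_user(REQUIRED_USER):
        missing.append(f"user:{REQUIRED_USER}")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("missing:", ", ".join(problems) if problems else "nothing")
```

If the script reports anything as missing, fixing it on the instance and creating a new AMI from that instance is a reasonable first step before rerunning the pipeline.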

Upvotes: 1
