Reputation: 5242
With the exact same Data Pipeline configuration, differing only in the AMI used (Amazon Linux vs. Ubuntu), my Data Pipeline execution succeeds in both cases, but it only writes logs to S3 when using Amazon Linux.
With Amazon Linux
With Ubuntu
In both cases I log in as the same user (ec2-user, not ubuntu). For the Ubuntu AMI, I configured that username as follows:
#cloud-config
system_info:
  default_user:
    name: ec2-user
Moreover, I use the exact same resourceRole and role attributes when launching Amazon Linux or Ubuntu pipelines, so that's not the issue.
So apparently Amazon Linux has something that is needed for writing logs to S3. What could it be?
Upvotes: 3
Views: 886
Reputation: 971
This happens because TaskRunner uses a Java library called Joda to generate timestamps for logging. Some versions of the JRE ship with a buggy version of the Joda jar, so any AMI that uses such a version (anything above Java 6, in my experience so far) won't be able to write logs correctly.
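To check which Java version an AMI actually runs before blaming the logging, something like the sketch below might help. The function name java_major_version is made up for illustration. It assumes the usual `java -version` banner format, where the first line contains the version in double quotes.

```shell
#!/bin/sh
# Hypothetical helper: print the major Java version found in
# `java -version` style text passed as the first argument.
# Handles the legacy "1.x" scheme (1.6.0_45 -> 6) and the newer one (11.0.2 -> 11).
java_major_version() {
    # Pull the quoted version string out of the first line, e.g. 1.6.0_45
    ver=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        "")  return 1 ;;                                        # no version string found
        1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;       # 1.6.0_45 -> 6
        *)   printf '%s\n' "${ver%%.*}" ;;                      # 11.0.2 -> 11
    esac
}
```

On the instance you would call it as `java_major_version "$(java -version 2>&1)"`, since `java -version` prints its banner to stderr. If it prints anything above 6, the AMI falls into the group that gave me trouble.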
I'd recommend including something like
alternatives --set java /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java
before all your scripts in the ShellCommandActivity. This fixed the problem for me.
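A slightly more defensive version of that preamble could look like the sketch below. It rests on several assumptions. The JRE path is the one from my Amazon Linux setup and will likely differ on other AMIs. The names JAVA6_BIN, pick_alternatives_tool and switch_to_java6 are invented here. The tool is called alternatives on Red Hat-family systems but update-alternatives on Debian/Ubuntu. Whether the activity's user has permission to change the alternative also depends on your setup.

```shell
#!/bin/sh
# Assumed location of a Java 6 JRE (Amazon Linux / RHEL-style path from the
# command above); override JAVA6_BIN for other AMIs.
JAVA6_BIN="${JAVA6_BIN:-/usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java}"

# Hypothetical helper: print the name of whichever alternatives tool exists.
pick_alternatives_tool() {
    if command -v alternatives >/dev/null 2>&1; then
        echo alternatives
    elif command -v update-alternatives >/dev/null 2>&1; then
        echo update-alternatives
    else
        return 1
    fi
}

# Hypothetical helper: point the `java` alternative at Java 6 if it is installed.
switch_to_java6() {
    if [ ! -x "$JAVA6_BIN" ]; then
        echo "Java 6 binary not found at $JAVA6_BIN; leaving the default JVM" >&2
        return 1
    fi
    tool=$(pick_alternatives_tool) || {
        echo "no alternatives tool found" >&2
        return 1
    }
    "$tool" --set java "$JAVA6_BIN"
}
```

Calling `switch_to_java6 || true` as the first line of the ShellCommandActivity command keeps the activity running even when Java 6 is missing, at the cost of silently falling back to the default JVM.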
Alternatively, you could just always use an AMI ID that is known to have Java 6 on it.
Upvotes: 2