deprecated

Reputation: 5242

Data Pipeline S3 logs not written (only written if using Amazon Linux)

Using exactly the same Data Pipeline configuration and changing only the AMI (Amazon Linux vs. Ubuntu), my pipeline execution succeeds in both cases, but logs are written to S3 only when I use Amazon Linux.

With Amazon Linux:

[screenshot]

With Ubuntu:

[screenshot]

In both cases I log in as the same user (ec2-user, not ubuntu); I configured that username for the Ubuntu AMI as follows:

#cloud-config
system_info:
  default_user:
    name: ec2-user

Moreover, I use exactly the same resourceRole and role attributes when launching the Amazon Linux and Ubuntu pipelines, so permissions are not the issue.

So apparently Amazon Linux has something that is needed for writing logs to S3. What could it be?

Upvotes: 3

Views: 886

Answers (1)

shrikant

Reputation: 971

This happens because TaskRunner uses a Java library called Joda-Time to generate the timestamps for its logs. Some versions of the JRE ship with a buggy version of the Joda jar, so any AMI that uses one of those versions (anything above Java 6, in my experience so far) won't be able to write logs correctly.

I'd recommend running something like alternatives --set java /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java at the start of every script in your ShellCommandActivity; this fixed the problem for me.
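
As a rough sketch of that idea, the snippet below defines a guard that a ShellCommandActivity script could call first. The function names java_major and pin_java6 and the JAVA6_BIN variable are hypothetical. The default path is simply the one quoted above and may not exist on your AMI. On Debian or Ubuntu images the tool is normally called update-alternatives rather than alternatives, so the sketch tries both. Changing alternatives usually requires root.

```shell
#!/usr/bin/env bash
# Sketch only: switch the default JVM to Java 6 before the rest of a
# ShellCommandActivity script runs. All names here are illustrative.

# Path quoted in the answer; override it if your AMI installs Java 6 elsewhere.
JAVA6_BIN="${JAVA6_BIN:-/usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java}"

# Print the major Java version for one line of `java -version` output,
# e.g. 'java version "1.7.0_79"' -> 7, 'openjdk version "11.0.2"' -> 11.
java_major() {
  local ver
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    '')  return 1 ;;                                      # no version string found
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%[._]*}" ;;  # legacy 1.x numbering
    *)   printf '%s\n' "${ver%%[._]*}" ;;                 # Java 9 and later
  esac
}

# Point the `java` alternative at JAVA6_BIN unless Java 6 is already the default.
pin_java6() {
  local tool current
  current=$(java -version 2>&1 | head -n 1)   # `java -version` writes to stderr
  if [ "$(java_major "$current")" = "6" ]; then
    return 0
  fi
  if command -v alternatives >/dev/null 2>&1; then
    tool=alternatives                          # Amazon Linux / RHEL family
  elif command -v update-alternatives >/dev/null 2>&1; then
    tool=update-alternatives                   # Debian / Ubuntu family
  else
    echo "no alternatives tool found" >&2
    return 1
  fi
  if [ ! -x "$JAVA6_BIN" ]; then
    echo "Java 6 not found at $JAVA6_BIN" >&2
    return 1
  fi
  "$tool" --set java "$JAVA6_BIN"              # typically needs root
}
```

A script would then start with something like pin_java6 || echo "continuing with the default JVM" >&2, so a missing Java 6 install is reported without aborting the activity.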

Alternatively, you could always use an AMI that is known to have Java 6 installed.

Upvotes: 2
