Vincent Claes

Reputation: 4788

AWS Batch in privileged mode: urllib3.exceptions.ConnectTimeoutError + botocore.exceptions.ConnectTimeoutError

My AWS Batch job, running in privileged mode, fails with the following boto/botocore errors:

TimeoutError: timed out
The above exception was the direct cause of the following exception:

urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPConnection object at 0x7f858aa9b700>, 'Connection to 169.254.170.2 timed out. (connect timeout=2)')

botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "http://169.254.170.2/v2/credentials/f379b1f3-1673-43b3-9ae7-523b2534be77"

botocore.exceptions.MetadataRetrievalError: Error retrieving metadata: Received error when attempting to retrieve container metadata: Connect timeout on endpoint URL: "http://169.254.170.2/v2/credentials/f379b1f3-1673-43b3-9ae7-523b2534be77"

botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from container-role: Error retrieving metadata: Received error when attempting to retrieve container metadata: Connect timeout on endpoint URL: "http://169.254.170.2/v2/credentials/f379b1f3-1673-43b3-9ae7-523b2534be77"

What's wrong?

Upvotes: 0

Views: 124

Answers (1)

Vincent Claes

Reputation: 4788

Add this to your Dockerfile:

RUN update-alternatives --set iptables /usr/sbin/iptables-legacy
RUN update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy

The issue stems from a conflict between the two iptables backends (legacy and nftables).

It shows up in AWS Batch when the container image defaults to the nftables backend for iptables, as images based on Ubuntu 22.04 do.
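To see which backend an image is using, one option is to read the output of `iptables --version`, which on iptables 1.8.x ends in `(nf_tables)` or `(legacy)`. The following is only a minimal sketch: the function names are made up for illustration, and it assumes `iptables` is on the PATH inside the container.

```python
import re
import subprocess


def parse_iptables_backend(version_output: str) -> str:
    """Classify `iptables --version` output as 'nf_tables', 'legacy', or 'unknown'."""
    # iptables 1.8.x prints e.g. "iptables v1.8.7 (nf_tables)".
    match = re.search(r"\((nf_tables|legacy)\)", version_output)
    return match.group(1) if match else "unknown"


def current_iptables_backend() -> str:
    """Run `iptables --version` and report the backend it names."""
    try:
        result = subprocess.run(
            ["iptables", "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # iptables is not installed in this image.
        return "unknown"
    return parse_iptables_backend(result.stdout)
```

Calling `current_iptables_backend()` in the image before and after the `update-alternatives` lines should show whether the switch to legacy took effect.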

In our case, the container image used by the AWS Batch job started Docker-in-Docker on startup, and the job ran with the --privileged flag to make that possible.

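Because the Docker daemon uses iptables internally, starting it inside the container loads nftables into the host kernel. That disrupts the existing legacy iptables configuration on the host, including the port forwarding that the AWS ECS Agent relies on. The credentials endpoint at 169.254.170.2 is reached through that forwarding, so botocore's requests to it time out.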
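To check from inside the job whether the credentials endpoint is reachable again, a small probe like the one below can help. It is a sketch under assumptions: the helper names are hypothetical, it relies on the `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` environment variable that ECS sets for task-role credentials, and it reuses the 2-second connect timeout seen in the traceback.

```python
import os
import urllib.error
import urllib.request

# Link-local address of the ECS credentials endpoint, as seen in the traceback.
ECS_CREDENTIALS_HOST = "http://169.254.170.2"


def credentials_url(environ=os.environ):
    """Build the credentials URL from the container environment, or return None."""
    relative_uri = environ.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    if not relative_uri:
        return None
    return ECS_CREDENTIALS_HOST + relative_uri


def endpoint_reachable(url, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Only the status is inspected; the credentials are never printed.
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connect timeouts, refused connections, and HTTP error codes.
        return False
```

If `credentials_url()` returns a URL but `endpoint_reachable(url)` is False, the job is most likely still hitting the forwarding problem described above.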

Source: https://repost.aws/questions/QUCFqv7OfoQlygJrmwfkJ24Q/various-aws-apis-fail-due-to-timeout

Upvotes: 0
