leonsPAPA

Reputation: 797

AWS ECS Task Throttling Outbound Traffic

We built an API app that is deployed as an ECS Fargate task (awsvpc network mode). It makes internal calls to an on-premises server. The task runs in a private subnet of a VPC and reaches the on-premises server through a transit gateway.

To simplify the scenario, we tested by calling the private IP of the task directly. With only a few calls, the task responded correctly. The test is a simple POST call with small request and response payloads.

However, when we increased the number of virtual users to 260 and ramped up the test (i.e., a performance test), the internal calls from the task to the on-premises server started timing out after about a minute. (The call usually takes only 3 seconds, and the timeout is set to 1 minute.)
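As a rough sanity check on these numbers, the expected outbound load can be estimated with Little's law. This is only a sketch under assumptions that the test setup does not confirm: each virtual user sends its next call as soon as the previous one returns, the task opens a new TCP connection to the on-premises server for every call, and the task closes that connection first, so the socket stays in TIME_WAIT for about 60 seconds (the Linux default). The function name `estimate_outbound_load` is made up for illustration.

```python
def estimate_outbound_load(virtual_users, call_seconds, time_wait_seconds=60.0):
    """Rough steady-state estimate for closed-loop users with no think time.

    Assumptions (hypothetical, not measured): one new outbound TCP
    connection per call, and the task is the side that closes it.
    """
    # Little's law: throughput = concurrency / latency.
    calls_per_second = virtual_users / call_seconds
    # Each in-flight call holds one open outbound connection.
    open_connections = virtual_users
    # Closed sockets linger in TIME_WAIT, so they accumulate at the call rate.
    time_wait_sockets = virtual_users * time_wait_seconds / call_seconds
    return {
        "calls_per_second": round(calls_per_second, 1),
        "open_connections": open_connections,
        "time_wait_sockets": round(time_wait_sockets),
    }


if __name__ == "__main__":
    print(estimate_outbound_load(virtual_users=260, call_seconds=3.0))
```

With 260 users and a 3-second call, this works out to roughly 87 calls per second and about 5,200 sockets in TIME_WAIT, well below the default Linux ephemeral port range of about 28,000 ports. If the assumptions hold, local port exhaustion alone would not explain the timeouts.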

Some articles mention that Fargate tasks have a limit on outbound TCP connections, but the AWS documentation does not mention one.
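One way to see whether outbound connections are piling up during the test is to count TCP sockets by state from inside the container. The following is a minimal sketch, assuming a Linux container where `/proc/net/tcp` and `/proc/net/tcp6` are readable and a shell is available in the task. The helper names `count_tcp_states` and `read_tcp_states` are made up for illustration.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_text):
    """Count sockets per TCP state in the text of a /proc/net/tcp file."""
    counts = Counter()
    for line in proc_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: slot, local address, remote address, state, ...
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


def read_tcp_states(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Sum the state counts over IPv4 and IPv6 tables, skipping missing files."""
    total = Counter()
    for path in paths:
        try:
            with open(path) as fh:
                total += count_tcp_states(fh.read())
        except OSError:
            continue
    return total


if __name__ == "__main__":
    for state, n in sorted(read_tcp_states().items()):
        print(f"{state:12} {n}")
```

Running this every few seconds during the ramp-up would show whether sockets in SYN_SENT, ESTABLISHED, or TIME_WAIT keep growing before the timeouts start. A growing SYN_SENT count would point at the network path, while a growing ESTABLISHED count would point at the app or the on-premises server holding connections open.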

We are not sure whether this is an app configuration issue, an infrastructure issue, or a networking issue.

This is the egress config for the security group, which looks fine to me:

egress {
  from_port   = 0
  to_port     = 0
  protocol    = "-1"
  cidr_blocks = ["0.0.0.0/0"]
}

Could this be caused by an issue in the app code? Thanks.

This is the task definition in JSON (with real account IDs, ARNs, etc. removed):

{
  "taskDefinitionArn": "arn",
  "containerDefinitions": [
      {
          "name": "name",
          "image": "...",
          "cpu": 8192,
          "memory": 32768,
          "portMappings": [
              {
                  "containerPort": 5004,
                  "hostPort": 5004,
                  "protocol": "tcp"
              }
          ],
          "essential": true,
          "environment": [
              ...
          ],
          "mountPoints": [],
          "volumesFrom": [],
          "secrets": [
             ...
          ],
          "logConfiguration": {
              "logDriver": "awslogs",
              "options": {
                  "awslogs-group": "/ecs/mygroup",
                  "awslogs-region": "us-west-2",
                  "awslogs-stream-prefix": "ecs"
              }
          },
          "systemControls": []
      }
  ],
  "family": "myfamily",
  "taskRoleArn": "arn",
  "executionRoleArn": "roleArn",
  "networkMode": "awsvpc",
  "revision": 231,
  "volumes": [],
  "status": "ACTIVE",
  "requiresAttributes": [
      {
          "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
      },
      {
          "name": "ecs.capability.execution-role-awslogs"
      },
      {
          "name": "com.amazonaws.ecs.capability.ecr-auth"
      },
      {
          "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
      },
      {
          "name": "ecs.capability.secrets.asm.environment-variables"
      },
      {
          "name": "ecs.capability.increased-task-cpu-limit"
      },
      {
          "name": "com.amazonaws.ecs.capability.task-iam-role"
      },
      {
          "name": "ecs.capability.execution-role-ecr-pull"
      },
      {
          "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
      },
      {
          "name": "ecs.capability.task-eni"
      }
  ],
  "placementConstraints": [],
  "compatibilities": [
      "EC2",
      "FARGATE"
  ],
  "requiresCompatibilities": [
      "FARGATE"
  ],
  "cpu": "8192",
  "memory": "32768",
  "registeredAt": "2024-08-22T18:30:57.544Z",
  "deregisteredAt": "2024-08-22T20:23:45.607Z",
  "registeredBy": "...",
  "tags": [
    ....
  ]
}

Upvotes: 2

Views: 132

Answers (0)