Chris Shenton

Reputation: 125

Docker Compose deploy to ECS cannot resolve service names (YELB)

I'm deploying a simple Compose file with a Flask app server talking to a PSQL database in another container; it works fine locally.

When using the new-ish Docker Compose deploy to ECS context, it fails: the app server cannot resolve the DNS name of the "psql" server. I see it created a CloudMap with entries for my service name, but to no avail.

Docker version 20.10.8, build 3967b7d

So I followed the YELB tutorial by @mreferre (https://aws.amazon.com/de/blogs/containers/deploy-applications-on-amazon-ecs-using-docker-compose/), and it also deployed locally, but failed on "docker compose up" in the ECS context:

YelbuiService CREATE_FAILED Resource creation cancelled

I see in his docker-compose.yml this critical note:

  networks:
    yelb-network:
      driver: bridge # a user defined bridge is required; the default bridge network doesn't support name resolution

But everything I've read says that Compose to ECS uses FARGATE, and that requires an awsvpc network, not a bridge. And before it disappeared, the AWS console showed my YELB Tasks were using awsvpc networks, not bridge.

I defined a bridge network in my Flask/PSQL app anyway and did another docker compose up and noticed this (which probably scrolled off my YELB screen):

  WARNING networks.driver: unsupported attribute

So it seems awsvpc does NOT support Service Discovery by DNS resolution, and we cannot use bridge for Docker to ECS, so I'm stuck.

Or am I missing something?

Do I have to use the separate ecs-cli tools to deploy my Compose file? It's a lot more hairy than the sweet docker compose up we know and love, and requires me to do a lot more work.

Any other guidance?

Thanks for your help.

My repo in progress is at https://github.com/shentonfreude/compose-ecs-flask-psql but for convenience the docker-compose.yml is:

version: "3"                    # ECS may not like decimal suffix

x-aws-logs_retention: 7
x-aws-vpc: vpc-01234567890123456 # in ue1


networks:
  # YELB says: user defined bridge required; default bridge network doesn't support name resolution
  flasknet:
    driver: bridge 

services:

  psql:
    container_name: psql
    image: postgres:9.6.2-alpine
    environment: 
      POSTGRES_USER:     flaskapp
      POSTGRES_DB:       flaskapp
      POSTGRES_PASSWORD: flaskapp
    healthcheck:
      test: "pg_isready --username=flaskapp && psql --username=flaskapp --list"
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - flasknet

  flaskapp:
    container_name: flaskapp
    image: 314159265358.dkr.ecr.us-east-1.amazonaws.com/cshenton/flaskapp:latest
    environment:
      # Neither bare service "psql",
      # FQDN "psql.psql.compose-ecs-flask-psql.local",
      # nor awsvpc "localhost" work.
      DB_HOST:     psql
      DB_PORT:     5432
      DB_NAME:     flaskapp
      DB_PASSWORD: flaskapp
      DB_USER:     flaskapp
    ports:
      - "80:80"
    depends_on:
      psql:
        condition: service_healthy
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
    networks:
      - flasknet

Upvotes: 2

Views: 1470

Answers (2)

Chris Shenton

Reputation: 125

As @mreferre points out, compose up works fine with a default VPC; however, early (2013) accounts don't have one, and many organizations remove theirs.

You can create a new VPC, get its ID, and set it in your Compose file, like x-aws-vpc: vpc-deadc0decafebeef.

But you MUST have one feature turned on in the VPC for DNS to resolve the Compose service names: EnableDnsHostnames: true. Without it, an app server (for example) will not be able to resolve its database server.

Below is a minimal CloudFormation for a VPC with 2 public subnets (required by the ALB):

# VPC with IGW, Default Route, 2 public subnets.
# VPC must EnableDnsHostnames for Fargate Service Discovery by Docker service name.
# Subnets do not need MapPublicIpOnLaunch.

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsHostnames: true  # critical for Service Discovery
      Tags:
      - Key: Name
        Value: !Join ['', [!Ref "AWS::StackName", "-VPC" ]]
  InetGW:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
      - Key: Name
        Value: !Join ['', [!Ref "AWS::StackName", "-IGW" ]]
  VPCGWAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InetGW
      VpcId: !Ref VPC
  RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
      - Key: Name
        Value: !Join ['', [!Ref "AWS::StackName", "-RouteTable" ]]
  InternetRoute:
    Type: AWS::EC2::Route
    Properties:
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InetGW
      RouteTableId: !Ref RouteTable

  Subnet1:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: 10.0.0.0/20
      VpcId: !Ref VPC
      Tags:
      - Key: Name
        Value: !Join ['', [!Ref "AWS::StackName", "-Subnet1" ]]
  Subnet2:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      CidrBlock: 10.0.16.0/20
      VpcId: !Ref VPC
      Tags:
      - Key: Name
        Value: !Join ['', [!Ref "AWS::StackName", "-Subnet2" ]]

  Subnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref Subnet1
  Subnet2RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref Subnet2

Outputs:
  VpcID:
    Description: The ID of the created VPC
    Value: !Ref VPC
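
If you already have a VPC and would rather check it than create a new one, the setting can be inspected and switched on through the EC2 API. The following is only a sketch, assuming a boto3 EC2 client is passed in; the helper names dns_hostnames_enabled and enable_dns_hostnames are made up for illustration, while describe_vpc_attribute and modify_vpc_attribute are the underlying EC2 calls.

```python
def dns_hostnames_enabled(ec2_client, vpc_id):
    """Return True if the VPC has the enableDnsHostnames attribute turned on.

    ec2_client is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.describe_vpc_attribute(
        VpcId=vpc_id,
        Attribute="enableDnsHostnames",
    )
    return bool(response["EnableDnsHostnames"]["Value"])


def enable_dns_hostnames(ec2_client, vpc_id):
    """Turn on enableDnsHostnames for the VPC if it is not already on.

    Returns True if a change was made, False if the attribute was already set.
    """
    if dns_hostnames_enabled(ec2_client, vpc_id):
        return False
    ec2_client.modify_vpc_attribute(
        VpcId=vpc_id,
        EnableDnsHostnames={"Value": True},
    )
    return True
```

With boto3 installed and credentials configured, something like enable_dns_hostnames(boto3.client("ec2"), "vpc-deadc0decafebeef") should be enough; the VPC ID here is the same placeholder used above.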

Upvotes: 1

mreferre

Reputation: 6073

@mreferre here :)

Thanks for trying this out.

First and foremost, bridge is the right thing to use. Docker will do its magic to map bridge to awsvpc as part of the integration (much like it maps a compose service to an ECS service, etc.). IIRC the warning message is just because you can't use alternative drivers, so the entry is simply ignored (and since bridge is the default, you could remove that line and get the same mapping, with the bonus that you won't see the warning).

As for the psql name not being resolved, that's weird. The compose integration should build the correct wiring (yes, using CloudMap etc.) so that each service can resolve every other service by its name. In other words, the flaskapp service should be able to resolve the psql service.

This is the mechanism I use in Yelb so that the app server can connect to yelb-db (which is the hard coded endpoint in the app server code).
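
To tell a name resolution problem apart from a database connection problem, it can help to test resolution on its own from inside the app container. Below is a minimal sketch using only the Python standard library; the helper name resolve_host is hypothetical, and it assumes the DB_HOST and DB_PORT environment variables from the compose file in the question.

```python
import os
import socket


def resolve_host(name, port=5432):
    """Return the sorted unique IP addresses that name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Lookup failed: this is the symptom described in the question.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = os.environ.get("DB_HOST", "psql")
    port = int(os.environ.get("DB_PORT", "5432"))
    addresses = resolve_host(host, port)
    if addresses:
        print(f"{host} resolves to: {', '.join(addresses)}")
    else:
        print(f"{host} does not resolve from this container")
```

If this reports that the name does not resolve on ECS while the same image resolves it locally, the compose file is probably fine and the network or VPC configuration is the more likely suspect.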

I can't test your app because I don't have your app container. But am I reading it right that you can't even compose up my yelb compose file (link)? That would be weird, as that should work out of the box locally and on ECS.

[UPDATE 1] I can confirm that the problem is related to your VPC configuration (or at least it's not related to the compose syntax you are using). I cloned your repo and composed up your app (it worked fine locally). I then pushed the image into my ECR account and edited your compose file in two ways: 1) pointed it to my ECR image and 2) commented out the x-aws-vpc entry (this forces compose to use the default VPC in the account). When I composed it up in the ECS context, it deployed on AWS, and using the ALB link that was created I could use the createtable, insert and select APIs just fine.

Can you try removing the x-aws-vpc entry to confirm that it is the root of the problem? We can figure out why later.

Upvotes: 2
