MasterOdin

Reputation: 7886

grep over docker compose logs returns 255 exit code unexpectedly in until loop on CircleCI

For my application, I am using Kafka, and I want to make sure it is ready before running the application's e2e tests in our CI test suite. To accomplish this, I use the bash until keyword and look in the logs for the line that indicates readiness, started (kafka.server.KafkaServer). The full setup is:

until docker compose logs kafka 2>/dev/null | grep -q "started (kafka.server.KafkaServer)"; do
  sleep 1
done

I am finding that if this loop starts before Kafka has fully started, the until loops as expected until Kafka is ready. However, if the loop starts after Kafka has already started, it never finishes, and in my tests the loop condition consistently returns exit code 255. Yet if I add the same grep line inside the loop body, the line is consistently found and I get $? = 0, so I'm not sure where the problem lies.

For reference, here are the full CircleCI steps I'm running that hit the problem:

      - run:
          name: 'Start background services'
          working_directory: services
          background: true
          command: docker-compose up
      - run:
          name: 'Build images for app'
          working_directory: app
          command: docker-compose build
      - run:
          name: 'Wait for services'
          working_directory: services
          command: |
            until docker compose logs kafka | grep -q "started (kafka.server.KafkaServer)"; do
              sleep 1
            done
            echo "Kafka is ready"

The until loop never finishes if the build images step takes >= 1min, while it works fine if the images are fully cached.

Upvotes: 0

Views: 270

Answers (3)

MasterOdin

Reputation: 7886

I ended up with the following code to accomplish waiting for the kafka service in CircleCI:

          command: |
            timeout=120
            counter=0
            until docker compose logs kafka 2>/dev/null | grep "started (kafka.server.KafkaServer)" > /dev/null; do
              counter=$((counter+1))
              if [ ${counter} -gt ${timeout} ]; then
                echo "Kafka not ready after ${timeout} seconds"
                exit 1
              fi
              sleep 1
            done
            echo "Kafka is ready - ${counter}"

I have no idea why grep -q would always fail. I just assumed it was something wonky with the CircleCI runner and didn't dig any deeper.
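The same wait-with-timeout idea can be sketched as a reusable function. This is only an illustration: wait_for_line is a hypothetical helper name, not something from the original setup. It captures the log output into a variable before matching, so no pipe is involved and the result does not depend on how the log command behaves when its reader exits early, or on shell options such as pipefail.

```shell
#!/usr/bin/env bash
# wait_for_line PATTERN TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND about once per second until its output
# contains PATTERN (matched as a literal string), or gives up after
# TIMEOUT_SECONDS attempts.
wait_for_line() {
  local pattern="$1" timeout="$2" counter=0 output
  shift 2
  while true; do
    # Capture everything COMMAND prints; its exit status is ignored on purpose.
    output=$("$@" 2>/dev/null) || true
    # Literal substring match without a pipeline.
    case "$output" in
      *"$pattern"*) return 0 ;;
    esac
    counter=$((counter + 1))
    if [ "$counter" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
}
```

In the CircleCI step this could be called as wait_for_line "started (kafka.server.KafkaServer)" 120 docker compose logs kafka || { echo "Kafka not ready"; exit 1; }.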

Upvotes: 0

OneCricketeer

Reputation: 191671

You can add a healthcheck in Compose; you don't need to grep logs.

Example that creates a topic when the broker is healthy:

  kafka:
    image: &kafka-image bitnami/kafka:3.4.1
    restart: unless-stopped
    ports:
      - "29092:29092"
    environment:
      BITNAMI_DEBUG: 'yes'
      ALLOW_PLAINTEXT_LISTENER: 'yes'
      KAFKA_ENABLE_KRAFT: 'yes'
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CFG_NODE_ID: 1
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_CFG_DELETE_TOPIC_ENABLE: 'true'
      KAFKA_CFG_LOG_RETENTION_HOURS: 72  # 3 days of retention for local testing
      # https://rmoff.net/2018/08/02/kafka-listeners-explained/
      KAFKA_CFG_LISTENERS: INTERNAL://:9092,CONTROLLER://:9093,EXTERNAL://0.0.0.0:29092
      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_CFG_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL://127.0.0.1:29092
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
    healthcheck:
      test: ["CMD", "kafka-topics.sh", "--bootstrap-server=localhost:9092", "--list"]
      start_period: 15s
      interval: 30s

  init-kafka:
    image: *kafka-image
    working_dir: /opt/bitnami/kafka/bin
    entrypoint: /bin/bash
    depends_on:
      kafka:
        condition: service_healthy
    command: |
      kafka-topics.sh --create --if-not-exists --topic foo --replication-factor=1 --partitions=3 --bootstrap-server kafka:9092

Feel free to change the init service to your test container.
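If the tests run from a CI step rather than from a dependent Compose service, the healthcheck status can also be polled from the shell. A minimal sketch, assuming the service is named kafka and defines a healthcheck as above; wait_healthy is a hypothetical helper name, and the timeout is counted in one-second polls:

```shell
#!/usr/bin/env bash
# wait_healthy SERVICE TIMEOUT_SECONDS
# Hypothetical helper: polls the container's health status once per second
# until it reports "healthy", or gives up after TIMEOUT_SECONDS polls.
wait_healthy() {
  local service="$1" timeout="$2" counter=0 container status
  while [ "$counter" -lt "$timeout" ]; do
    # Resolve the service name to a container ID (empty if not created yet).
    container=$(docker compose ps -q "$service" 2>/dev/null) || true
    if [ -n "$container" ]; then
      # Health status is one of: starting, healthy, unhealthy.
      status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || true
      if [ "$status" = "healthy" ]; then
        return 0
      fi
    fi
    counter=$((counter + 1))
    sleep 1
  done
  return 1
}
```

Usage would look like wait_healthy kafka 120 || exit 1. Recent Compose v2 releases also have docker compose up --wait, which may make a manual loop unnecessary; check that your installed version supports it.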

Upvotes: 0

Johanna

Reputation: 11

-q tells grep to print nothing on standard output, to exit with a success status as soon as one line matching the pattern is found, and to exit with failure if no matching line is found in any of the inputs.
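One possible explanation, which this thread does not confirm: because grep -q exits at the first match, the command feeding it may still be writing and can fail once the pipe closes. CircleCI's documented default shell is /bin/bash -eo pipefail, and under pipefail a pipeline reports the failure of any stage, so the until condition would stay non-zero even though grep matched. The sketch below reproduces the effect with yes as a stand-in for docker compose logs (an assumption; the exit code Docker returns may differ, such as the 255 reported in the question).

```shell
#!/usr/bin/env bash
# Stand-in producer (assumption): prints the matching line forever,
# like a log stream that is still being written when grep exits.
producer() { yes "started (kafka.server.KafkaServer)"; }

# Without pipefail, the pipeline status is grep's status.
set +o pipefail
status_default=0
producer | grep -q "started (kafka.server.KafkaServer)" || status_default=$?

# With pipefail, the producer's failure (killed by SIGPIPE or a write error
# after grep exits) becomes the pipeline status.
set -o pipefail
status_pipefail=0
producer | grep -q "started (kafka.server.KafkaServer)" || status_pipefail=$?
set +o pipefail

echo "without pipefail: ${status_default}"
echo "with pipefail: ${status_pipefail}"
```

If this is the cause, it would also fit the timing in the question: a short log may be fully written before grep exits, while a longer log is still streaming when the match is found. Redirecting grep's output to /dev/null instead of using -q makes grep read to the end of its input, so the producer finishes normally.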

Upvotes: 1
