Reputation: 15
I am trying to ship nginx logs into ClickHouse through Kafka, using Filebeat as the producer. My configs are below.
Filebeat config:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log

output.kafka:
  # Emit only the timestamp and message fields; otherwise Filebeat
  # publishes each event to Kafka as a full JSON document
  codec.format:
    string: '%{[@timestamp]} %{[message]}'
  # Kafka brokers
  hosts: ["kafka:9092"]
  # publish to the 'myfirst' topic
  topic: 'myfirst'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
The logs arrive in the Kafka topic and everything looks fine, except that each record is written to the topic like this:
2021-01-01T21:51:25.225Z {"remote_addr": "192.168.222.1","remote_user": "-","time_local": "01/Jan/2021:21:51:17 +0000","request": "GET / HTTP/1.1","status": "304","body_bytes_sent": "0","http_referer": "-","http_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"}
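Note that each record carries the timestamp in front of the JSON object, so the line as a whole is not a JSON document. The Python sketch below is only an illustration: the helper names `is_json` and `split_record` are hypothetical, and the sample record is abbreviated to two fields. It shows how to check this and how to separate the two parts. Whether the prefix matters for the ClickHouse table depends on its format setting, which is not shown here.

```python
import json

# Abbreviated sample of a record as it appears in the topic.
record = '2021-01-01T21:51:25.225Z {"remote_addr": "192.168.222.1", "status": "304"}'


def is_json(line):
    """Return True if the whole line parses as a single JSON document."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


def split_record(line):
    """Split 'TIMESTAMP {json}' into (timestamp, parsed JSON object)."""
    timestamp, _, payload = line.partition(" ")
    return timestamp, json.loads(payload)


whole_line_ok = is_json(record)
timestamp, fields = split_record(record)

print("whole line is JSON:", whole_line_ok)
print(timestamp, fields["remote_addr"], fields["status"])
```

Dropping `%{[@timestamp]}` from the `codec.format` string would make Filebeat publish only the JSON body, which is the shape used in the reproduction in the answer.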
Then I created the ClickHouse tables and a MATERIALIZED VIEW:
CREATE TABLE accesslog (
...
) ENGINE = Kafka SETTINGS kafka_broker_list = 'kafka:9092',
But the query output in ClickHouse looks like this, with no data. Why?
┌─remote_addr─┬─remote_user─┬─time_local─┬───────date─┬─request─┬─status─┬─body_bytes_sent─┬─http_referer─┬─http_user_agent─┐
│ │ │ │ 0000-00-00 │ │ 0 │ 0 │ │ │
│ │ │ │ 0000-00-00 │ │ 0 │ 0 │ │ │
│ │ │ │ 0000-00-00 │ │ 0 │ 0 │ │ │
└─────────────┴─────────────┴────────────┴────────────┴─────────┴────────┴─────────────────┴──────────────┴─────────────────┘
Upvotes: 1
Views: 1620
Reputation: 15226
It looks like the issue is the wrong Kafka broker address. Use the internal address kafka:19092 instead of the external kafka:9092:
CREATE TABLE accesslog (
..
) ENGINE = Kafka SETTINGS kafka_broker_list = 'kafka:19092', ..
Steps to reproduce:
Kafka-side:
# run shell in Kafka container
docker exec -it kafka bash
# create topic
kafka-topics --create --topic myfirst --partitions 1 --replication-factor 1 --bootstrap-server kafka:19092
# check topic
# kafka-topics --describe --topic myfirst --bootstrap-server kafka:19092
# add events to the topic
kafka-console-producer --topic myfirst --broker-list kafka:19092
# event body: {"remote_addr": "192.168.222.1","remote_user": "-","time_local": "01/Jan/2021:21:51:17 +0000","request": "GET / HTTP/1.1","status": "304","body_bytes_sent": "0","http_referer": "-","http_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"}
..
ClickHouse-side:
SELECT *
FROM accesslog
/*
┌─remote_addr───┬─remote_user─┬─time_local─────────────────┬─request────────┬─status─┬─body_bytes_sent─┬─http_referer─┬─http_user_agent────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 192.168.222.1 │ - │ 01/Jan/2021:21:51:17 +0000 │ GET / HTTP/1.1 │ 304 │ 0 │ - │ Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 │
..
*/
Excerpt from docker-compose.yml:
..
kafka:
  image: confluentinc/cp-kafka:5.2.2
  container_name: kafka
  restart: unless-stopped
  hostname: kafka
  depends_on:
    - zookeeper
  environment:
    KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER_INTERNAL://kafka:19092,LISTENER_DOCKER_EXTERNAL://${DOCKER_HOST_IP:-x.x.x.x}:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER_INTERNAL:PLAINTEXT,LISTENER_DOCKER_EXTERNAL:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER_INTERNAL
    KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
    KAFKA_BROKER_ID: 1
    KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO"
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  ports:
    - 9092:9092
  networks:
    - net1
..
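Why the port matters: a Kafka client uses the bootstrap address only for its first connection, then reconnects to whatever address the broker advertises for the listener it reached. The Python sketch below lists what each listener in this excerpt advertises. The function name `parse_advertised_listeners` is hypothetical, and the `${DOCKER_HOST_IP:-x.x.x.x}` substitution is replaced by its placeholder default.

```python
def parse_advertised_listeners(value):
    """Map each listener name to the (host, port) it advertises to clients."""
    listeners = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners[name] = (host, int(port))
    return listeners


# Value of KAFKA_ADVERTISED_LISTENERS from the compose excerpt;
# "x.x.x.x" stands in for the unresolved DOCKER_HOST_IP (an assumption).
advertised = parse_advertised_listeners(
    "LISTENER_DOCKER_INTERNAL://kafka:19092,"
    "LISTENER_DOCKER_EXTERNAL://x.x.x.x:9092"
)

for name, (host, port) in advertised.items():
    print(f"{name}: clients are told to connect to {host}:{port}")
```

A client inside the Docker network that bootstraps on port 9092 lands on the external listener and is then redirected to the host IP, which may not be reachable from another container. Bootstrapping on 19092 keeps it on kafka:19092.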
Upvotes: 2