Reputation: 6786
Detail: We have dockerized Kafka, Cassandra and Spark, using the wurstmeister/kafka, strapdata/elassandra and bde2020/spark-master images in docker-compose.
What we want to do is to connect the following using connectors:
Kafka stream to Spark stream
Spark stream to Cassandra
Kafka stream to Cassandra
The problem is that we don't know whether it works correctly, because these technologies are new to us.
Graphical Representation:
Important Files:
docker-compose.yml
version: '2'
services:
  spark:
    container_name: spark
    image: bde2020/spark-master
    ports:
      - 9180:8080
      - 9177:7077
      - 9181:8081
    links:
      - elassandra
    volumes:
      - /home/mostafa/Desktop/kafka-test/together/cassandra/mostafa-hosein:/var/lib/docker/volumes/data/python
  elassandra:
    image: strapdata/elassandra
    container_name: elassandra
    build: /home/mostafa/Desktop/kafka-test/together/cassandra
    env_file:
      - /home/mostafa/Desktop/kafka-test/together/cassandra/conf/cassandra.env
    volumes:
      - /home/mostafa/Desktop/kafka-test/together/cassandra/jarfile:/var/lib/docker/volumes/data/_data
    ports:
      - '7000:7000'
      - '7001:7001'
      - '7199:7199'
      - '9042:9042'
      - '9142:9142'
      - '9160:9160'
      - '9200:9200'
      - '9300:9300'
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    container_name: kafka
    links:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_OPTS: -javaagent:/usr/app/jmx_prometheus_javaagent.jar=7071:/usr/app/prom-jmx-agent-config.yml
      CONNECTORS: elassandra
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - elassandra
  kafka_connect-cassandra:
    image: datamountaineer/kafka-connect-cassandra
    container_name: kafka-connect-cassandra
    ports:
      - 8083:8083
      - 9102:9102
    environment:
      - connect.cassandra.contact.points=localhost
      - KAFKA_ZOOKEEPER_CONNECT = "zookeeper:2181"
      - KAFKA_ADVERTISED_LISTENERS= "kafka:9092"
      - connect.cassandra.port=9042
      - connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
      - tasks.max=1
    depends_on:
      - kafka
      - elassandra
Dockerfile
FROM wurstmeister/kafka
ADD prom-jmx-agent-config.yml /usr/app/prom-jmx-agent-config.yml
ADD jmx_prometheus_javaagent-0.10.jar /usr/app/jmx_prometheus_javaagent.jar
COPY wait-for-it.sh /wait-for-it.sh
RUN chmod +x /wait-for-it.sh
CMD ["/wait-for-it.sh", "zookeeper:2181", "--", "start-kafka.sh"]
Example: I added CONNECTORS: elassandra to the environment variables of the kafka container. I didn't get any error, but I'm not sure whether it is a valid environment variable.
How can we validate the environment variables and test that the connectors are working?
Upvotes: 1
Views: 1897
Reputation: 192043
As mentioned, CONNECTORS is not a valid variable for the Kafka container. Kafka Connect is a separate service from the broker, so it needs to run in a separate container.
Kafka Connect exposes a REST API on port 8083. You need to send HTTP requests using curl, Postman, etc. to create connectors; they cannot be loaded from environment variables alone.
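To check whether Connect is up and whether a connector is actually running, you can query the REST API. Below is a minimal Python sketch of that idea, assuming port 8083 of the Connect container is published on the Docker host (so `localhost` here means the host, not a container). `/connector-plugins`, `/connectors` and `/connectors/<name>/status` are standard Kafka Connect endpoints; the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: the Connect container's port 8083 is published on the Docker host.
CONNECT_URL = "http://localhost:8083"


def endpoint(path, base=CONNECT_URL):
    """Join the base URL and a REST path without doubling slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def get_json(path, base=CONNECT_URL, timeout=5):
    """GET a Connect REST endpoint and decode the JSON response."""
    with urllib.request.urlopen(endpoint(path, base), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def connector_is_running(status):
    """True if the connector and at least one task exist and all report RUNNING.

    `status` is the decoded body of GET /connectors/<name>/status.
    """
    if status.get("connector", {}).get("state") != "RUNNING":
        return False
    tasks = status.get("tasks", [])
    return bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)


if __name__ == "__main__":
    # Installed plugin classes; the Cassandra sink class should be listed here.
    for plugin in get_json("/connector-plugins"):
        print(plugin["class"])
    # Names of the connectors that have been created so far.
    print(get_json("/connectors"))
```

If the sink class is missing from the plugin list, the image does not have the connector installed; if a task reports FAILED, the status response normally includes a stack trace for it.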
I am not immediately aware of any specific properties needed for the Datamountaineer containers, but they are built on top of the Confluent images, and you can find all of those environment variables here: https://github.com/confluentinc/cp-docker-images/blob/5.1.2-post/examples/cp-all-in-one/docker-compose.yml#L64-L86
These are for the Kafka container, not Kafka Connect, since they start with KAFKA_:
- KAFKA_ZOOKEEPER_CONNECT = "zookeeper:2181"
- KAFKA_ADVERTISED_LISTENERS= "kafka:9092"
And these are connector properties (which would be POSTed as JSON), not environment variables:
- connect.cassandra.contact.points=localhost
- connect.cassandra.port=9042
- connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
- tasks.max=1
Also, localhost shouldn't be used anywhere in these properties; if you want the Connect container to reach Cassandra, use "connect.cassandra.contact.points": "elassandra" (the Docker service name).
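As a rough illustration of what "POSTed via JSON" looks like, here is a minimal Python sketch that wraps the four properties above in the `{"name": ..., "config": ...}` body that `POST /connectors` expects. The connector name `cassandra-sink` and the function names are hypothetical, and the base URL assumes port 8083 is published on the Docker host. A working sink will most likely need more properties (topics, keyspace, mapping) than the ones shown in the question; check the connector's documentation for those.

```python
import json
import urllib.request


def cassandra_sink_payload(name="cassandra-sink", contact_points="elassandra"):
    """Build the request body for POST /connectors.

    Only the properties from the question are included; the contact point is
    the Docker service name rather than localhost.
    """
    config = {
        "connector.class": "com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector",
        "tasks.max": "1",
        "connect.cassandra.contact.points": contact_points,
        "connect.cassandra.port": "9042",
    }
    return {"name": name, "config": config}


def create_connector(payload, base="http://localhost:8083", timeout=5):
    """POST the payload to the Connect REST API and return the decoded reply."""
    req = urllib.request.Request(
        base.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the body first so it can be reviewed (or sent with curl instead).
    print(json.dumps(cassandra_sink_payload(), indent=2))
```

Calling `create_connector(cassandra_sink_payload())` sends the request; Connect replies with an HTTP error if the configuration is rejected, which `urlopen` raises as `urllib.error.HTTPError`.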
Upvotes: 1