Reputation: 454
I'm setting up functional tests for applications running with Spark Streaming and Kafka. The steps to be done are
What is the professional way to do this, other than a simple bash script?
I think this is a fairly general question, not strictly related to Spark Streaming and Kafka. Maybe there are testing frameworks that support setting up the environment, running multiple processes in parallel, and data validation/assertions.
Upvotes: 0
Views: 969
Reputation: 2216
Consider using the Citrus test framework (http://citrusframework.org/), which could be the all-in-one test framework for you.
Also consider using the Fabric8 Docker Maven plugin (https://github.com/fabric8io/docker-maven-plugin) to set up the Docker test environment before the Citrus tests are executed within the same build run.
Here is an example for both tools working together for automated integration testing: https://github.com/christophd/citrus-samples/tree/master/sample-docker
Upvotes: 1
Reputation: 110
Maybe there are some testing frameworks which support setting up the environment, running multiple processes in parallel and data validation/assertions.
Unfortunately, there is no all-in-one framework out there.
The one-line answer would be: use docker-compose with the simplest unit-testing or Gherkin-based framework of your choice.
Considering the steps above as:
1. Start the env
2. Generate Kafka messages / Validate
3. Shut down the env
Docker Compose would be the best choice for steps #1 and #3.
version: '2'
services:
  kafka:
    # this container already has zookeeper built in
    image: spotify/kafka
    ports:
      - "2181:2181"
      - "9092:9092"
  # it's just a mock Spark container; you'll have to replace it with
  # a Docker container that can host your Spark app
  spark:
    image: epahomov/docker-spark:lightweighted
    depends_on:
      - kafka
The idea of the compose file is that you can start your env with one command:
docker-compose up
And the environment setup will be pretty much portable across dev machines and build servers.
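One thing to keep in mind: `depends_on` only controls start order, so the containers being up does not mean Kafka is already accepting connections. A minimal sketch of a readiness check in Python, assuming the ports from the compose file above are published on a host you can reach (`localhost` here is an assumption; with Docker Machine it would be the VM address). The name `wait_for_port` is made up for illustration, and an open port is only a coarse signal, not proof that the broker is fully initialised.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Hypothetical usage before the tests start (host name is an assumption):
#
#     assert wait_for_port("localhost", 2181), "ZooKeeper did not come up"
#     assert wait_for_port("localhost", 9092), "Kafka did not come up"
```

Calling this right after `docker-compose up -d` keeps the first test from failing just because the broker was still starting.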
For step #2, any test framework will do.
The scenario would look like:
Talking about frameworks:
Scala: ScalaTest. It offers a good range of async assertions and parallel execution.
Python: Behave (be careful with multiprocessing there) or a unit-testing framework such as pytest.
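Since a streaming job produces its output some time after the input is sent, step #2 usually needs an assertion that retries for a while instead of checking once. A minimal sketch of such a polling helper for the Python side, similar in spirit to the async assertions mentioned for Scala. The name `eventually` and the `read_output_records` function in the usage comment are hypothetical; the latter stands in for whatever reads the results your Spark app writes.

```python
import time


def eventually(condition, timeout=10.0, interval=0.2):
    """Poll `condition` until it returns a truthy value; raise AssertionError on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise AssertionError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Hypothetical usage inside a pytest test:
#
#     def test_job_emits_expected_records():
#         records = eventually(lambda: read_output_records(), timeout=60)
#         assert len(records) == 3
```

The helper returns the truthy value it polled, so the test can make further assertions on the data without reading it a second time.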
Do not let the name "unit-testing framework" confuse you. The test environment, not the tool, determines whether a test is a unit, module, system, or integration test.
If someone uses a unit-test framework and writes in it
MyZookeeperConnect("192.168.99.100:2181")
it's not a unit test anymore, and even a unit-test framework can't help with that :)
To glue steps #1, #2, and #3 together, a simple bash script would be my choice.
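If the glue ever outgrows bash, the same three steps could also be wrapped in the test code itself. This is only a sketch of that alternative, not a replacement for the bash approach: it assumes the `docker-compose` CLI is on the PATH and that the compose file is named `docker-compose.yml`. The name `compose_environment` is made up, and the `run` parameter exists only so the commands can be swapped out in a dry run.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def compose_environment(compose_file="docker-compose.yml", run=subprocess.run):
    """Start the compose environment, yield to the tests, then always tear it down."""
    base = ["docker-compose", "-f", compose_file]
    run(base + ["up", "-d"], check=True)   # step #1: start containers in the background
    try:
        yield                               # step #2: the tests run inside the with-block
    finally:
        run(base + ["down"], check=True)    # step #3: shut down, even if a test failed


# Hypothetical usage:
#
#     with compose_environment():
#         run_my_tests()
```

The `finally` clause is the main advantage over a naive script: the environment is shut down even when a test raises.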
Upvotes: 1