Reputation: 1768
I would like to create a dockerfile that builds a Cassandra image with a keyspace and schema already there when the image starts.
In general, how do you create a Dockerfile that will build an image that includes some step(s) that can't really be done until the container is running, at least the first time?
Right now, I have two steps: build the cassandra image from an existing cassandra Dockerfile that maps a volume with the CQL schema files into a temporary directory, and then run docker exec with cqlsh to import the schema after the image has been started as a container.
But that doesn't create an image with the schema - just a container. That container could be saved as an image, but that's cumbersome.
docker run --name $CASSANDRA_NAME -d \
-h $CASSANDRA_NAME \
-v $CASSANDRA_DATA_DIR:/data \
-v $CASSANDRA_DIR/target:/tmp/schema \
tobert/cassandra:2.1.7
then
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/create_keyspace.cql
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/schema01.cql
# etc
This works, but it makes it impossible to use with tools like Docker compose since linked containers/services will start up too and expect the schema to be in place.
I saw one attempt where the cassandra process as attempted to be started in the background in the Dockerfile during build, then cqlsh run, but I don't think that worked too well.
Upvotes: 8
Views: 9295
Reputation: 6559
Another approach used by our team is create schema on server init. Our java code test if exist the SCHEMA, if not (new environment, new deployment) create it.
Same for every new TABLE, automatic CREATE TABLE creates required new tables for new data entities when they run in any new cluster (other developer local, preproduction, production).
All this code is isolated inside our DataDriver classes for portability, in case we change Cassandra for another DB in some client or project.
This prevent a lot of hassle both for admins and for developers. This approach is even valid for initial data loading, we use on tests.
Upvotes: 0
Reputation: 129
Make a docker file Dockerfile_CAS:
FROM cassandra:latest
COPY ddl.cql docker-entrypoint-initdb.d/
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN ls -la *.sh; chmod +x *.sh; ls -la *.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["cassandra", "-f"]
edit docker-entrypoint.sh, add
for f in docker-entrypoint-initdb.d/*; do case "$f" in *.sh) echo "$0: running $f"; . "$f" ;; *.cql) echo "$0: running $f" && until cqlsh -f "$f"; do >&2 echo "Cassandra is unavailable - sleeping"; sleep 2; done & ;; *) echo "$0: ignoring $f" ;; esac echo done
above exec "$@"
docker build -t suraj1287/cassandra -f Dockerfile_CAS .
and rebuild the image...
Upvotes: 1
Reputation: 8812
Ok I had this issue and someone advised me some strategy to deal with:
create a shell script that will be used as the last command to start Cassandra
a. start cassandra with $CASSANDRA_HOME/bin/cassandra
b. IF there is a $CASSANDRA_HOME/data/data/your_keyspace-xxxx folder and it's not empty, do nothing more
c. Else
1. sleep some time to allow the server to listen on port 9042
2. when port 9042 is listening, execute the .cql script to load csv files
I found this procedure rather cumbersome but there seems to be no other way around. For Cassandra hands-on lab, I found it easier to create a VM image using Vagrant and Ansible.
Upvotes: 6