I've followed DataStax's guide on best practices for using DSE with Docker, but I've run into the following bug using all of the default setup scripts and Dockerfiles provided by DataStax.
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377) ~[dse-core-5.0.3.jar:5.0.3]
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306) ~[dse-core-5.0.3.jar:5.0.3]
... 7 common frames omitted
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveConfiguration(CassandraJobConf.java:466) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.saveDseHadoopConfiguration(CassandraJobConf.java:345) ~[dse-hadoop-5.0.3.jar:5.0.3]
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:300) ~[dse-hadoop-5.0.3.jar:5.0.3]
... 11 common frames omitted
Unable to start DSE server: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
com.datastax.bdp.plugin.PluginManager$PluginActivationException: Unable to activate plugin com.datastax.bdp.ConfigurationWriterPlugin
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:327)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:259)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:169)
at com.datastax.bdp.plugin.PluginManager.preStart(PluginManager.java:77)
at com.datastax.bdp.server.DseDaemon.preStart(DseDaemon.java:490)
at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:462)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:583)
at com.datastax.bdp.DseModule.main(DseModule.java:91)
Caused by: java.lang.RuntimeException: Failed to save custom DSE Hadoop config
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:310)
at com.datastax.bdp.hadoop.mapred.CassandraJobConf.writeDseHadoopConfig(CassandraJobConf.java:174)
at com.datastax.bdp.ConfigurationWriterPlugin.onActivate(ConfigurationWriterPlugin.java:20)
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:377)
at com.datastax.bdp.plugin.PluginManager.activateDirect(PluginManager.java:306)
... 7 more
Caused by: java.io.IOException: Directory not writable: /opt/dse/resources/hadoop/conf
The error is pretty straightforward. I tried to address it by adding some additional chmod calls in the Dockerfile, to no avail. Here's the Dockerfile:
# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.
# Loosely based on docker-cassandra by the fine folk at Spotify
# -- https://github.com/spotify/docker-cassandra/
# Loosely based on cassandra-docker by the one and only Al Tobey
# -- https://github.com/tobert/cassandra-docker/
# base yourself on any ubuntu 14.04 image containing JDK8
# official Docker Java images are distributed with OpenJDK
# Datastax certifies its product releases specifically
# on the Oracle/Sun JVM, so YMMV with OpenJDK
FROM nimmis/java:oracle-8-jdk
# Avoid ERROR: invoke-rc.d: policy-rc.d denied execution of start.
RUN echo "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d
RUN export DEBIAN_FRONTEND=noninteractive && \
apt-get update && \
apt-get -y install adduser \
curl \
lsb-base \
procps \
zlib1g \
gzip \
python \
python-support \
sysstat \
ntp bash tree && \
rm -rf /var/lib/apt/lists/*
# grab gosu for easy step-down from root
RUN curl -o /bin/gosu -SkL "https://github.com/tianon/gosu/releases/download/1.4/gosu-$(dpkg --print-architecture)" \
&& chmod +x /bin/gosu
# The DSE tarball can be downloaded into the folder where the Dockerfile is:
# wget --user=$USER --password=$PASS http://downloads.datastax.com/enterprise/dse-5.0.0-bin.tar.gz
# You may want to replace dse-5.0.0-bin.tar.gz with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named dse.tar.gz (that way the Dockerfile itself remains version independent).
#
# DataStax Agent debian package can be downloaded from
# wget --user=$USER --password=$PASS http://debian.datastax.com/enterprise/pool/datastax-agent_6.0.0_all.deb
# you may want to replace the specific version with the corresponding downloaded package name. When
# downloaded, please remove the version number part of the filename (or create a symlink), so the
# resulting file is named datastax-agent_all.deb (that way the docker file itself remains version
# independent).
ADD dse.tar.gz /opt
ADD datastax-agent_all.deb /tmp
ENV DSE_HOME /opt/dse
RUN ln -s /opt/dse* $DSE_HOME
# keep data here
VOLUME /data
# and logs here
VOLUME /logs
VOLUME /opt/dse
# create a dedicated user for running DSE node
RUN groupadd -g 1337 cassandra && \
useradd -u 1337 -g cassandra -s /bin/bash -d $DSE_HOME cassandra && \
chown -R cassandra:cassandra /opt/dse*
RUN chmod -R a+rw /opt/dse/
# install the agent
RUN dpkg -i /tmp/datastax-agent_all.deb
# starting node using custom entrypoint that configures paths, interfaces, etc.
COPY scripts/dse-entrypoint /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-entrypoint
ENTRYPOINT ["/usr/local/bin/dse-entrypoint"]
# Running any other DSE/C* command should be done on behalf of the cassandra user.
# Perform that using a generic command launcher.
COPY scripts/dse-cmd-launcher /usr/local/bin/
RUN chmod +x /usr/local/bin/dse-cmd-launcher
# link dse commands to the launcher
RUN for cmd in cqlsh dsetool nodetool dse cassandra-stress; do \
ln -sf /usr/local/bin/dse-cmd-launcher /usr/local/bin/$cmd ; \
done
# the detailed list of ports
# http://docs.datastax.com/en/datastax_enterprise/5.0/datastax_enterprise/sec/secConfFirePort.html
# Cassandra
EXPOSE 7000 9042 9160
# Solr
EXPOSE 8983 8984
# Spark
EXPOSE 4040 7080 7081 7077
# Hadoop
EXPOSE 8012 50030 50060 9290
# Hive/Shark
EXPOSE 10000
# Graph
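The scripts/dse-cmd-launcher referenced above isn't reproduced here; as a rough sketch (my reconstruction, not the actual whitepaper script), such an argv[0]-dispatching launcher could look like this:
#!/bin/sh
# Hypothetical sketch of dse-cmd-launcher: the Dockerfile symlinks each DSE
# command name to this script, so dispatch on the name we were invoked as
# and run the matching tool from $DSE_HOME/bin as the cassandra user.
cmd=$(basename "$0")
exec gosu cassandra "$DSE_HOME/bin/$cmd" "$@"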
The last place that might hold an answer to this issue is the startup script used to actually launch DSE when the container starts.
#!/bin/sh
# Provided without any warranty, these files are intended
# to accompany the whitepaper about DSE on Docker and are
# not intended for production and are not actively maintained.
# Bind the various services
# These should be updated on every container start
if [ -z "${IP}" ]; then
IP=`hostname --ip-address`
fi
echo $IP > /data/ip.address
# create directories for holding the node's data, logs, etc.
create_dirs() {
local base_dir=$1;
mkdir -p $base_dir/data/commitlog
mkdir -p $base_dir/data/saved_caches
mkdir -p $base_dir/data/hints
mkdir -p $base_dir/logs
}
# tweak the cassandra config
tweak_cassandra_config() {
env="$1/cassandra-env.sh"
conf="$1/cassandra.yaml"
base_data_dir="/data"
# Set the cluster name
if [ -z "${CLUSTER_NAME}" ]; then
printf " - No cluster name provided; skipping.\n"
else
printf " - Setting up the cluster name: ${CLUSTER_NAME}\n"
regexp="s/Test Cluster/${CLUSTER_NAME}/g"
sed -i -- "$regexp" $conf
fi
# Set the commitlog directory, and various other directories
# These are done only once since the regexp matches will fail on subsequent
# runs.
printf " - Setting up directories\n"
regexp="s|/var/lib/cassandra/|$base_data_dir/|g"
sed -i -- "$regexp" $conf
regexp="s/^listen_address:.*/listen_address: ${IP}/g"
sed -i -- "$regexp" $conf
regexp="s/rpc_address:.*/rpc_address: ${IP}/g"
sed -i -- "$regexp" $conf
# seeds
if [ -z "${SEEDS}" ]; then
printf " - Using own IP address ${IP} as seed.\n";
regexp="s/seeds:.*/seeds: \"${IP}\"/g";
else
printf " - Using seeds: $SEEDS\n";
regexp="s/seeds:.*/seeds: \"${IP},${SEEDS}\"/g"
fi
sed -i -- "$regexp" $conf
# JMX
echo "JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1\"" >> $env
}
tweak_dse_in_sh() {
# point C* logs dir to the created volume
sed -i -- "s|/var/log/cassandra|/logs|g" "$1/dse.in.sh"
}
tweak_spark_config() {
sed -i -- "s|/var/lib/spark/|/data/spark/|g" "$1/spark-env.sh"
sed -i -- "s|/var/log/spark/|/logs/spark/|g" "$1/spark-env.sh"
mkdir -p /data/spark/worker
mkdir -p /data/spark/rdd
mkdir -p /logs/spark/worker
}
tweak_agent_config() {
[ -d "/var/lib/datastax-agent" ] && cat > /var/lib/datastax-agent/conf/address.yaml <<EOF
stomp_interface: ${STOMP_INTERFACE}
use_ssl: 0
local_interface: ${IP}
hosts: ["${IP}"]
cassandra_install_location: /opt/dse
cassandra_log_location: /logs
EOF
chown cassandra:cassandra /var/lib/datastax-agent/conf/address.yaml
}
setup_node() {
printf "* Setting up node...\n"
printf " + Setting up node...\n"
create_dirs
tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
tweak_dse_in_sh "$DSE_HOME/bin"
tweak_spark_config "$DSE_HOME/resources/spark/conf"
tweak_agent_config
chown -R cassandra:cassandra /data /logs /conf
# mark that we tweaked configs
touch "$DSE_HOME/tweaked_configs"
printf "Done.\n"
}
# if marker file doesn't exist, setup node
[ ! -f "$DSE_HOME/tweaked_configs" ] && setup_node
[ -f "/etc/init.d/datastax-agent" ] && /etc/init.d/datastax-agent start
exec gosu cassandra "$DSE_HOME/bin/dse" cassandra -f "$@"
And here are the command-line arguments I'm using to launch a single DSE instance via Docker:
#!/bin/bash
# Used to start a single DSE node that has both Spark and Cassandra running on it
OPSC_CONTAINER=$1
if [ -z "$OPSC_CONTAINER" ]; then
echo "usage: start_docker_cluster.sh OPSCContainerName"
echo " OPSCContainerName mandatory name of the container running OpsCenter"
exit 1
fi
[ -z "$CLUSTER_NAME" ] && CLUSTER_NAME="Test_Cluster"
STOMP_INTERFACE=`docker exec $OPSC_CONTAINER hostname -I`
docker run -p 7080:7080 -p 4040:4040 -p 7077:7077 -p 9042:9042 --link $OPSC_CONTAINER -d -e CLUSTER_NAME="$CLUSTER_NAME" -e STOMP_INTERFACE="$STOMP_INTERFACE" --name dse dse -k -t
The -k -t flags indicate that we're going to be launching both Hadoop and Spark for this container. I've dropped the -t flag and still had this configuration error occur even without it.
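To see what state the directory is actually in, the failing check can be reproduced from the host (assuming the container is named dse, as in the script above):
# inspect ownership and permissions of the directory DSE complains about
docker exec dse ls -ld /opt/dse/resources/hadoop/conf
# run the effective writability check as the cassandra user
docker exec dse gosu cassandra test -w /opt/dse/resources/hadoop/conf && echo writable || echo "not writable"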
What do I need to do to make the /opt/dse/resources/hadoop/conf directory writable so DSE can successfully boot?
Upvotes: 2
Views: 807
Doing what Max answered, i.e. adding chown -RHh cassandra:cassandra /opt/dse in the setup_node() portion of the DSE startup script (called by the Docker container on startup), worked for me, but instead of his issue I got:
Unable to activate plugin com.datastax.bdp.plugin.DseFsPlugin
(...)
java.io.IOException: Failed to create work directory: /var/lib/dsefs
So I had to change my setup_node() to this:
setup_node() {
printf "* Setting up node...\n"
printf " + Setting up node...\n"
create_dirs
tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
tweak_dse_in_sh "$DSE_HOME/bin"
tweak_spark_config "$DSE_HOME/resources/spark/conf"
tweak_agent_config
chown -R cassandra:cassandra /data /logs /conf
mkdir -p /var/lib/dsefs
chown -RHh cassandra:cassandra /opt/dse /var/lib/dsefs
# mark that we tweaked configs
touch "$DSE_HOME/tweaked_configs"
printf "Done.\n"
}
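An alternative I haven't tried (assuming DSE 5.0's dse.yaml keeps the default work_dir: /var/lib/dsefs entry under dsefs_options) would be to relocate the DSEFS work directory onto the /data volume instead of creating /var/lib/dsefs:
# hypothetical alternative: point DSEFS at the mounted /data volume,
# mirroring what tweak_cassandra_config does for the Cassandra paths
sed -i -- "s|/var/lib/dsefs|/data/dsefs|g" "$DSE_HOME/resources/dse/conf/dse.yaml"
mkdir -p /data/dsefs
chown -R cassandra:cassandra /data/dsefs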
Upvotes: 0
Adding 'chown -RHh cassandra:cassandra /opt/dse' to the entrypoint script solved my problem of not being able to write to /opt/dse/resources/hadoop/conf.
Re. ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker
Check your spark-env.sh and review your directory mappings. In my case, I have mounted two external volumes, /data and /logs, both owned by cassandra:cassandra.
# This is a base directory for Spark Worker work files.
if [ "x$SPARK_WORKER_DIR" = "x" ]; then
export SPARK_WORKER_DIR="/data/spark/worker"
fi
if [ "x$SPARK_LOCAL_DIRS" = "x" ]; then
export SPARK_LOCAL_DIRS="/data/spark/rdd"
fi
# This is a base directory for Spark Worker logs.
if [ "x$SPARK_WORKER_LOG_DIR" = "x" ]; then
export SPARK_WORKER_LOG_DIR="/logs/spark/worker"
fi
# This is a base directory for Spark Master logs.
if [ "x$SPARK_MASTER_LOG_DIR" = "x" ]; then
export SPARK_MASTER_LOG_DIR="/logs/spark/master"
fi
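If those defaults still point at /var/lib/spark, the sed calls in tweak_spark_config never ran; a minimal sketch of creating the mapped directories with the right ownership (matching what the entrypoint does) is:
# create the Spark worker/rdd/log directories on the mounted volumes
# and hand them to the user DSE runs as
mkdir -p /data/spark/worker /data/spark/rdd /logs/spark/worker /logs/spark/master
chown -R cassandra:cassandra /data/spark /logs/spark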
This video shows fully functional DataStax Enterprise running on Docker: https://vimeo.com/181393134
Upvotes: 2
I added chown -RHh cassandra:cassandra /opt/dse in the setup_node() portion of the DSE startup script (called by the Docker container on startup) and it fixed the issue. Check out chown --help for more info on those options: -R recurses, -H makes the recursive chown traverse the /opt/dse symlink named on the command line, and -h changes symlinks themselves rather than the files they point to.
NOTE: I'm now getting an
ERROR 04:15:04,789 SPARK-WORKER Logging.scala:74 - Failed to create work directory /var/lib/spark/worker
later on, but at least my fix will get you past the initial issue.
setup_node() {
printf "* Setting up node...\n"
printf " + Setting up node...\n"
create_dirs
tweak_cassandra_config "$DSE_HOME/resources/cassandra/conf"
tweak_dse_in_sh "$DSE_HOME/bin"
tweak_spark_config "$DSE_HOME/resources/spark/conf"
tweak_agent_config
tweak_dse_config "$DSE_HOME/resources/dse/conf"
chown -R cassandra:cassandra /data /logs /conf
chown -RHh cassandra:cassandra /opt/dse
# mark that we tweaked configs
touch "$DSE_HOME/tweaked_configs"
printf "Done.\n"
}
Upvotes: 1