I am trying to install krb5-user and sssd-krb5 via an init script on Databricks all-purpose compute running the 15.4 LTS runtime (includes Apache Spark 3.5.0, Scala 2.12).
The cluster fails to start, and this is the error I get:
Spark startup failure: Spark was not able to start in time. This issue can be caused by a malfunctioning Hive metastore, invalid Spark configurations, or malfunctioning init scripts. Please refer to the Spark driver logs to troubleshoot this issue, and contact Databricks if the problem persists.
Internal error message: Spark failed to start: INTERNAL_ERROR: Starting worker failed. Failed to run start slave command in container. command: bash ${DB_HOME:-/home/ubuntu/databricks}/spark/scripts/start_spark_slave.sh
10.139.64.11 7077 10.139.64.10 40000 4 stdout: stderr: lxc-attach: 0117-102328-c55l7skk_1e5bc43bbceb40c98680e6fccbe6f304: attach.c: get_attach_context: 405 Connection refused - Failed to get init pid lxc-attach: 0117-102328-c55l7skk_1e5bc43bbceb40c98680e6fccbe6f304: attach.c: lxc_attach: 1469 Connection refused - Failed to get attach context
The Spark driver logs do not show anything meaningful:
appcds_setup elapsed time: 0.000
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
chown: invalid group: ‘:spark-users’
Fri Jan 31 18:34:47 2025 Connection to spark from PID 1376
Fri Jan 31 18:34:47 2025 Initialized gateway on port 44977
Fri Jan 31 18:34:47 2025 Connected to spark.
The init script logs do not contain any errors, but the installation of the Kerberos libraries seems to simply stop. Here are the last few lines:
Setting up systemd (256.5-2ubuntu3) ...
Installing new version of config file /etc/systemd/journald.conf ...
Installing new version of config file /etc/systemd/logind.conf ...
Installing new version of config file /etc/systemd/networkd.conf ...
Installing new version of config file /etc/systemd/pstore.conf ...
Installing new version of config file /etc/systemd/sleep.conf ...
Installing new version of config file /etc/systemd/system.conf ...
Installing new version of config file /etc/systemd/user.conf ...
/usr/lib/tmpfiles.d/legacy.conf:13: Duplicate line for path "/run/lock", ignoring.
Created symlink '/run/systemd/system/tmp.mount' → '/dev/null'.
/usr/lib/tmpfiles.d/legacy.conf:13: Duplicate line for path "/run/lock", ignoring.
Removing obsolete conffile /etc/systemd/resolved.conf ...
And here is my init script:
#!/bin/bash
export DEBIAN_FRONTEND=noninteractive
echo "deb http://cz.archive.ubuntu.com/ubuntu oracular main universe" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get -y install krb5-user
sudo apt-get -y install sssd-krb5
cp /Volumes/main/default/configuration_volume/jaas.config .
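One thing I have not ruled out: the script appends an oracular (Ubuntu 24.10) source, and the log above shows systemd 256.5-2ubuntu3 being set up, so the install may be pulling a newer systemd into the container. That is an assumption on my part, not something the logs confirm. A minimal sketch for checking which release suites apt is configured with; list_suites is a hypothetical helper name, and it only understands the one-line sources.list format:

```shell
#!/bin/bash
# Sketch: print the distinct suite names (release codenames) referenced
# by one-line-format apt source files, to spot mixed Ubuntu releases.
# list_suites is a hypothetical helper, not part of apt.
list_suites() {
  awk '
    $1 == "deb" || $1 == "deb-src" {
      i = 2
      # Skip an optional [option=value ...] block after the type field.
      if ($i ~ /^\[/) { while (i <= NF && $i !~ /\]$/) i++; i++ }
      # Field i is now the URI; the suite is the field after it.
      if (i + 1 <= NF) print $(i + 1)
    }
  ' "$@" | sort -u
}

# Inspect the file the init script appends to, if it is readable.
if [ -r /etc/apt/sources.list ]; then
  list_suites /etc/apt/sources.list
fi
```

If this prints more than one release codename after the init script has run, the two installs above are resolving dependencies across releases, which would be consistent with the systemd upgrade in the log.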
Is there a way to install the Kerberos client libraries, or is there some other option for running Spark streaming jobs that consume data from a Kerberized Kafka cluster?
Thanks.