Reputation: 71
I'm trying to run a Spark job in cluster mode, and it throws the following exception:
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
The same job runs fine in client mode. Can anyone suggest a solution?
Upvotes: 1
Views: 6806
Reputation: 18475
This sounds like your Kerberos credentials are not getting distributed to the worker nodes.
We faced the same issue after upgrading from Spark 1.x to 2.x and found that with the newer versions, Spark itself takes care of distributing the keytab. For that, we had to provide the --principal and --keytab parameters via the spark-submit command.
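For reference, here is a minimal sketch of such a submit command (the principal, keytab path, class, and jar names are placeholders, not values from the question):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --principal user@EXAMPLE.COM \
      --keytab /path/to/user.keytab \
      --class com.example.MyApp \
      my-app.jar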
There is a nice section on Kerberos in the Spark documentation, which also covers long-running applications:
Spark supports automatically creating new tokens for these applications when running in YARN mode. Kerberos credentials need to be provided to the Spark application via the spark-submit command, using the --principal and --keytab parameters. The provided keytab will be copied over to the machine running the Application Master via the Hadoop Distributed Cache. For this reason, it’s strongly recommended that both YARN and HDFS be secured with encryption, at least. The Kerberos login will be periodically renewed using the provided credentials, and new delegation tokens for supported services will be created.
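Before submitting, it may also be worth verifying that the keytab actually contains the expected principal. A quick check with the standard MIT Kerberos tooling (the path is again a placeholder):

    klist -kt /path/to/user.keytab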
Upvotes: 1