Technology Versions used:
Hadoop 2.6.0-cdh5.13.1
SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957
Accumulo 1.8
Hadoop cluster is Kerberized.
I run a Spark application that tries to read data from an Accumulo database from within a Spark Executor (not the Spark Driver).
I pass all the configuration files required for keytab-based authentication, such as jaas.conf and core-site.xml. The login succeeds, and I can see that in the logs. However, it then errors out because UserGroupInformation.getCurrentUser().hasKerberosCredentials() returns false.
Code Snippets:
Below is how I log in and try to return the Kerberos token from within the Spark Executor (not the driver):
@Override
public AuthenticationToken getToken() throws AuthenticationException {
    UserGroupInformation.loginUserFromKeytab("principal", "keyTab");
    context.login();
    context.commit();
    return new KerberosToken();
}
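A pattern that often avoids this failure is to hold on to the UserGroupInformation returned by the login instead of relying on the process-wide static login user, and to create the token inside doAs so that getCurrentUser() resolves to the Subject that actually carries the keytab credentials. A minimal sketch (assuming the principal and keytab names from the question; it depends on hadoop-common and accumulo-core being on the classpath, and I have not run it against this cluster):

```java
// Sketch only: requires hadoop-common and accumulo-core at runtime.
import java.security.PrivilegedExceptionAction;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.hadoop.security.UserGroupInformation;

public class ExecutorLogin {
    public AuthenticationToken getToken() throws Exception {
        // Keep an explicit handle on the logged-in UGI rather than
        // depending on the static login user, which on a YARN executor
        // may already be set to the container's own login.
        UserGroupInformation ugi =
                UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                        "[email protected]", "xyz.keytab");

        // Construct the token inside doAs so getCurrentUser() returns
        // this UGI, whose Subject does hold the Kerberos credentials.
        return ugi.doAs((PrivilegedExceptionAction<AuthenticationToken>)
                () -> new KerberosToken("[email protected]"));
    }
}
```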
Code snippet from org.apache.accumulo.core.client.security.tokens.KerberosToken:
public KerberosToken() throws IOException {
    this(UserGroupInformation.getCurrentUser().getUserName());
}

public KerberosToken(String principal) throws IOException {
    requireNonNull(principal);
    final UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    checkArgument(ugi.hasKerberosCredentials(), "Subject is not logged in via Kerberos");
    checkArgument(principal.equals(ugi.getUserName()), "Provided principal does not match currently logged-in user");
    this.principal = ugi.getUserName();
}
The ugi.hasKerberosCredentials() call in the above constructor returns false, so it fails with the "Subject is not logged in via Kerberos" exception message.
Below is the relevant code from org.apache.hadoop.security.UserGroupInformation:
private UserGroupInformation(Subject subject, boolean isLoginExternal) {
    this.subject = subject;
    this.user = (User) subject.getPrincipals(User.class).iterator().next();
    this.isKeytab = !subject.getPrivateCredentials(KeyTab.class).isEmpty();
    this.isKrbTkt = !subject.getPrivateCredentials(KerberosTicket.class).isEmpty();
    this.isLoginExternal = isLoginExternal;
}

public boolean hasKerberosCredentials() {
    return this.isKeytab || this.isKrbTkt;
}
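The check only inspects the Subject's private credentials. A self-contained sketch of the same logic, using only the JDK's JAAS classes (the class name is mine), shows why a Subject with no KeyTab or KerberosTicket credentials, such as one produced by a login that happened in a different context, fails the check:

```java
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosTicket;
import javax.security.auth.kerberos.KeyTab;

public class HasKerbCredsCheck {
    // Same condition UserGroupInformation.hasKerberosCredentials() evaluates.
    static boolean hasKerberosCredentials(Subject subject) {
        boolean isKeytab = !subject.getPrivateCredentials(KeyTab.class).isEmpty();
        boolean isKrbTkt = !subject.getPrivateCredentials(KerberosTicket.class).isEmpty();
        return isKeytab || isKrbTkt;
    }

    public static void main(String[] args) {
        // A fresh Subject carries neither a KeyTab nor a KerberosTicket,
        // so the check fails -- the situation observed in the executor.
        System.out.println(hasKerberosCredentials(new Subject())); // prints: false
    }
}
```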
jaas.conf
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    doNotPrompt=true
    useTicketCache=false
    keyTab="xyz.keytab"
    principal="[email protected]"
    debug=true;
};
How do I submit the Spark application?
I launch a python3.4 shell and set the PYTHONSTARTUP environment variable to point to the script below (pyspark-init.py):
conf = SparkConf() \
    .setAppName("appname") \
    .setMaster("yarn") \
    .set("spark.submit.deployMode", "client") \
    .set("spark.driver.extraClassPath", "core-site.xml,hdfs-site.xml,app.properties") \
    .set("spark.driver.extraJavaOptions", "-Djava.security.krb5.conf=krb5.conf -Djava.security.auth.login.config=jaas.conf") \
    .set("spark.jars", "/opt/cloudera/parcels/SPARK2/lib/spark2/jars/app-spark-0.1.jar") \
    .set("spark.yarn.dist.files", "app.properties,core-site.xml,hdfs-site.xml,krb5.conf,jaas.conf,client.conf,xyz.keytab,log4j.properties") \
    .set("spark.yarn.keytab", "xyzExec.keytab") \
    .set("spark.yarn.principal", "[email protected]") \
    .set("spark.executor.extraJavaOptions", "-Djava.security.krb5.conf=krb5.conf -Djava.security.auth.login.config=jaas.conf") \
    .set("spark.executor.extraClassPath", "app.properties,core-site.xml,hdfs-site.xml,krb5.conf,jaas.conf,client.conf,xyz.keytab")
sqlCtx = SQLContext(sc)
After that, I issue the commands below:
df = sqlCtx.read.format("...").load()
df.show()
It fails at the df.show() command, throwing the exception below:
Caused by: org.apache.accumulo.core.client.AccumuloException: java.lang.IllegalArgumentException: Subject is not logged in via Kerberos
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.accumulo.core.client.security.tokens.KerberosToken.<init>(KerberosToken.java:56)
at org.apache.accumulo.core.client.security.tokens.KerberosToken.<init>(KerberosToken.java:110)
Can you please guide me on where I am going wrong?