Technology Versions used:
Hadoop 2.6.0-cdh5.13.1
SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957
Accumulo 1.8
Hadoop cluster is Kerberized.
I run a Spark application that tries to read data from an Accumulo database from within a Spark Executor (not the Spark Driver).
I pass all the configuration files required for keytab-based authentication, such as jaas.conf and core-site.xml. The login succeeds, and I can see that in the logs. However, it then errors out because UserGroupInformation.getCurrentUser().hasKerberosCredentials() returns false.
Code Snippets:
Below is how I log in and try to return the Kerberos token from within the Spark Executor (not the driver):
@Override
public AuthenticationToken getToken() throws AuthenticationException {
    UserGroupInformation.loginUserFromKeytab("principal", "keyTab");
    context.login();
    context.commit();
    return new KerberosToken();
}
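A pattern that often avoids this failure is to hold on to the UserGroupInformation returned by the login instead of relying on the process-wide static login user, and to create the token inside doAs so that getCurrentUser() resolves to the Subject that actually carries the keytab credentials. A minimal sketch (assuming the principal and keytab names from the question; it depends on hadoop-common and accumulo-core being on the classpath, and I have not run it against this cluster):

```java
// Sketch only: requires hadoop-common and accumulo-core at runtime.
import java.security.PrivilegedExceptionAction;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.hadoop.security.UserGroupInformation;

public class ExecutorLogin {
    public AuthenticationToken getToken() throws Exception {
        // Keep an explicit handle on the logged-in UGI rather than
        // depending on the static login user, which on a YARN executor
        // may already be set to the container's own login.
        UserGroupInformation ugi =
                UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                        "[email protected]", "xyz.keytab");

        // Construct the token inside doAs so getCurrentUser() returns
        // this UGI, whose Subject does hold the Kerberos credentials.
        return ugi.doAs((PrivilegedExceptionAction<AuthenticationToken>)
                () -> new KerberosToken("[email protected]"));
    }
}
```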
Code snippet from org.apache.accumulo.core.client.security.tokens.KerberosToken:
public KerberosToken() throws IOException {
    this(UserGroupInformation.getCurrentUser().getUserName());
}

public KerberosToken(String principal) throws IOException {
    requireNonNull(principal);
    final UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    checkArgument(ugi.hasKerberosCredentials(), "Subject is not logged in via Kerberos");
    checkArgument(principal.equals(ugi.getUserName()), "Provided principal does not match currently logged-in user");
    this.principal = ugi.getUserName();
}
The ugi.hasKerberosCredentials() call in the above constructor returns false, so it fails with the "Subject is not logged in via Kerberos" exception message.
Below is the relevant code from org.apache.hadoop.security.UserGroupInformation:
private UserGroupInformation(Subject subject, boolean isLoginExternal) {
    this.subject = subject;
    this.user = (User) subject.getPrincipals(User.class).iterator().next();
    this.isKeytab = !subject.getPrivateCredentials(KeyTab.class).isEmpty();
    this.isKrbTkt = !subject.getPrivateCredentials(KerberosTicket.class).isEmpty();
    this.isLoginExternal = isLoginExternal;
}

public boolean hasKerberosCredentials() {
    return this.isKeytab || this.isKrbTkt;
}
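The check only inspects the Subject's private credentials. A self-contained sketch of the same logic, using only the JDK's JAAS classes (the class name is mine), shows why a Subject with no KeyTab or KerberosTicket credentials, such as one produced by a login that happened in a different context, fails the check:

```java
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosTicket;
import javax.security.auth.kerberos.KeyTab;

public class HasKerbCredsCheck {
    // Same condition UserGroupInformation.hasKerberosCredentials() evaluates.
    static boolean hasKerberosCredentials(Subject subject) {
        boolean isKeytab = !subject.getPrivateCredentials(KeyTab.class).isEmpty();
        boolean isKrbTkt = !subject.getPrivateCredentials(KerberosTicket.class).isEmpty();
        return isKeytab || isKrbTkt;
    }

    public static void main(String[] args) {
        // A fresh Subject carries neither a KeyTab nor a KerberosTicket,
        // so the check fails -- the situation observed in the executor.
        System.out.println(hasKerberosCredentials(new Subject())); // prints: false
    }
}
```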
jaas.conf
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    doNotPrompt=true
    useTicketCache=false
    keyTab="xyz.keytab"
    principal="[email protected]"
    debug=true;
};
How do I submit the Spark application?
I launch a python3.4 shell and set the PYTHONSTARTUP environment variable to point to the script below (pyspark-init.py):
conf = SparkConf() \
    .setAppName("appname") \
    .setMaster("yarn") \
    .set("spark.submit.deployMode", "client") \
    .set("spark.driver.extraClassPath", "core-site.xml,hdfs-site.xml,app.properties") \
    .set("spark.driver.extraJavaOptions", "-Djava.security.krb5.conf=krb5.conf -Djava.security.auth.login.config=jaas.conf") \
    .set("spark.jars", "/opt/cloudera/parcels/SPARK2/lib/spark2/jars/app-spark-0.1.jar") \
    .set("spark.yarn.dist.files", "app.properties,core-site.xml,hdfs-site.xml,krb5.conf,jaas.conf,client.conf,xyz.keytab,log4j.properties") \
    .set("spark.yarn.keytab", "xyzExec.keytab") \
    .set("spark.yarn.principal", "[email protected]") \
    .set("spark.executor.extraJavaOptions", "-Djava.security.krb5.conf=krb5.conf -Djava.security.auth.login.config=jaas.conf") \
    .set("spark.executor.extraClassPath", "app.properties,core-site.xml,hdfs-site.xml,krb5.conf,jaas.conf,client.conf,xyz.keytab")
sqlCtx = SQLContext(sc)
After that, I issue the commands below:
df = sqlCtx.read.format("...").load()
df.show()
It fails at the df.show() command, throwing the exception below:
Caused by: org.apache.accumulo.core.client.AccumuloException: java.lang.IllegalArgumentException: Subject is not logged in via Kerberos
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.accumulo.core.client.security.tokens.KerberosToken.<init>(KerberosToken.java:56)
at org.apache.accumulo.core.client.security.tokens.KerberosToken.<init>(KerberosToken.java:110)
Can you please guide me on where I am going wrong?