Isac Casapu

Reputation: 1301

Permanent Kerberos tickets for interactive users of Hadoop cluster

I have a Hadoop cluster that uses the company's Active Directory as its Kerberos realm. The nodes and the end-user Linux workstations are all Ubuntu 16.04. They are joined to the same domain using PowerBroker PBIS, so SSH logons between the workstations and the grid nodes are single sign-on. End-users run long-running scripts from their workstations that repeatedly use SSH, first to launch Spark / YARN jobs on the cluster and then to track their progress. These scripts have to keep running overnight and over weekends, well beyond the 10-hour lifetime of a Kerberos ticket.

I'm looking for a way to install permanent, service-style Kerberos keytabs for the users, relieving them of the need to deal with kinit. I understand this would imply that anyone with shell access to the grid as a particular user would be able to authenticate as that user.
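For reference, creating such a keytab by hand with MIT ktutil would look something like the session below; the principal, realm and target path are placeholders for my setup:

$ ktutil
ktutil:  add_entry -password -p someuser@EXAMPLE.COM -k 1 -e aes256-cts-hmac-sha1-96
Password for someuser@EXAMPLE.COM:
ktutil:  write_kt /etc/security/keytabs/someuser.headless.keytab
ktutil:  quit
$ kinit -kt /etc/security/keytabs/someuser.headless.keytab someuser@EXAMPLE.COM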

I've also noticed that performing a non-SSO SSH login with a password automatically creates a new ticket, valid from the time of the login. If this behaviour could be enabled for SSO logins as well, that would solve my problem.
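The client-side OpenSSH settings I would expect to be involved in that look like the snippet below; the host pattern is a placeholder, and I haven't verified how PBIS interacts with GSSAPI credential delegation:

# ~/.ssh/config on the workstation
Host grid*
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes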

Upvotes: 1

Views: 1826

Answers (3)

Isac Casapu

Reputation: 1301

I took the suggestion from the other answers to use the --keytab argument to specify a custom keytab on the grid node from which I submit to Spark. I create my own per-user keytab using the script below. It remains valid until the user changes their password.

Note that the script makes the simplifying assumption that the Kerberos realm is the same as the DNS domain and the LDAP directory where users are defined. This holds for my setup; use with care on yours. It also expects the users to be sudoers on that grid node. A more refined script might separate keytab generation from installation.

#!/usr/bin/python2.7

from __future__ import print_function

import os
import sys
import stat
import getpass
import subprocess
import collections
import socket
import tempfile

def runSudo(cmd, pw):
    # Feed the password to sudo on stdin (-S); -p '' suppresses the prompt.
    try:
        subprocess.check_call("echo '{}' | sudo -S -p '' {}".format(pw, cmd), shell = True)
        return True
    except subprocess.CalledProcessError:
        return False

def testPassword(pw):
    # Invalidate any cached sudo credentials so the check genuinely tests pw.
    subprocess.check_call("sudo -k", shell = True)
    if not runSudo("true", pw):
        print("Incorrect password for user {}".format(getpass.getuser()), file = sys.stderr)
        sys.exit(os.EX_NOINPUT)

class KeytabFile(object):
    def __init__(self, pw):
        self.userName = getpass.getuser()
        self.pw = pw
        self.targetPath = "/etc/security/keytabs/{}.headless.keytab".format(self.userName)
        self.tempFile = None

    KeytabEntry = collections.namedtuple("KeytabEntry", ("kvno", "principal", "encryption"))

    def LoadExistingKeytab(self):
        if not os.access(self.targetPath, os.R_OK):

            # Note: the assumption made here, that the Kerberos realm is the DNS
            # domain upper-cased (the Active Directory convention), may not hold
            # in other setups
            domainName = ".".join(socket.getfqdn().split(".")[1:]).upper()

            encryptions = ("aes128-cts-hmac-sha1-96", "arcfour-hmac", "aes256-cts-hmac-sha1-96")
            return [
                self.KeytabEntry(0, "@".join((self.userName, domainName)), encryption)
                    for encryption in encryptions ]

        def parseLine(keytabLine):
            tokens = keytabLine.strip().split(" ")
            return self.KeytabEntry(int(tokens[0]), tokens[1], tokens[2].strip("()"))

        cmd ="klist -ek {} | tail -n+4".format(self.targetPath)
        entryLines = subprocess.check_output(cmd, shell = True).splitlines()
        return map(parseLine, entryLines)

    class KtUtil(subprocess.Popen):
        def __init__(self):
            subprocess.Popen.__init__(self, "ktutil",
                stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr=subprocess.PIPE, shell = True)

        def SendLine(self, line, expectPrompt = True):
            self.stdin.write(bytes(line + "\n"))
            self.stdin.flush()
            if expectPrompt:
                self.stdout.readline()

        def Quit(self):
            self.SendLine("quit", False)
            rc = self.wait()
            if rc != 0:
                raise subprocess.CalledProcessError(rc, "ktutil")


    def InstallUpdatedKeytab(self):
        fd, tempKt = tempfile.mkstemp(suffix = ".keytab")
        os.close(fd)
        entries = self.LoadExistingKeytab()
        ktutil = self.KtUtil()
        for entry in entries:
            # Re-key each principal/enctype pair with the current password,
            # bumping the key version number past the existing one.
            cmd = "add_entry -password -p {} -k {} -e {}".format(
                entry.principal, entry.kvno + 1, entry.encryption)

            ktutil.SendLine(cmd)
            ktutil.SendLine(self.pw)

        # Remove the placeholder file first: ktutil's write_kt appends to an
        # existing keytab instead of overwriting it.
        os.unlink(tempKt)
        ktutil.SendLine("write_kt {}".format(tempKt))
        ktutil.Quit()

        if not runSudo("mv {} {}".format(tempKt, self.targetPath), self.pw):
            os.unlink(tempKt)
            print("Failed to install the keytab to {}.".format(self.targetPath), file = sys.stderr)
            sys.exit(os.EX_CANTCREAT)

        os.chmod(self.targetPath, stat.S_IRUSR)
        # TODO: Also change group to 'hadoop'

if __name__ == '__main__':

    def main():
        userPass = getpass.getpass("Please enter your password: ")
        testPassword(userPass)
        kt = KeytabFile(userPass)
        kt.InstallUpdatedKeytab()

    main()
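To use it, run the script on the grid node and enter your password when prompted; you can then sanity-check the result (the script filename is whatever you saved it as, and kinit below assumes your default realm matches):

$ ./make_keytab.py
Please enter your password:
$ klist -ek /etc/security/keytabs/$USER.headless.keytab
$ kinit -kt /etc/security/keytabs/$USER.headless.keytab $USER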

Upvotes: 0

Kishore

Reputation: 5891

If you are accessing Hive/HBase or any other component that needs a Kerberos ticket, then make your Spark code re-login when the ticket expires. You have to update the ticket from the keytab rather than relying on a TGT already existing in the cache. This is done using the UserGroupInformation class from the Hadoop security package. Add the snippet below to your Spark job for long-running jobs:

import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation
import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod

val configuration = new Configuration
configuration.addResource("/etc/hadoop/conf/hdfs-site.xml")
UserGroupInformation.setConfiguration(configuration)

UserGroupInformation.getCurrentUser.setAuthenticationMethod(AuthenticationMethod.KERBEROS)
UserGroupInformation.loginUserFromKeytabAndReturnUGI(
  "hadoop.kerberos.principal", " path of hadoop.kerberos.keytab file")
  .doAs(new PrivilegedExceptionAction[Unit]() {
    override def run(): Unit = {
      // hbase/hive connection logic
    }
  })

Above, we specify the name of our service principal and the path to the keytab file we generated. As long as that keytab is valid, our program will use the desired service principal for all actions, regardless of whether the user running the program has already authenticated and received a TGT.

If no component other than Spark itself is accessed, you don't need the code above. Simply provide the keytab and principal in your spark-submit command:

spark-submit --master yarn-cluster --keytab "xxxxxx.keytab" --principal "xxxx@XXXX.COM" xxxx.jar

Upvotes: 1

Tagar

Reputation: 14939

You just have to ask users to add the --principal and --keytab arguments to their Spark jobs. Spark (YARN, actually) will then renew the tickets for you automatically. We have jobs that run for weeks using this approach.

See for example https://spark.apache.org/docs/latest/security.html#yarn-mode

For long-running apps like Spark Streaming apps to be able to write to HDFS, it is possible to pass a principal and keytab to spark-submit via the --principal and --keytab parameters respectively. The keytab passed in will be copied over to the machine running the Application Master via the Hadoop Distributed Cache (securely - if YARN is configured with SSL and HDFS encryption is enabled). The Kerberos login will be periodically renewed using this principal and keytab and the delegation tokens required for HDFS will be generated periodically so the application can continue writing to HDFS.

You can see in the Spark driver logs when YARN renews the Kerberos ticket.

Upvotes: 1
