user1888243

Reputation: 2691

Hortonworks: How to Manage Users

I am new to Hadoop administration and to Hortonworks Hadoop. What is the common practice for managing users in Hortonworks? Ambari lets me create users, but how do companies map Ambari users to their own corporate users? I see that I can enable Kerberos in Hortonworks; is that the way to let company users, for example those in LDAP, log in to Hortonworks with the same username and password? I'm not looking for details here, just some guidance on what the common practice is.

Upvotes: 1

Views: 533

Answers (2)

Cloudkollektiv

Reputation: 14749

@facha gives a nice explanation.

Since I work with LDAP and Hortonworks, I can only comment on this combination. To start experimenting, you can use the LDAP server (called demo LDAP) that comes with the standard installation of Hortonworks. You can use the pre-supplied LDAP mappings in Ambari to add more users.

Afterwards you can import these users into Ranger, for example, to set policies for the different Hadoop services. This is done with "ranger user sync", which is separate from giving LDAP users access to Ambari (ambari-server sync-ldap). I was not aware of this difference in the beginning, so it is worth noting.

Once you have done all this you can also add Kerberos security, but that is considerably harder to understand (keytabs, principals, etc.).

Here is some good information and a nice tutorial on working with LDAP.

If you want to manage LDAP users and groups easily, I would recommend Apache Directory Studio.

Upvotes: -1

facha

Reputation: 12522

You need an identity source. Active Directory (AD) is commonly used for that purpose. You'd use something like sssd to integrate AD with your cluster nodes. Once that is done, you can integrate your cluster with AD's Kerberos. Finally, you'd use AD's LDAP as the authentication source for Ambari.

Of course, none of those things is required. You could just as well maintain separate identity sources and sync periodically between them (e.g. OS users in /etc/shadow, Kerberos users in the MIT KDC's database, Ambari users in a relational database, etc.). Just take into account the extra time and effort needed to manage cluster users.
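To give a feel for what that periodic sync involves, here is a rough Python sketch of a drift check between identity sources. Everything in it is an assumption for illustration: the helper names parse_passwd and find_drift are made up, the OS accounts are read from passwd-format text (rather than /etc/shadow, which needs root), and the Kerberos and Ambari user lists are hard-coded where a real script would export them from the KDC and from Ambari's database.

```python
def parse_passwd(text, min_uid=1000):
    """Return login names of regular accounts from passwd-format text.

    min_uid=1000 is an assumed cutoff for skipping system accounts;
    adjust it to whatever your distribution uses.
    """
    users = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Expect at least name:password:uid; skip malformed lines.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        if int(fields[2]) >= min_uid:
            users.add(fields[0])
    return users


def find_drift(sources):
    """For each source, list users that exist elsewhere but are missing there."""
    all_users = set().union(*sources.values())
    return {name: sorted(all_users - users) for name, users in sources.items()}


if __name__ == "__main__":
    # Sample data only; a real check would read the live sources.
    passwd_text = (
        "root:x:0:0:root:/root:/bin/bash\n"
        "alice:x:1001:1001::/home/alice:/bin/bash\n"
        "bob:x:1002:1002::/home/bob:/bin/bash\n"
    )
    sources = {
        "os": parse_passwd(passwd_text),
        "kerberos": {"alice", "carol"},  # hypothetical principal names
        "ambari": {"alice", "bob"},      # hypothetical Ambari users
    }
    for name, missing in find_drift(sources).items():
        print(name, "is missing:", missing)
```

Every source you add means another export step and another column in this comparison, which is the extra effort a single identity source such as AD avoids.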

Upvotes: 1
