la_femme_it

Reputation: 672

Using pyhive with kerberos ticket to connect to kerberized hadoop cluster

I would like to connect to Hive on our kerberized Hadoop cluster and run some HQL queries from a machine that already has its own Kerberos client. The client works: the keytab has been passed and tested.

Our Hadoop cluster runs HWS 3.1 on CentOS 7, and my machine also runs CentOS 7. I'm using Python 3.7.3 and PyHive 0.6.1.

I have installed a bunch of libraries (and also tried uninstalling them) while going through different forums (HWS, Cloudera, here on SO...).

I installed the sasl libraries through pip

I installed through yum

Below is my connection to Hive:

return hive.Connection(host=self.host, port=self.port,
       database=self.database, auth=self.__auth,
       kerberos_service_name=self.__kerberos_service_name)

This is part of my yaml

hive_interni_hdp: 
    db_type: hive 
    host: domain.xx.lan 
    database: database_name 
    user: user_name 
    port: 10000 
    auth: KERBEROS 
    kerberos_service_name: hive
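
For readers who want a self-contained version of the call and config above, here is a minimal sketch. The helper names `build_hive_kwargs` and `connect` are hypothetical and not part of the original code. It assumes the YAML entry has already been loaded into a dict, and that with KERBEROS auth the identity comes from the ticket cache, so `user` is not passed on.

```python
from typing import Any, Dict


def build_hive_kwargs(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Map one config entry (a dict loaded from the YAML) onto
    keyword arguments for pyhive.hive.Connection."""
    return {
        "host": cfg["host"],
        # 10000 is used as a fallback here because it is the port in the config above.
        "port": int(cfg.get("port", 10000)),
        "database": cfg.get("database", "default"),
        "auth": cfg["auth"],
        "kerberos_service_name": cfg["kerberos_service_name"],
    }


def connect(cfg: Dict[str, Any]):
    """Open a Hive connection; requires PyHive plus its SASL dependencies."""
    # Imported lazily so build_hive_kwargs can be used without PyHive installed.
    from pyhive import hive

    return hive.Connection(**build_hive_kwargs(cfg))
```

Calling `connect(cfg)` with the values from the YAML and then `connect(cfg).cursor().execute(...)` should behave like the original method, provided a valid Kerberos ticket exists for the current user.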

When I run the code, I get the following error:

  File "/opt/Python3.7.3/lib/python3.7/site-packages/dfpy/location.py", line 1647, in conn
    self.__conn = self._create_connection()
  File "/opt/Python3.7.3/lib/python3.7/site-packages/dfpy/location.py", line 1633, in _create_connection
    kerberos_service_name=self.__kerberos_service_name)
  File "/opt/Python3.7.3/lib/python3.7/site-packages/pyhive/hive.py", line 192, in __init__
    self._transport.open()
  File "/opt/Python3.7.3/lib/python3.7/site-packages/thrift_sasl/__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'

Has anyone had any luck with this? Where is the obstacle: the PyHive libraries, or wrong Kerberos connection settings?

Upvotes: 2

Views: 3702

Answers (1)

la_femme_it

Reputation: 672

I found a solution. I checked this documentation: https://www.cyrusimap.org/sasl/sasl/sysadmin.html

It mentions GSSAPI (used with Kerberos 5, which is what I'm using), and I verified that my machine had no GSSAPI support by running

sasl2-shared-mechlist

It printed

GSS-SPNEGO,LOGIN,PLAIN,ANONYMOUS

but after installing the GSSAPI library

yum install cyrus-sasl-gssapi

the mechlist shows

GSS-SPNEGO,GSSAPI,LOGIN,PLAIN,ANONYMOUS

Then I ran the code again and hooray, it worked!
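
The mechlist check can also be run from Python before connecting, which turns the "No worthy mechs found" error into an earlier and clearer failure. This is only a sketch: the function names are hypothetical, and it assumes `sasl2-shared-mechlist` is on the PATH and prints a comma-separated list like the one shown above.

```python
import subprocess
from typing import Sequence


def has_mechanism(mechlist_output: str, mechanism: str) -> bool:
    """Return True if `mechanism` appears as a whole token in the output.

    Whole-token matching matters: a plain substring test for "GSS"
    would wrongly match "GSS-SPNEGO".
    """
    tokens = mechlist_output.replace(",", " ").split()
    return mechanism.upper() in (token.upper() for token in tokens)


def gssapi_available(command: Sequence[str] = ("sasl2-shared-mechlist",)) -> bool:
    """Run the mechlist tool and report whether GSSAPI is listed.

    Returns False if the tool itself is not installed.
    """
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except FileNotFoundError:
        return False
    return has_mechanism(result.stdout, "GSSAPI")
```

If `gssapi_available()` returns False, installing `cyrus-sasl-gssapi` as described above is the first thing to try.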

P.S. Don't forget to authenticate and verify that your keytab is valid:

kinit -kt /root/user.keytab [email protected]
klist
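
The ticket check can be scripted too. A minimal sketch, assuming MIT Kerberos `klist`, whose `-s` flag prints nothing and exits with status 0 only when the credentials cache is readable and not expired. The function name `has_valid_ticket` is hypothetical.

```python
import subprocess
from typing import Sequence


def has_valid_ticket(command: Sequence[str] = ("klist", "-s")) -> bool:
    """Return True if the command exits with status 0.

    With the default `klist -s`, that means a usable Kerberos ticket
    exists. Returns False if the command is not installed.
    """
    try:
        return subprocess.run(list(command)).returncode == 0
    except FileNotFoundError:
        return False
```

Calling this right before opening the Hive connection gives a clear "run kinit first" signal instead of a SASL error from deep inside the Thrift transport.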

Upvotes: 2
