Ashok Khote

Reputation: 117

How to access a Hive database in a secured Kerberos environment using Java


I am using Hadoop in a Kerberos environment and I am new to Kerberos. I want to access a Hive database using Java. I went through the official Hive site, but it gives only very general information.
Could somebody please give me a specific answer?

Upvotes: 1

Views: 7371

Answers (3)

chandana.ranasinghe

Reputation: 11

You probably need to set up your environment for Kerberos authentication along with the related application configuration. Also follow the driver's options/settings for Kerberos authentication:

  1. Set up kinit (Kerberos) and /etc/krb5.conf to match your environment.
  2. Make sure the host has network access to the Kerberos servers and the DB servers.
  3. Create a jaas.conf file (in a path readable by the standalone application user or the application server user).
  4. Place the keytab at the path referenced in that jaas.conf file.

If it is a standalone Java app, you can just set the system property as follows:

System.setProperty("java.security.auth.login.config","/<above_path>/jaas.conf");
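For illustration, steps 3 and 4 plus the system property can be sketched in plain Java as follows. This is only a sketch under assumptions: the class name `JaasFileSetup`, the entry name passed in by the caller, and the temporary-file location are hypothetical, and the keytab path and principal must be replaced with your own values. The entry name your client actually looks up depends on the library, so check its documentation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: renders a jaas.conf entry and points the JVM at it.
class JaasFileSetup {

    /** Renders one JAAS entry that logs in from a keytab without prompting. */
    static String render(String entryName, String keytabPath, String principal) {
        return entryName + " {\n"
             + "  com.sun.security.auth.module.Krb5LoginModule required\n"
             + "  useKeyTab=true\n"
             + "  doNotPrompt=true\n"
             + "  keyTab=\"" + keytabPath + "\"\n"
             + "  principal=\"" + principal + "\";\n"
             + "};\n";
    }

    /** Writes the entry to a temporary file and sets java.security.auth.login.config. */
    static Path install(String entryName, String keytabPath, String principal) throws IOException {
        Path file = Files.createTempFile("jaas", ".conf");
        Files.write(file, render(entryName, keytabPath, principal).getBytes(StandardCharsets.UTF_8));
        System.setProperty("java.security.auth.login.config", file.toString());
        return file;
    }
}
```

Calling `JaasFileSetup.install(...)` once at startup, before any Kerberos login is attempted, should be enough; the property has to be set before JAAS loads its default configuration.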

If it is a web application, the JAAS config can be set up programmatically:

Java code:

    // in getConnection method
    Configuration jaasConfig = createJaasConfig();
    // Installs the in-memory config JVM-wide. The java.security.auth.login.config
    // property expects a file path, so it is not set from this object.
    Configuration.setConfiguration(jaasConfig);

    // Needs: com.google.common.collect.ImmutableMap (Guava),
    //        com.sun.security.auth.module.Krb5LoginModule,
    //        javax.security.auth.login.AppConfigurationEntry and Configuration.
    private Configuration createJaasConfig() {

        String keytab = "/<your_key_tab_path>/myuser.keytab";

        // Create entry options. The login module class and its "required" flag
        // are passed to AppConfigurationEntry below, not as options.
        ImmutableMap<String, String> options = ImmutableMap.of(
                "doNotPrompt", "true",
                "useKeyTab", "true",
                "keyTab", keytab,
                "principal", "[email protected]"
        );

        // Create entries.
        final AppConfigurationEntry[] entries = {
            new AppConfigurationEntry(
                Krb5LoginModule.class.getCanonicalName(),
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                options)
        };

        // Create configuration; the same entries are returned for every entry name.
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return entries;
            }
        };
    }
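If Guava is not on the classpath, the same idea can be sketched with the JDK alone. The class name `KeytabJaasConfig`, the entry name "hive-client", and the constructor arguments are hypothetical names for illustration; the login module is referenced by its class name as a string, and `login()` will only succeed against a reachable KDC with a valid keytab.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical JDK-only variant of a programmatic JAAS configuration.
class KeytabJaasConfig extends Configuration {

    private final AppConfigurationEntry[] entries;

    KeytabJaasConfig(String keytabPath, String principal) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("doNotPrompt", "true");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    /** Returns the same single Kerberos entry regardless of the requested name. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        return entries;
    }

    /** Attempts the Kerberos login using this configuration instead of the JVM-wide one. */
    LoginContext login() throws LoginException {
        LoginContext context = new LoginContext("hive-client", null, null, this);
        context.login();
        return context;
    }
}
```

Passing the configuration directly to `LoginContext` avoids replacing the JVM-wide configuration, which may matter in an application server that hosts several applications.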

Upvotes: 0

Babu

Reputation: 5220

With Kerberos, everything gets more complicated.

Do this configuration before any JDBC code.

Imports:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;
    import org.apache.log4j.Logger;

Initial configuration code:

    Configuration systemConf = new Configuration();
    // isLocalRun() is a helper of this application; judging by the log messages,
    // it returns true when the code runs on the cluster itself.
    if (isLocalRun()) {
        LOG.info("Running on cluster, using hive-site.xml config");
        systemConf.addResource(new Path("/etc/hadoop/current/hive/hive-site.xml"));
    } else {
        LOG.info("Running from local computer, no hive-site.xml added, using only JDBC");
    }
    systemConf.set("hadoop.security.authentication", "Kerberos");
    UserGroupInformation.setConfiguration(systemConf);
    // principal and keytabPath are Strings supplied by the application.
    UserGroupInformation.loginUserFromKeytab(principal, keytabPath);

Then you can get a connection:

    try (Connection conn = DriverManager.getConnection(conf.hive().getConnectionString())) {
        HiveDatabaseMetaData metadata = (HiveDatabaseMetaData) conn.getMetaData();
        parseDatabase(hiveDatabase, conn, metadata, ...);
    }

Maven dependency for the Hive JDBC driver:

    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>2.0.0</version>
        <classifier>standalone</classifier>
    </dependency>
  • isLocalRun() depends on whether you run the code on your own computer or directly on the cluster:
    • when running on the cluster, you need to add hive-site.xml (the path may differ from the one in this example) with all its configuration
    • when running locally, use the JDBC connection string for connections from outside the cluster

Upvotes: 1

Vaijnath Polsane

Reputation: 657

I think a Kerberos implementation is a pretty vast topic, and going through all of it for a small task could be time-consuming.
Here is a quick fix for your problem.
To access Hive in a secure environment, consider the following:

- To access Hive you need to provide all the jars specific to your Hive version; a list is given on the official Hive site.
- Next, provide the driver class name specific to your Hive version, e.g. for HiveServer2: "org.apache.hive.jdbc.HiveDriver"
- Provide the Hive connection URL, e.g. jdbc:hive2://node.addr:10000/default;principal=hive/[email protected]

The connection URL carries both the connection address and the security authentication. For Kerberos, the authentication string is the Kerberos principal that was set up when Kerberos was implemented.
It is the same string you provide when connecting to the Hive server using beeline, e.g. beeline -u "jdbc:hive2://node.addr:10000/default;principal=hive/[email protected]"
Here is a small code sample:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;
    import java.util.Date;

    public class Connect {

        public static void main(String[] args) throws Exception {
            // Load the HiveServer2 JDBC driver.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            System.out.println("Process started at:" + new Date());
            // try-with-resources closes the statement and connection even on failure.
            try (Connection con = DriverManager.getConnection("jdbc:hive2://node.addr:10000/default;principal=hive/[email protected]");
                 Statement stmt = con.createStatement()) {
                stmt.execute("create table testTable (key string,col1 string)");
                System.out.println("Table created successfully");
            }
        }
    }
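The connection URL format described above can be assembled with a small helper rather than by hand. This is a minimal sketch: `HiveJdbcUrl` and `kerberosUrl` are hypothetical names, and the host, port, database, and principal in the usage line are placeholders. Note that the `principal` part names the HiveServer2 service principal, not the user who is connecting.

```java
// Hypothetical helper that builds a Kerberos-enabled HiveServer2 JDBC URL.
class HiveJdbcUrl {

    /**
     * Builds jdbc:hive2://host:port/database;principal=servicePrincipal.
     * Rejects principals that do not look like service/host@REALM.
     */
    static String kerberosUrl(String host, int port, String database, String servicePrincipal) {
        if (servicePrincipal == null
                || !servicePrincipal.contains("/")
                || !servicePrincipal.contains("@")) {
            throw new IllegalArgumentException(
                "expected service/host@REALM, got: " + servicePrincipal);
        }
        return "jdbc:hive2://" + host + ":" + port + "/" + database
             + ";principal=" + servicePrincipal;
    }
}
```

For example, `HiveJdbcUrl.kerberosUrl("node.addr", 10000, "default", "hive/node.addr@EXAMPLE.COM")` yields a string that can be passed to `DriverManager.getConnection` or to `beeline -u`.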

Upvotes: 4
