Reputation: 756
Need to access Hive metadata tables using Jdbc program.What exactly does the Metastore actually store and how can I access it?
I tried doing this:
sql="show tables";
Statement stmt = con.createStatement();
System.out.println("Running: " + sql);
res = stmt.executeQuery(sql);
while (res.next()) {
System.out.println(res.getString(1));
}
So I get the list of tables, but I want to know which table is this information stored in so that I can Directly Select from that table instead of firing a Hive command.
Also my Metastore is configured in PostGreSQL.(if that helps!) Thanks Hitz
Upvotes: 3
Views: 12464
Reputation: 10650
You can query the metastore DB through JDBC.
E.g: list table names and their location on HDFS:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
public class HiveMetastoreJDBCTest {
public static void main(String[] args) throws Exception {
Connection conn = null;
try {
HiveConf conf = new HiveConf();
conf.addResource(new Path("file:///path/to/hive-site.xml"));
Class.forName(conf.getVar(ConfVars.METASTORE_CONNECTION_DRIVER));
conn = DriverManager.getConnection(
conf.getVar(ConfVars.METASTORECONNECTURLKEY),
conf.getVar(ConfVars.METASTORE_CONNECTION_USER_NAME),
conf.getVar(ConfVars.METASTOREPWD));
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery(
"select t.tbl_name, s.location from tbls t " +
"join sds s on t.sd_id = s.sd_id");
while (rs.next()) {
System.out.println(rs.getString(1) + " : " + rs.getString(2));
}
}
finally {
if (conn != null) {
conn.close();
}
}
}
}
There exists an ER diagram about the metastore, but it might not be up-to-date, therefore I'd suggest you to run the metastore DDL script (Hive 0.12), in a test schema, and create the new ER diagram from these tables. (E.g with PowerArchitect)
Upvotes: 3
Reputation: 11514
Metastore, well, stores meta information on your Hive tables - names, partitions, columns, SSNs, SerDes, etc.
Metastore connection parameters are stored in hive-site.xml
(relevant properties are named javax.jdo.option.ConnectionURL, javax.jdo.option.ConnectionUserName, and javax.jdo.option.ConnectionPassword)
You can connect to the Postgres database using the connection parameters and run, say, these commands:
-- that'll get you the names of all Hive tables
SELECT tbl_name FROM TBLS;
-- that'll list all Metastore tables
\d
Upvotes: 0