Reputation: 2317
This is probably a very basic question, so please pardon my ignorance.
I understand there are two metastores that Hive uses in an out-of-the-box vanilla setup (extracted from the Hive binary tarball). In my case I have Hive 0.14.
There is one in a Derby database, in a folder named metastore_db by default, outside of HDFS.
And there is another in HDFS at /user/hive/warehouse.
What is the difference between these two?
Upvotes: 3
Views: 5543
Reputation: 626
In Hive, the metastore consists of (1) the metastore service and (2) the metastore database.
Metastore DB - any JDBC-compliant RDBMS, in which Hive stores the schema and partition details for both managed and external tables. Other applications, such as Impala, can read table and schema details from it. As the name suggests, it stores only metadata.
Metastore Service - Hive also runs a separate service, the metastore service, to manage the metadata: it stores the metadata for Hive tables and partitions in the metastore DB and gives clients (including Hive) access to this information through the metastore service API.
Warehouse - the Hive table data itself is stored in HDFS, normally under /user/hive/warehouse (or whatever path you set as hive.metastore.warehouse.dir in your hive-site.xml).
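To make the split concrete, here is a minimal sketch of the two hive-site.xml properties involved. The values shown are what I understand to be the stock defaults for an embedded Derby setup, so treat them as an assumption and check them against your own install: the first property points at the metastore database (the metastore_db folder from the question), the second at the warehouse directory in HDFS.

```xml
<configuration>
  <!-- Where the metadata lives: JDBC URL of the metastore database.
       Assumed default: embedded Derby, which creates metastore_db
       in the directory Hive is started from. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>

  <!-- Where the table data lives: the warehouse directory in HDFS. -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```

Pointing the first property at MySQL or another JDBC database changes where the metadata is kept without touching the data files under the warehouse path.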
Upvotes: 5
Reputation: 249
The metastore is where Hive stores the table schemas, plus other metadata such as the directory in the warehouse that holds the data for each table.
The warehouse is commonly stored in HDFS, and the metastore in a relational database such as Derby, MySQL, or PostgreSQL.
The metastore is also commonly used by other applications, such as Impala, to discover the tables in the warehouse.
Upvotes: 1