Jagadish Talluri

Reputation: 688

How is data moved or reflected between Hive and HBase in Hive-HBase integration?

As I understand it, both Hive and HBase use HDFS to store their data. When we integrate Hive and HBase:

How is the data moved between them? Or does the data not move at all and is simply reflected on both sides? I am interested in two scenarios.

One: Table_1 holds data in Hive and Table_2 holds data in HBase. Now the two are integrated (is this scenario even possible?).

How does the data movement happen? Is it from HBase to Hive, or from Hive to HBase?

Two: The same setup as scenario One. Where do newly inserted records go?

I am new to HBase and would like to understand the data movement in detail, with an example.

Please improve the question if needed. Thanks in advance.

Upvotes: 2

Views: 1117

Answers (1)

Vidya

Reputation: 30300

HDFS is a distributed file system that is well suited for the storage of large files but does not provide fast individual record lookups.

Hive is simply a SQL-like abstraction for interacting with the data in HDFS.

HBase is also built on top of HDFS. It provides fast reads and writes for large tables. HBase accomplishes this by storing your data in indexed "StoreFiles" that exist on HDFS for high-speed lookups.

So in both cases, data reside in HDFS. That's "where they go."

As for the details of how they work together, that is a large subject: you will need to familiarize yourself with the Hive metastore, Hive storage handlers, and the HBase API. I believe this tutorial (Part 1 and Part 2) can help you.
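To make the "does the data move?" part concrete, here is a toy Python model. It is only a sketch: `ToyHBaseTable` and `ToyHiveTable` are hypothetical names invented for illustration and are not the Hive or HBase APIs. The assumption it illustrates is that a Hive table defined over HBase through a storage handler keeps only a column mapping, so reads and writes on the Hive side go to the same HBase rows and nothing is copied.

```python
# Toy model only: these classes are hypothetical and are NOT the Hive or HBase APIs.


class ToyHBaseTable:
    """Stands in for an HBase table: cells grouped by row key."""

    def __init__(self):
        self._rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        return dict(self._rows.get(row_key, {}))

    def scan(self):
        # Yield rows in row-key order, as a full table scan would.
        for row_key in sorted(self._rows):
            yield row_key, dict(self._rows[row_key])


class ToyHiveTable:
    """Stands in for a Hive table mapped onto an HBase table.

    It holds only metadata (the column mapping) and keeps no copy of the rows.
    """

    def __init__(self, hbase_table, column_mapping):
        self._hbase = hbase_table
        # Hive column name -> HBase "family:qualifier"; ":key" marks the row key.
        self._mapping = column_mapping

    def insert(self, record):
        # Find the Hive column mapped to the row key, then write every
        # other column straight into the backing HBase table.
        key_col = next(c for c, target in self._mapping.items() if target == ":key")
        row_key = record[key_col]
        for hive_col, target in self._mapping.items():
            if target != ":key":
                self._hbase.put(row_key, target, record[hive_col])

    def select_all(self):
        # Build Hive-style records on the fly from a scan of the HBase table.
        rows = []
        for row_key, cells in self._hbase.scan():
            rows.append({
                hive_col: row_key if target == ":key" else cells.get(target)
                for hive_col, target in self._mapping.items()
            })
        return rows


hbase = ToyHBaseTable()
hive = ToyHiveTable(hbase, {"id": ":key", "name": "cf:name"})

hbase.put("row1", "cf:name", "written via HBase")        # write on the HBase side
hive.insert({"id": "row2", "name": "written via Hive"})  # write on the Hive side

for record in hive.select_all():
    print(record)
```

Both records are visible from either side because there is only one copy of the data. In real Hive, as far as I know, this kind of mapping is declared when the table is created, using the HBase storage handler (`org.apache.hadoop.hive.hbase.HBaseStorageHandler`) and the `hbase.columns.mapping` property. A native Hive table that already holds data is not linked automatically; its rows would have to be inserted into the HBase-backed table.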

Upvotes: 2
