Fray

Reputation: 183

Flink Table and Hive Catalog storage

I have a Kafka topic and a Hive Metastore. I want to join the incoming events from the Kafka topic with records from the Metastore. I saw that Flink can use a catalog to query the Hive Metastore, so I see two ways to handle this.
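For reference, registering a Hive catalog through the Table API looks roughly like the sketch below. This is only a minimal sketch: the catalog name, default database, and hive-site.xml directory are placeholders for whatever your deployment uses.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class RegisterHiveCatalog {
    public static void main(String[] args) {
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        TableEnvironment tableEnv = TableEnvironment.create(settings);

        // Placeholder values: catalog name, default database, and the directory
        // containing hive-site.xml depend on your setup.
        HiveCatalog hiveCatalog = new HiveCatalog("myhive", "default", "/opt/hive-conf");

        // Registering the catalog only exposes Metastore metadata to Flink;
        // no table data is copied into the Flink cluster at this point.
        tableEnv.registerCatalog("myhive", hiveCatalog);
        tableEnv.useCatalog("myhive");
    }
}
```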

My biggest concerns are storage related. In both cases, what is stored in memory and what is not? Does the Hive catalog store anything on the Flink cluster side? In the second case, how is the table handled? Does Flink create a copy?

Which solution seems best? (Maybe both or neither are good choices.)

Upvotes: 1

Views: 565

Answers (1)

liliwei

Reputation: 344

Different methods suit different scenarios; it sometimes depends on whether your Hive table is a static table or a dynamic table.

If your Hive table is only a dimension table, you can try the approach described in this chapter of the documentation:

joins-in-continuous-queries

Flink will automatically pick up the latest partition of the Hive table, which is suitable for scenarios where the dimension data is updated slowly.

Note, however, that this feature is not supported by the legacy planner.
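A minimal sketch of what such a temporal join against the latest Hive partition can look like, run through the Java Table API. All table names, column names, and option values here are assumptions: it presumes a Hive catalog named "myhive" has already been registered, a Kafka-backed table kafka_orders with a processing-time attribute declared as proctime AS PROCTIME(), and a partitioned Hive dimension table dim_customers.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class HiveDimensionJoin {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Assumes a HiveCatalog named "myhive" is already registered, and that
        // kafka_orders declares a processing-time column: proctime AS PROCTIME().
        tableEnv.useCatalog("myhive");

        // Temporal join against the latest partition of the Hive dimension table.
        // The OPTIONS hint tells the Hive connector to monitor the table and always
        // use the most recent partition as the join's lookup side.
        tableEnv.executeSql(
                "SELECT o.order_id, o.amount, d.customer_name "
              + "FROM kafka_orders AS o "
              + "JOIN dim_customers /*+ OPTIONS("
              + "    'streaming-source.enable' = 'true',"
              + "    'streaming-source.partition.include' = 'latest',"
              + "    'streaming-source.monitor-interval' = '1 h',"
              + "    'streaming-source.partition-order' = 'partition-name') */ "
              + "FOR SYSTEM_TIME AS OF o.proctime AS d "
              + "ON o.customer_id = d.customer_id").print();
    }
}
```

With these options, the dimension table is cached on the Flink side and refreshed at the monitor interval, so only the latest partition is held in memory rather than a full copy of the Hive table.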

Upvotes: 0
