Reputation: 315
Planning to build a data platform with Google Cloud Dataproc as the compute layer, storing the data in Delta tables (Delta Lake).
Currently exploring the data catalog options available in the GCP stack alongside the open-source Hive metastore, and would like to clarify the questions below:
1. What is the difference between Data Catalog and Dataproc Metastore?
2. If we migrate the application from GCP to another Spark platform (e.g. Databricks), can we port/reuse the Data Catalog entries or the Dataproc Metastore that was already created? (See the second sketch below.)
3. Where is the Data Catalog / Dataproc Metastore metadata stored? Is it in GCS or some other storage?
4. Does Data Catalog / Dataproc Metastore automatically capture metadata for Delta tables on the Google platform? (See the first sketch below.)
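For question 4, it helps to spell out what "capturing metadata" means on Dataproc. Below is a minimal PySpark sketch, assuming a Dataproc cluster that has the Delta Lake connector on its classpath and a Dataproc Metastore service attached as its Hive metastore; the bucket, database and table names are made up. The Hive-compatible metastore only records tables that a job registers (e.g. via saveAsTable or CREATE TABLE ... USING DELTA); it does not crawl GCS to discover path-only Delta tables, and whether Data Catalog surfaces them depends on its own integrations.

    from pyspark.sql import SparkSession

    # Hive support makes Spark use the metastore the cluster is configured with,
    # e.g. an attached Dataproc Metastore service.
    spark = (
        SparkSession.builder
        .appName("delta-metastore-sketch")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .enableHiveSupport()
        .getOrCreate()
    )

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")  # hypothetical database

    df = spark.range(0, 100).withColumnRenamed("id", "event_id")

    # saveAsTable (or CREATE TABLE ... USING DELTA) writes an entry into the
    # metastore; a plain .save("gs://...") writes only data files, and the
    # metastore does not scan GCS to discover such path-based Delta tables.
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("path", "gs://example-bucket/delta/events")  # hypothetical bucket
        .saveAsTable("analytics.events")                     # hypothetical table
    )

    spark.sql("SHOW TABLES IN analytics").show()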
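On question 2, Dataproc Metastore exposes a standard Hive Thrift endpoint, so any Spark runtime that can reach that endpoint (and the GCS paths the tables point to) can in principle reuse the same table definitions; Data Catalog entries, by contrast, live behind a GCP-specific API. Here is a sketch under those assumptions, with a placeholder endpoint; on Databricks these settings would normally go into the cluster's Spark config rather than application code.

    from pyspark.sql import SparkSession

    # The Thrift URI is the Dataproc Metastore service endpoint; the host and
    # port below are placeholders.
    spark = (
        SparkSession.builder
        .appName("external-metastore-sketch")
        .config("spark.hadoop.hive.metastore.uris", "thrift://10.0.0.10:9083")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Tables registered by Dataproc jobs become visible here, provided this
    # environment can reach both the metastore endpoint and the referenced
    # GCS locations.
    spark.sql("SHOW DATABASES").show()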