steve

Reputation: 315

Google Cloud Data Catalog - Offerings and Flexibility

I am planning to build a data platform that uses Google Cloud Dataproc for compute and stores the data in Delta tables (Delta Lake).

I am currently exploring the Data Catalog available in the GCP stack, along with the open source Hive metastore, and would like to clarify the questions below:

Upvotes: 1

Views: 168

Answers (1)

George Verghese

Reputation: 98

Difference between catalog and Dataproc metastore:

If we migrate the application from GCP to another Spark platform (for example, Databricks), can we port or reuse the GCP Data Catalog / Dataproc Metastore already created?

  • Ideally, you should be able to reuse the Dataproc Metastore.
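
To illustrate why the Dataproc Metastore is the more portable of the two: it is a managed Hive metastore, so a Spark job elsewhere can in principle be pointed at its Thrift endpoint, provided the network path exists and the target platform supports an external Hive metastore. The sketch below only builds the Spark settings involved. The helper names and the endpoint address are hypothetical, and the Delta Lake entries assume the Delta Lake library is on the cluster's classpath.

```python
def hive_metastore_conf(thrift_uri: str) -> dict:
    """Spark settings for an external Hive-compatible metastore plus Delta Lake.

    thrift_uri is a placeholder for the metastore's Thrift endpoint,
    e.g. "thrift://10.0.0.5:9083" (hypothetical address).
    """
    if not thrift_uri.startswith("thrift://"):
        raise ValueError("expected a thrift:// endpoint URI")
    return {
        # Use the Hive metastore as the catalog backend.
        "spark.sql.catalogImplementation": "hive",
        "spark.hadoop.hive.metastore.uris": thrift_uri,
        # Delta Lake SQL support (assumes the Delta Lake jars are installed).
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def as_submit_args(conf: dict) -> list:
    """Turn the settings into repeated --conf key=value arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(as_submit_args(hive_metastore_conf("thrift://10.0.0.5:9083"))))
```

The same key/value pairs can be passed to `SparkSession.builder.config(...)` instead of the command line. Whether a given platform accepts them is something to verify against that platform's documentation.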

Where is the data catalog/dataproc metastore metadata stored? Is this GCS or any other storage?

  • Both are Google-proprietary native services, so you would need to export the metadata from DPMS (Dataproc Metastore) or Google Cloud Data Catalog.

Does Data Catalog / Dataproc Metastore automatically capture metadata for Delta tables on the Google platform?

  • No
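
Since the metadata is not captured automatically, one common approach is to register each Delta table in the metastore explicitly from Spark. The following is a minimal sketch, assuming a Spark session that already has Delta Lake and the metastore configured. The function names, database, table, and bucket path are all hypothetical.

```python
def delta_table_ddl(database: str, table: str, location: str) -> str:
    """Build a statement that registers an existing Delta table by its storage path."""
    for name in (database, table):
        # Keep identifiers simple so they are safe to embed in SQL.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsupported identifier: {name!r}")
    if "'" in location:
        raise ValueError("location must not contain a single quote")
    return (
        f"CREATE TABLE IF NOT EXISTS `{database}`.`{table}` "
        f"USING DELTA LOCATION '{location}'"
    )


def register_delta_table(spark, database: str, table: str, location: str) -> None:
    """Run the registration against a SparkSession (anything with a .sql method)."""
    spark.sql(f"CREATE DATABASE IF NOT EXISTS `{database}`")
    spark.sql(delta_table_ddl(database, table, location))


if __name__ == "__main__":
    # Hypothetical names; prints the statement without needing a cluster.
    print(delta_table_ddl("sales", "orders", "gs://my-bucket/delta/orders"))
```

With a live session this would be called as `register_delta_table(spark, "sales", "orders", "gs://my-bucket/delta/orders")`. The table schema is read from the Delta transaction log at that location, so no column list is needed.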

Upvotes: 0
